
What Are Custom Evaluations?

While Tuner provides a rich set of pre-defined metrics, custom evaluations allow you to tailor the analysis to your specific needs. You can create evaluations to:
  • Check for specific keywords or phrases: Did the agent mention the daily special?
  • Verify a specific action was taken: Was an appointment successfully booked?
  • Monitor for compliance: Did the agent read the required legal disclaimer?

How to Create a Custom Evaluation

1. Navigate to Evaluation Criteria

In Agent Settings, go to the Evaluation Criteria tab.

2. Create a New Behavior Check

In the Behavior Checks section, click the Add button to open the behavior check modal.

3. Choose a Pre-built or Custom Check

You can either choose from a list of pre-built suggested checks (like Hallucination or System Prompt Compliance) or create your own custom check.

4. Define the Evaluation Criteria

If creating a custom check, you will define:
  • Check Label: A descriptive name for the check.
  • Check Type: Pass/Fail (binary) or Score 1-5 (scaled).
  • Prompt Definition: A clear description of what should pass or fail.
  • Inputs Used: You can reference your System Prompt, Capabilities, or Tools as input for the evaluation.

5. Save and Enable

Save your behavior check. It will now run on all future calls.
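
Before entering a custom check in the modal, it can help to draft it as structured data and confirm nothing is missing. The sketch below is only an illustration: `BehaviorCheck`, `CHECK_TYPES`, `ALLOWED_INPUTS`, and the field names are hypothetical and simply mirror the form fields listed in step 4. They are not a Tuner API or export format, and the sample prompt definition is made up for the appointment example.

```python
from dataclasses import dataclass, field

# Hypothetical names: these mirror the form fields described above,
# not an actual Tuner API or export format.
CHECK_TYPES = {"pass_fail", "score_1_5"}
ALLOWED_INPUTS = {"system_prompt", "capabilities", "tools"}


@dataclass
class BehaviorCheck:
    label: str
    check_type: str
    prompt_definition: str
    inputs_used: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means the draft looks complete."""
        issues = []
        if not self.label.strip():
            issues.append("Check Label is empty")
        if self.check_type not in CHECK_TYPES:
            issues.append(f"Unknown Check Type: {self.check_type!r}")
        if not self.prompt_definition.strip():
            issues.append("Prompt Definition is empty")
        for name in self.inputs_used:
            if name not in ALLOWED_INPUTS:
                issues.append(f"Unknown input: {name!r}")
        return issues


# Illustrative draft for the "Was an appointment successfully booked?" case.
draft = BehaviorCheck(
    label="Appointment Booked",
    check_type="pass_fail",
    prompt_definition="Pass if the agent confirmed a booked appointment with a date and time.",
    inputs_used=["tools"],
)
print(draft.problems())
```

A draft that passes this review can then be copied field by field into the behavior check modal.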
[Figure: Evaluation Builder]

Example: Checking for a Disclaimer

Let’s say you want to ensure your agent always reads a legal disclaimer. You could create a custom behavior check with the following criteria:
  • Check Label: “Disclaimer Read”
  • Check Type: Pass/Fail
  • Prompt Definition: “Check if the agent stated the following disclaimer verbatim: ‘This call may be recorded for quality assurance.’”
If a call does not meet this criterion, the check fails, and you can use that failed result in the Rule Builder to trigger a red flag or an alert.
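
To build intuition for what a verbatim check like this should accept and reject, you can prepare a few sample transcripts and classify them locally before relying on the behavior check. The sketch below is not how Tuner evaluates calls. It is a plain string comparison under stated assumptions: the transcript is a list of agent utterances, and case and whitespace differences are ignored. The names `normalize` and `disclaimer_read` are hypothetical.

```python
import re

DISCLAIMER = "This call may be recorded for quality assurance."


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so transcript formatting does not cause false failures."""
    return re.sub(r"\s+", " ", text).strip().lower()


def disclaimer_read(agent_turns: list[str], disclaimer: str = DISCLAIMER) -> bool:
    """Return True if any single agent turn contains the disclaimer text."""
    target = normalize(disclaimer)
    return any(target in normalize(turn) for turn in agent_turns)


# Assumed transcript shape: one string per agent utterance.
passing_call = ["Hi, thanks for calling.  This call may be recorded for quality assurance. How can I help?"]
failing_call = ["Hi, thanks for calling. We might record this call."]

print(disclaimer_read(passing_call))  # True
print(disclaimer_read(failing_call))  # False
```

Sample calls labeled this way make it easier to confirm that the Pass/Fail results you see for the "Disclaimer Read" check match your expectations, including for paraphrased disclaimers that should fail.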

Next Steps

Learn how to configure real-time alerts to be notified of critical issues.

How to Configure Real-Time Alerts

A step-by-step guide to setting up real-time alerts in Tuner.