What Are Red Flags?

A red flag is a structured signal that Tuner attaches to a call when one or more rule conditions are satisfied. Unlike evals, which score individual behaviors, red flags represent a composite judgment: a call is flagged when a combination of eval failures, voice metrics, and call classification outcomes crosses a threshold you define. The same Rule Builder that creates red flags also creates labels: contextual tags that categorize calls without marking them as failures (e.g. long_agent_monologue, sluggish_conversation). Both types appear in the call logs and are surfaced in the Overview dashboard.

Where Red Flags and Labels Surface

Overview Dashboard

Red flags appear in the Recent Red Flags section of the Overview dashboard, giving you a prioritized queue of calls that need attention without requiring you to manually filter call logs.

[Image: Recent Red Flags on the Overview dashboard]

Each entry in the table exposes:

Column | What it tells you
Flag Type | The name of the rule that triggered (e.g. negative_sentiment, conversation_freeze)
Call ID | Direct link into the Call Details view for that conversation
Call Intent | The intent Tuner detected for the call, useful for spotting intent-specific failure patterns
Outcome | How the call was classified (e.g. Resolved, Escalated/transferred)
Timestamp | When the call occurred

Call Logs

Both red flags and labels appear as color-coded tags on every call row in the call logs. This lets you filter and sort the full call history by the behavioral patterns you care about.

[Image: Call log rows showing labels and red flags as color-coded tags]

How Red Flags Are Triggered

Red flags and labels are defined in the Rule Builder inside Agent Settings → Evaluation Rules. Each rule is a named condition set. When the condition set evaluates to true for a call (every condition for AND logic, at least one condition for OR logic), Tuner applies the corresponding tag to that call.

Adding a Rule

Click + Add Rule in the Evaluation Rules tab to open the rule builder.

[Image: Add Rule modal showing Condition Source, Operator, Value, Tag Type, Tag Name, and Description fields]

Each rule is configured with:

Field | Description
Condition Source | The signal to evaluate: an eval (e.g. hallucination), a voice metric, or a custom data field
Operator | The comparison: Equals (=), greater than, less than, etc.
Value | The threshold, e.g. PASS, FAIL, a numeric value, or a sentiment category
Tag Type | Label (categorize and filter) or Red Flag (highlight failures that require attention)
Tag Name | The name that appears on the call in logs and the dashboard
Description | Optional. Describe when this label or flag is relevant
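
To make the shape of a single rule concrete, here is a minimal Python sketch that mirrors the six fields of the Add Rule form. It is an illustrative model only, not Tuner's API: the `Rule` class, the `OPERATORS` table, the operator symbols, and the dictionary-based call record are all assumptions made for this example.

```python
from dataclasses import dataclass
import operator

# Hypothetical operator symbols; the real operator list in Tuner may differ.
OPERATORS = {
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}

TAG_TYPES = {"Label", "Red Flag"}


@dataclass(frozen=True)
class Rule:
    """Illustrative stand-in for one rule built in the Add Rule form."""
    condition_source: str   # signal to evaluate, e.g. "hallucination"
    op: str                 # key into OPERATORS
    value: object           # e.g. "FAIL" or a numeric threshold
    tag_type: str           # "Label" or "Red Flag"
    tag_name: str           # name shown on the call
    description: str = ""   # optional

    def __post_init__(self):
        # Reject configurations the form would not allow.
        if self.op not in OPERATORS:
            raise ValueError(f"unknown operator: {self.op!r}")
        if self.tag_type not in TAG_TYPES:
            raise ValueError(f"unknown tag type: {self.tag_type!r}")

    def matches(self, call: dict) -> bool:
        """Return True when the call's signal satisfies the comparison."""
        if self.condition_source not in call:
            return False  # assumption: a missing signal never triggers a rule
        return OPERATORS[self.op](call[self.condition_source], self.value)


rule = Rule("hallucination", "=", "FAIL", "Red Flag", "hallucination_detected")
print(rule.matches({"hallucination": "FAIL"}))  # True
print(rule.matches({"hallucination": "PASS"}))  # False
```

Treating a missing signal as a non-match is a design choice of this sketch; how Tuner handles calls that lack a given signal is not stated here.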

Available Condition Types

You can build conditions from any of the following signal sources:
Condition type | Examples
Eval result | hallucination = FAIL, Politeness < 3, Confirmation accuracy = FAIL
Voice metric | Sentiment = Negative, Crosstalk Duration > 10s, Silence Duration > 15s
Call outcome | Outcome = Failure, Outcome = "User Hung Up Early"
Call duration | Duration > 8 minutes
Cost | Cost > $0.50

Combining condition types lets you express nuanced failure patterns. For example, a “Frustrated and Mishandled” flag might require Sentiment = Negative AND Missed Intent = FAIL AND Escalation Handling = FAIL, so that only calls with all three problems get flagged, not every slightly negative call.
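
The narrowing effect of AND logic can be seen on a handful of made-up calls. The sketch below is only an illustration: the call records, field names, and values are hypothetical and do not reflect how Tuner stores call data.

```python
# Hypothetical call records; field names are illustrative only.
calls = [
    {"id": "c1", "sentiment": "Negative", "missed_intent": "FAIL", "escalation_handling": "FAIL"},
    {"id": "c2", "sentiment": "Negative", "missed_intent": "PASS", "escalation_handling": "PASS"},
    {"id": "c3", "sentiment": "Neutral", "missed_intent": "FAIL", "escalation_handling": "PASS"},
    {"id": "c4", "sentiment": "Negative", "missed_intent": "FAIL", "escalation_handling": "PASS"},
]

# The three conditions of the "Frustrated and Mishandled" example.
conditions = [
    lambda c: c["sentiment"] == "Negative",
    lambda c: c["missed_intent"] == "FAIL",
    lambda c: c["escalation_handling"] == "FAIL",
]

# A rule on sentiment alone tags every negative call.
negative_only = [c["id"] for c in calls if conditions[0](c)]

# Requiring all three conditions keeps only the calls with every problem.
frustrated_and_mishandled = [
    c["id"] for c in calls if all(cond(c) for cond in conditions)
]

print(negative_only)              # ['c1', 'c2', 'c4']
print(frustrated_and_mishandled)  # ['c1']
```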

Example Rules

Hallucination Detection

Field | Value
Tag Type | Red Flag
Tag Name | hallucination_detected
Conditions | hallucination = FAIL
Why | Any hallucination is a critical trust issue. Flag immediately regardless of call outcome.

Compliance Breach

Field | Value
Tag Type | Red Flag
Tag Name | compliance_failure
Conditions | Identity verification compliance = FAIL OR Required disclosures = FAIL OR Prohibited content = FAIL
Why | Any single compliance failure warrants review. Use OR logic to catch any violation.

High-Risk Failed Call

Field | Value
Tag Type | Red Flag
Tag Name | high_risk_failure
Conditions | Outcome = Failure AND Sentiment = Negative AND Duration > 5 minutes
Why | Long, negative calls that didn’t succeed represent maximum customer impact.

Sluggish Conversation (Label)

Field | Value
Tag Type | Label
Tag Name | sluggish_conversation
Conditions | Silence Duration > 15s OR TTFB p90 > 3s
Why | Not every slow call is a failure, but tagging them lets you filter and analyze the pattern.
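
As a rough illustration of how the four example rules behave together, the sketch below encodes each one as a list of predicates combined with Python's built-in `all` (AND) or `any` (OR) and applies them to one made-up call. The field names (`duration_min`, `silence_s`, `ttfb_p90_s`, and so on), the units, and the `tag_call` helper are assumptions for this example, not Tuner identifiers.

```python
# Each entry: (tag name, tag type, combinator, predicates).
# `all` models AND logic and `any` models OR logic.
RULES = [
    ("hallucination_detected", "Red Flag", all, [
        lambda c: c["hallucination"] == "FAIL",
    ]),
    ("compliance_failure", "Red Flag", any, [
        lambda c: c["identity_verification"] == "FAIL",
        lambda c: c["required_disclosures"] == "FAIL",
        lambda c: c["prohibited_content"] == "FAIL",
    ]),
    ("high_risk_failure", "Red Flag", all, [
        lambda c: c["outcome"] == "Failure",
        lambda c: c["sentiment"] == "Negative",
        lambda c: c["duration_min"] > 5,
    ]),
    ("sluggish_conversation", "Label", any, [
        lambda c: c["silence_s"] > 15,
        lambda c: c["ttfb_p90_s"] > 3,
    ]),
]


def tag_call(call):
    """Return the tags that apply to a call, grouped by tag type."""
    tags = {"Red Flag": [], "Label": []}
    for name, tag_type, combine, predicates in RULES:
        if combine(p(call) for p in predicates):
            tags[tag_type].append(name)
    return tags


# Hypothetical call: one compliance failure, a long silence, under 5 minutes.
call = {
    "hallucination": "PASS",
    "identity_verification": "PASS",
    "required_disclosures": "FAIL",
    "prohibited_content": "PASS",
    "outcome": "Failure",
    "sentiment": "Negative",
    "duration_min": 4.5,
    "silence_s": 18,
    "ttfb_p90_s": 1.2,
}

print(tag_call(call))
# {'Red Flag': ['compliance_failure'], 'Label': ['sluggish_conversation']}
```

The call fails and is negative, but at 4.5 minutes it misses the duration condition, so high_risk_failure does not apply. A single failed disclosure check is enough to trigger compliance_failure because that rule uses OR.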

Investigating a Red Flag

Clicking View on any entry in the Recent Red Flags table — or clicking a flagged call in the call logs — opens the Call Details view. From there:
  • Transcript — Read the conversation verbatim to see exactly what was said at the point of failure.
  • Events timeline — Trace the sequence of tool calls, transfers, and state transitions to find where the call went off track.
  • Eval results panel — See all eval scores for the call in one view. A flag triggered by hallucination = FAIL will show exactly which eval tripped and at what point in the transcript.
  • Voice metrics — Check for anomalies like extended silence, excessive crosstalk, or a large talk-time imbalance that correlate with the flag.

Investigating Patterns at Scale

When the same flag type recurs across many calls, investigating one at a time becomes impractical. The Tuner MCP Diagnose agent reads all calls sharing a flag type, clusters them by intent and outcome, surfaces the shared root cause, and can write a fix directly to your agent configuration.
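
The grouping step alone (not the root-cause analysis or the fix) can be approximated by hand on exported call data. The following sketch is an assumption-laden illustration and says nothing about how the Diagnose agent works internally: the record layout, the intent names, and the `cluster_by_intent_and_outcome` helper are hypothetical.

```python
from collections import Counter

# Hypothetical export of flagged calls; intent names are made up.
flagged_calls = [
    {"id": "c1", "flag": "negative_sentiment", "intent": "billing_dispute", "outcome": "Escalated/transferred"},
    {"id": "c2", "flag": "negative_sentiment", "intent": "billing_dispute", "outcome": "Escalated/transferred"},
    {"id": "c3", "flag": "negative_sentiment", "intent": "order_status", "outcome": "Resolved"},
    {"id": "c4", "flag": "conversation_freeze", "intent": "billing_dispute", "outcome": "Resolved"},
    {"id": "c5", "flag": "negative_sentiment", "intent": "billing_dispute", "outcome": "Resolved"},
]


def cluster_by_intent_and_outcome(calls, flag_type):
    """Count calls carrying `flag_type`, grouped by (intent, outcome)."""
    return Counter(
        (c["intent"], c["outcome"]) for c in calls if c["flag"] == flag_type
    )


clusters = cluster_by_intent_and_outcome(flagged_calls, "negative_sentiment")

# Largest clusters first: these are the patterns worth reading in detail.
for (intent, outcome), count in clusters.most_common():
    print(f"{count} call(s): intent={intent}, outcome={outcome}")
```

In this made-up data the largest cluster is billing_dispute calls that were escalated, which would be the natural place to start reading transcripts.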

Diagnose Your Agent

Find the root cause behind recurring red flags across all your calls.

Alerting on Red Flags

Red flags are silent by default: they surface in the dashboard and call logs but do not notify anyone. To get notified when a flag fires, create an alert in Agent Settings → Alerts. When creating an alert, you can select a red flag as the trigger field, either by individual flag name (e.g. hallucination_detected, negative_sentiment) or by Red Flag Count, which fires when the number of red flags of any type exceeds a threshold within a time window.

[Image: Create Alert modal showing red flag fields available as trigger conditions]

You can configure:
  • Trigger field — a specific red flag, a data field, or Red Flag Count
  • Operator and value — e.g. Equals FAIL, greater than 3
  • Count vs Percentage — fire when N calls match, or when X% of calls match
  • Time window — evaluate the condition over the last 1 hour, 24 hours, etc.
See How to Configure Real-Time Alerts for a full step-by-step guide.
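
The difference between the Count and Percentage modes is easiest to see with numbers. The sketch below is a simplified model of an alert condition evaluated over a time window, not Tuner's implementation: the `should_fire` function, its parameters, the record layout, and the choice of a strict "greater than" comparison are all assumptions.

```python
from datetime import datetime, timedelta


def should_fire(calls, flag_name, now, window, threshold, mode="count"):
    """Decide whether an alert on `flag_name` fires for the given window.

    mode="count":      fire when more than `threshold` calls carry the flag.
    mode="percentage": fire when more than `threshold` percent of calls do.
    """
    # Only calls inside the time window are considered.
    recent = [c for c in calls if now - window <= c["timestamp"] <= now]
    matching = sum(1 for c in recent if flag_name in c["red_flags"])
    if mode == "count":
        return matching > threshold
    if mode == "percentage":
        if not recent:
            return False  # no calls in the window, nothing to alert on
        return 100 * matching / len(recent) > threshold
    raise ValueError(f"unknown mode: {mode!r}")


now = datetime(2024, 1, 1, 12, 0)
calls = [
    {"timestamp": now - timedelta(minutes=5), "red_flags": ["hallucination_detected"]},
    {"timestamp": now - timedelta(minutes=20), "red_flags": []},
    {"timestamp": now - timedelta(minutes=40), "red_flags": ["hallucination_detected", "negative_sentiment"]},
    {"timestamp": now - timedelta(minutes=55), "red_flags": []},
    {"timestamp": now - timedelta(hours=3), "red_flags": ["hallucination_detected"]},  # outside the window
]
hour = timedelta(hours=1)

# 2 of the 4 calls in the last hour carry the flag.
print(should_fire(calls, "hallucination_detected", now, hour, threshold=3))  # False
print(should_fire(calls, "hallucination_detected", now, hour, threshold=40, mode="percentage"))  # True
```

With low call volume, a percentage threshold reacts to a couple of flagged calls while a count threshold stays quiet; at high volume the reverse tends to hold, which is a reason to pick the mode based on expected traffic.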

Next Steps

How to Configure Real-Time Alerts

Route red flag notifications to Slack, email, or a webhook.

Diagnose Your Agent

Use the MCP agent to find root causes behind recurring red flags.