# Voice Judge

The Voice Judge evaluates voice agent calls directly from audio recordings, scoring each call across 10 quality dimensions. Scores measure the voice experience - pronunciation, clarity, pacing, interruption handling - independent of conversational logic or agent prompting.

Voice Judge results are produced by a scheduled batch evaluation job, so scores for a call are not available immediately after the call ends. Once scored, results are available per service through the API.

## Endpoints

### List Recent Voice Judge Results

{% hint style="info" %}
**Latency**: 500ms-2s. This endpoint reads from the analytics warehouse, not the primary database.
{% endhint %}

`GET /v1/{workspace_id}/services/{service_id}/voice-judge/recent`

Returns the most recent per-call voice quality scores for a service, ordered newest first.

#### Path Parameters

| Parameter      | Type          | Description          |
| -------------- | ------------- | -------------------- |
| `workspace_id` | string (UUID) | Workspace identifier |
| `service_id`   | string (UUID) | Service identifier   |

#### Query Parameters

| Parameter | Type    | Default | Description                         |
| --------- | ------- | ------- | ----------------------------------- |
| `limit`   | integer | 20      | Max rows to return. Min 1, max 100. |
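A request to this endpoint could be sketched as follows, using only the Python standard library. The `BASE_URL` host and the bearer-token `Authorization` header are assumptions for illustration; this page does not specify either, so substitute the values for your deployment. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical host: this page does not state the API base URL.
BASE_URL = "https://api.example.com"


def recent_results_url(workspace_id: str, service_id: str, limit: int = 20) -> str:
    """Build the URL for the recent voice judge results endpoint."""
    # The documented bounds for `limit` are 1 to 100 inclusive.
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    path = f"/v1/{workspace_id}/services/{service_id}/voice-judge/recent"
    query = urllib.parse.urlencode({"limit": limit})
    return f"{BASE_URL}{path}?{query}"


def fetch_recent_results(workspace_id: str, service_id: str, token: str, limit: int = 20) -> dict:
    """GET the endpoint and decode the JSON body.

    Bearer authentication is an assumption, not something this page documents.
    """
    request = urllib.request.Request(
        recent_results_url(workspace_id, service_id, limit),
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
    # The endpoint reads from the analytics warehouse (500ms-2s typical),
    # so allow a generous timeout.
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Validating `limit` on the client avoids a round trip for a request that falls outside the documented range.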

#### Response

```json
{
  "service_id": "string",
  "count": 1,
  "items": [
    {
      "call_sid": "string",
      "call_entity_id": "string | null",
      "service_id": "string | null",
      "latency_dead_air_score": 0.9,
      "pronunciation_score": 1.0,
      "clarity_score": 0.25,
      "filler_silence_score": 0.5,
      "interruption_handling_score": 1.0,
      "audio_consistency_score": 1.0,
      "pacing_score": 1.0,
      "warmth_tone_score": 1.0,
      "accent_quality_score": 1.0,
      "voice_identity_score": 1.0,
      "overall_score": 0.865,
      "critical_count": 0,
      "flag_count": 1,
      "warning_count": 1,
      "judge_json": "string | null",
      "computed_at": "2026-05-15T12:00:00Z"
    }
  ]
}
```

#### Response Fields

| Field        | Type    | Description                         |
| ------------ | ------- | ----------------------------------- |
| `service_id` | string  | The service these results belong to |
| `count`      | integer | Number of items returned            |
| `items`      | array   | List of voice judge result rows     |

#### Voice Judge Result Row

| Field                         | Type                      | Description                                                                                      |
| ----------------------------- | ------------------------- | ------------------------------------------------------------------------------------------------ |
| `call_sid`                    | string                    | Call identifier                                                                                  |
| `call_entity_id`              | string or null            | Entity identifier for the call                                                                   |
| `service_id`                  | string or null            | Service identifier                                                                               |
| `latency_dead_air_score`      | number or null            | Response latency and dead air score (P0). 0.0-1.0                                                |
| `pronunciation_score`         | number or null            | Pronunciation accuracy score (P0). 0.0-1.0                                                       |
| `clarity_score`               | number or null            | Speech clarity and intelligibility score (P0). 0.0-1.0                                           |
| `filler_silence_score`        | number or null            | Filler and silence management score (P1). 0.0-1.0                                                |
| `interruption_handling_score` | number or null            | Barge-in and recovery score (P1). 0.0-1.0                                                        |
| `audio_consistency_score`     | number or null            | Audio consistency score (P1). 0.0-1.0                                                            |
| `pacing_score`                | number or null            | Speech rate and pausing score (P2). 0.0-1.0                                                      |
| `warmth_tone_score`           | number or null            | Emotional tone appropriateness score (P2). 0.0-1.0                                               |
| `accent_quality_score`        | number or null            | Language and accent match score (P2). 0.0-1.0                                                    |
| `voice_identity_score`        | number or null            | Voice consistency score across the call (P2). 0.0-1.0                                            |
| `overall_score`               | number or null            | Composite score (arithmetic mean of dimension scores). 0.0-1.0                                   |
| `critical_count`              | integer or null           | Number of dimensions with Critical severity                                                      |
| `flag_count`                  | integer or null           | Number of dimensions with Flag severity                                                          |
| `warning_count`               | integer or null           | Number of dimensions with Warning severity                                                       |
| `judge_json`                  | string or null            | Raw judge output with per-dimension evidence quotes and severity. Opaque string for UI drill-in. |
| `computed_at`                 | string (ISO 8601) or null | When the evaluation was computed                                                                 |
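Because `overall_score` is described as the arithmetic mean of the dimension scores, it can be recomputed from a row as a sanity check. The sketch below assumes that `null` dimensions are excluded from the mean; this page does not say how missing dimensions are handled. The function name is hypothetical.

```python
from statistics import fmean
from typing import Optional

# The ten dimension fields listed in the result row table.
DIMENSION_FIELDS = [
    "latency_dead_air_score",
    "pronunciation_score",
    "clarity_score",
    "filler_silence_score",
    "interruption_handling_score",
    "audio_consistency_score",
    "pacing_score",
    "warmth_tone_score",
    "accent_quality_score",
    "voice_identity_score",
]


def mean_dimension_score(row: dict) -> Optional[float]:
    """Arithmetic mean of the non-null dimension scores in a result row.

    Skipping null values is an assumption; returns None if no score is present.
    """
    scores = [row[name] for name in DIMENSION_FIELDS if row.get(name) is not None]
    return fmean(scores) if scores else None
```

Applied to the sample response above, the ten scores sum to 8.65, which gives a mean of 0.865 and matches the sample `overall_score`.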

#### Score Interpretation

All dimension scores range from 0.0 to 1.0:

| Score Range | Severity | Meaning                        |
| ----------- | -------- | ------------------------------ |
| 0.75 - 1.0  | None     | Meets the bar                  |
| 0.5 - 0.74  | Warning  | Minor quality pattern detected |
| 0.25 - 0.49 | Flag     | Notable quality issue          |
| 0.0 - 0.24  | Critical | Significant quality problem    |
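The table can be expressed as a small lookup. This sketch assumes the boundaries sit at 0.75, 0.5, and 0.25 with the lower bound inclusive, so that a value such as 0.745, which falls between the printed ranges, is treated as Warning; the page does not state how such values are classified. The function names are hypothetical.

```python
from collections import Counter
from typing import Iterable


def severity(score: float) -> str:
    """Map a dimension score in [0.0, 1.0] to its severity label.

    Thresholds at 0.75 / 0.5 / 0.25 (lower bound inclusive) are an assumption
    derived from the printed ranges.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= 0.75:
        return "None"
    if score >= 0.5:
        return "Warning"
    if score >= 0.25:
        return "Flag"
    return "Critical"


def count_severities(scores: Iterable[float]) -> Counter:
    """Count how many scores fall into each severity band."""
    return Counter(severity(score) for score in scores)
```

For the sample response, the `clarity_score` of 0.25 maps to Flag and the `filler_silence_score` of 0.5 maps to Warning, consistent with the sample's `flag_count` and `warning_count` of 1 each.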

#### Error Responses

| Status | Description                                                   |
| ------ | ------------------------------------------------------------- |
| 404    | Service not found in this workspace                           |
| 503    | Analytics warehouse not configured or transiently unavailable |

A 200 response with an empty `items` list means no calls have been evaluated yet for this service. A 503 response means the analytics infrastructure is temporarily unavailable - retry after a short delay.
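One way to handle a 503 is a bounded retry with exponential backoff. The sketch below is an illustration, not a documented client: `fetch` is a hypothetical caller-supplied function that performs the request and returns a `(status, body)` tuple, and the attempt count and delays are arbitrary starting points.

```python
import time
from typing import Any, Callable, Tuple


def get_with_retry(
    fetch: Callable[[], Tuple[int, Any]],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call `fetch` until it returns a status other than 503 or attempts run out.

    Waits base_delay, then 2 * base_delay, and so on between attempts.
    Returns the last (status, body) pair seen.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = fetch()
        if status != 503:
            return status, body
        # Do not sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return status, body
```

Only 503 is retried here. A 404 indicates the service does not exist in the workspace, so repeating the request would not help.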

## Dimensions

The voice judge evaluates 10 dimensions, grouped by priority:

### P0 - Critical Quality

* **Latency and Dead Air** - Response latency between turns. Flags prolonged silence (>3s between turns) and extended processing waits without verbal acknowledgment.
* **Pronunciation** - Correct pronunciation of medical terms, drug names, dates, numbers, and patient names. Critical on any factual read-back error.
* **Clarity** - Speech intelligibility and clean audio output. Critical on garbled or unintelligible speech.

### P1 - Important Quality

* **Filler and Silence Management** - Graceful handling of processing pauses. Verbal acknowledgment before a pause, no dead air during the wait, no repeated filler phrases, and filler that matches the result being delivered.
* **Interruption Handling** - Clean barge-in behavior. Agent stops when the caller speaks, no false triggers on background noise, smooth recovery after being interrupted.
* **Audio Consistency** - Absence of volume spikes, pitch anomalies, mid-word cutoffs, or inconsistent voice timbre across turns.

### P2 - Quality Polish

* **Pacing** - Conversational speech rate with appropriate pauses between pieces of information and slower delivery for sensitive content.
* **Warmth and Tone** - Emotional appropriateness matched to the caller's state. Flags flat affect, tonal mismatches, or inappropriate emotional tone.
* **Accent and Language Quality** - Language and accent match to the caller. Critical on wrong language delivery.
* **Voice Identity** - Consistent agent voice and persona across the entire call.
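The priority grouping can be used to triage a result row, for example by finding the weakest dimension in each tier. The mapping below follows the P0/P1/P2 labels in the result row table; the function name is hypothetical, and skipping `null` scores is an assumption.

```python
from typing import Dict, Tuple

# Priority tier for each dimension field, as labeled in the result row table.
PRIORITY_BY_FIELD = {
    "latency_dead_air_score": "P0",
    "pronunciation_score": "P0",
    "clarity_score": "P0",
    "filler_silence_score": "P1",
    "interruption_handling_score": "P1",
    "audio_consistency_score": "P1",
    "pacing_score": "P2",
    "warmth_tone_score": "P2",
    "accent_quality_score": "P2",
    "voice_identity_score": "P2",
}


def lowest_score_per_tier(row: dict) -> Dict[str, Tuple[str, float]]:
    """Return {tier: (field, score)} for the lowest non-null score in each tier.

    Ties keep the field that appears first in PRIORITY_BY_FIELD.
    Tiers with no scored dimension are omitted.
    """
    worst: Dict[str, Tuple[str, float]] = {}
    for field, tier in PRIORITY_BY_FIELD.items():
        score = row.get(field)
        if score is None:
            continue
        if tier not in worst or score < worst[tier][1]:
            worst[tier] = (field, score)
    return worst
```

Checking the P0 entry first surfaces the issues this page ranks as most important before the polish-level dimensions.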


