# Voice Simulation

Voice simulation (VoiceSim) evaluates how changes to voice configuration parameters affect call quality. It runs configurations across a structured set of scenarios and scores the results.

## The Problem

Voice agents have many interdependent configuration parameters - barge-in sensitivity, speech speed, empathy responsiveness, silence tolerance, filler behavior, and more. These parameters interact in ways that are difficult to predict. A barge-in threshold that works well for normal conversations may cause problems during crisis calls. A speech speed that feels natural in short calls may become exhausting in long ones.

Manual tuning is slow and incomplete: you adjust one parameter, make a few test calls, and check the results, which leaves most parameter combinations untested. VoiceSim replaces this with structured exploration across the full configuration space.

## How It Works

VoiceSim treats voice configuration as a multi-dimensional space where each dimension represents a tunable parameter. The system:

1. **Defines the search space.** Each voice parameter is mapped to a dimension with discrete quantization bins. This converts continuous tuning into a structured grid that can be searched efficiently.
2. **Samples configurations.** Latin Hypercube Sampling (LHS) generates an initial set of configurations that covers each dimension's range evenly, so samples do not cluster in any one region of the space.
3. **Runs scenarios.** Each configuration is evaluated against a set of built-in scenarios that exercise different aspects of voice behavior - normal conversations, long multi-turn calls, crisis situations, frequent barge-ins, silent callers, and speech recognition failures.
4. **Scores results.** A quality oracle scores each (configuration, scenario) pair based on penalties for specific failures: dead air, greeting interruption, crisis response delays, context loss, and safety violations.
5. **Reports findings.** Results are aggregated into best-per-scenario rankings, penalty frequency analysis, and configuration diffs from production defaults.
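
Steps 1 and 2 above can be illustrated with a minimal sketch of Latin Hypercube Sampling over quantized bins. The dimension names, bin values, and the `latin_hypercube` helper are hypothetical and chosen only for illustration; they are not VoiceSim's actual registry or sampler.

```python
import random

# Hypothetical dimensions and bins; the real VoiceSim registry is not shown here.
SEARCH_SPACE = {
    "barge_in_min_speech_ms": [100, 200, 300, 400, 500],
    "speech_rate": [0.8, 0.9, 1.0, 1.1, 1.2],
    "silence_timeout_s": [2, 4, 6, 8, 10],
}


def latin_hypercube(space, n_samples, seed=0):
    """Draw n_samples configurations with one point per stratum in every dimension."""
    rng = random.Random(seed)
    columns = {}
    for name, bins in space.items():
        # Split [0, 1) into n_samples equal strata and draw one point inside each.
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired at random across dimensions.
        rng.shuffle(points)
        # Quantize each continuous point to one of the dimension's discrete bins.
        columns[name] = [
            bins[min(int(p * len(bins)), len(bins) - 1)] for p in points
        ]
    # Zip the per-dimension columns into one configuration per sample.
    return [{name: columns[name][i] for name in space} for i in range(n_samples)]


configs = latin_hypercube(SEARCH_SPACE, n_samples=5)
for config in configs:
    print(config)
```

Because the sample count equals the bin count in this example, every bin of every dimension appears exactly once across the five configurations. Independent random sampling does not guarantee that kind of coverage.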

## Configuration Dimensions

VoiceSim evaluates 18 primary dimensions covering the core voice parameters:

| Category     | Dimensions                                                                 |
| ------------ | -------------------------------------------------------------------------- |
| **Barge-in** | Minimum speech duration, shield duration, cooldown period                  |
| **Speed**    | Base speech rate, rate adjustments for emotion/complexity                  |
| **Empathy**  | Emotional responsiveness, empathy escalation thresholds                    |
| **Safety**   | Crisis detection sensitivity, escalation triggers                          |
| **Context**  | Context window size, history summarization thresholds                      |
| **Silence**  | Silence timeout, backchanneling frequency                                  |
| **Tools**    | Tool execution timeouts, concurrent tool capacity                          |
| **Filler**   | Filler style (phrase, backchannel, silent), vocabulary, backchannel timing |
| **Response** | Maximum sentences per response, maximum words per response                 |

These dimensions map directly to the per-service voice configuration, so optimal settings discovered through VoiceSim can be applied to production services immediately. The dimension registry is extensible - new dimensions can be added as you discover parameters that affect call quality in your specific use case.
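
As a rough illustration of an extensible registry and a configuration diff from production defaults, the sketch below uses a plain dictionary of dimension records. The `Dimension` class, the `register` and `diff_from_defaults` helpers, and every parameter name, bin, and default value are assumptions made for this example, not the platform's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    name: str
    category: str
    bins: tuple       # allowed discrete values
    default: object   # assumed production default


REGISTRY = {}


def register(dim):
    """Add a dimension; reject defaults that fall outside the allowed bins."""
    if dim.default not in dim.bins:
        raise ValueError(f"default for {dim.name} must be one of its bins")
    REGISTRY[dim.name] = dim


# Hypothetical entries; real names and values will differ.
register(Dimension("barge_in_cooldown_ms", "Barge-in", (250, 500, 1000), 500))
register(Dimension("max_sentences_per_response", "Response", (2, 3, 4), 3))
register(Dimension("filler_style", "Filler", ("phrase", "backchannel", "silent"), "phrase"))


def diff_from_defaults(config):
    """Return {name: (default, candidate)} for every value that differs from its default."""
    return {
        name: (REGISTRY[name].default, value)
        for name, value in config.items()
        if value != REGISTRY[name].default
    }


candidate = {
    "barge_in_cooldown_ms": 500,
    "max_sentences_per_response": 2,
    "filler_style": "silent",
}
changes = diff_from_defaults(candidate)
print(changes)
```

Keeping the default alongside each dimension means a diff only needs the candidate configuration as input, which keeps reports short when most parameters are unchanged.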

## Built-in Scenarios

VoiceSim ships with 8 scenarios designed to cover the most common voice quality issues:

| Scenario                 | What It Tests                                                     |
| ------------------------ | ----------------------------------------------------------------- |
| **Normal conversation**  | Baseline 10-turn interaction                                      |
| **Long conversation**    | 50-turn interaction testing context coherence over extended calls |
| **Greeting barge-in**    | Caller speaks during the agent's greeting                         |
| **Frequent barge-in**    | Repeated interruptions throughout the call                        |
| **Crisis at turn 10**    | Caller becomes distressed early in the conversation               |
| **Crisis at turn 40**    | Caller becomes distressed late in a long conversation             |
| **Silent caller**        | Extended pauses between caller turns                              |
| **STT failure mid-call** | Speech recognition degrades during the conversation               |
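
To make the penalty-based scoring and best-per-scenario ranking concrete, here is a small sketch. The penalty categories come from the scoring step described earlier; the weights, the base score of 10, the configuration IDs, and the penalty counts are invented for illustration and do not reflect the actual quality oracle.

```python
# Hypothetical weights per penalty category; the real oracle's weights are not documented here.
PENALTY_WEIGHTS = {
    "dead_air": 1.0,
    "greeting_interruption": 0.5,
    "crisis_response_delay": 3.0,
    "context_loss": 1.5,
    "safety_violation": 5.0,
}


def score(penalty_counts, base=10.0):
    """Subtract weighted penalties from a base score, floored at zero."""
    total = sum(PENALTY_WEIGHTS[kind] * count for kind, count in penalty_counts.items())
    return max(0.0, base - total)


def best_per_scenario(results):
    """Keep the highest-scoring configuration for each scenario."""
    best = {}
    for config_id, scenario, penalties in results:
        value = score(penalties)
        if scenario not in best or value > best[scenario][1]:
            best[scenario] = (config_id, value)
    return best


# Made-up (configuration, scenario, penalty counts) results.
results = [
    ("cfg-a", "Silent caller", {"dead_air": 2}),
    ("cfg-b", "Silent caller", {"dead_air": 1}),
    ("cfg-a", "Crisis at turn 40", {"crisis_response_delay": 1}),
    ("cfg-b", "Crisis at turn 40", {"crisis_response_delay": 1, "context_loss": 1}),
]
ranking = best_per_scenario(results)
print(ranking)
```

In this toy data no single configuration wins every scenario, which is the situation a per-scenario ranking is meant to surface.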

## Full-Fidelity Simulation

VoiceSim runs execute the same reasoning pipeline as live calls. Each simulation step processes actual language model conversations, actual tool execution, and actual empathy classification through the platform's [reasoning engine](https://docs.amigo.ai/agent/reasoning-engine). There is no simplified or mocked version of the agent logic - what you test is what runs in production.

This means simulation results reflect actual agent behavior under each configuration, including:

* **Tool execution** - Simulated calls invoke the same tools (scheduling, patient lookup, FHIR operations) as live calls, with results feeding back into the conversation.
* **Empathy classification** - The agent's emotional responsiveness is evaluated in real time, so configurations that affect empathy behavior produce authentic results.
* **Call intelligence** - Each simulation step produces the same quality scores and analytics that call intelligence generates for live calls, so you can compare configurations using the same scoring as your production monitoring.

## Write Isolation

Simulated tool calls (patient creation, appointment scheduling) execute against isolated database branches so they do not affect production data. Each simulation run gets an ephemeral copy-on-write branch that is automatically cleaned up after the run completes.
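
The idea of an ephemeral write overlay can be sketched in memory with Python's `collections.ChainMap`. This is only an analogy under simplifying assumptions: the `ephemeral_branch` helper and the record keys are hypothetical, and the platform's database branching is not implemented this way.

```python
from collections import ChainMap
from contextlib import contextmanager

# Stand-in for production data; keys and records are made up.
production = {"patient:1": {"name": "Existing Patient"}}


@contextmanager
def ephemeral_branch(base):
    """Yield a view where reads fall through to base and writes stay in an overlay."""
    branch = ChainMap({}, base)  # ChainMap sends all writes to the first (empty) map
    try:
        yield branch
    finally:
        # Discard the overlay when the run ends.
        branch.maps[0].clear()


with ephemeral_branch(production) as db:
    # A simulated tool call writes a new record into the branch only.
    db["appointment:42"] = {"patient": "patient:1", "slot": "09:00"}
    seen_in_branch = "appointment:42" in db
    patient_name = db["patient:1"]["name"]  # reads fall through to the base data

# Note: this overlay is shallow. Mutating a nested record read from the base
# would still change it, which a real copy-on-write branch prevents.
print(seen_in_branch, patient_name, "appointment:42" in production)
```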

## Using VoiceSim

VoiceSim is available through the Developer Console, the Agent Forge CLI, and the Platform API:

* **Developer Console** - The simulations page provides a real-time exploration UI with score timelines, scenario radar charts, configuration diffs, and live progress feeds.
* **Agent Forge CLI** - `forge platform sim` commands for creating runs, sampling configurations, evaluating points, and reviewing results from the terminal.
* **Platform API** - Full REST API for programmatic integration with CI/CD pipelines and custom orchestration workflows.

{% hint style="info" %}
**Developer Guide** - For API endpoints and request/response schemas, see the [Voice Simulation](https://docs.amigo.ai/developer-guide/platform-api/voice-simulation) reference in the developer guide.
{% endhint %}
