Simulations
Automated agent testing with personas, scenarios, unit tests, and configurable success criteria.
Amigo's simulation system is an evaluation and testing framework for validating agent behavior before deploying to production. It enables you to define simulated users (personas), test scenarios, and success criteria, then run automated conversations to measure how your agent performs.
How Simulations Work
The simulation system uses five building blocks that compose together:
Building Blocks
Personas: Simulated user profiles with a background, role, and preferred language. Versioned so you can iterate on persona definitions without breaking existing tests.
Scenarios: Conversation scripts that define the objective, instructions for the simulated user, and how the conversation starts. Also versioned.
Unit Tests: Combine a persona, a scenario, a service (with version set), and success criteria (metrics with thresholds) into a single test case.
Unit Test Sets: Group multiple unit tests together, each with a configurable run count, to form a test suite.
Unit Test Set Runs: Execute a unit test set. The platform runs all unit tests, evaluates metrics, and produces downloadable artifacts with the results.
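To make the composition concrete, here is a minimal local sketch of how the building blocks relate to one another. Every class and field name below is an illustrative assumption, not the platform's actual schema; consult the API reference for the real request and response shapes.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# NOTE: all names here are hypothetical and only mirror the concepts
# described above; they are not the platform's real data model.


@dataclass(frozen=True)
class Persona:
    id: str
    background: str
    role: str
    preferred_language: str


@dataclass(frozen=True)
class Scenario:
    id: str
    objective: str
    instructions: str
    conversation_start: str


@dataclass(frozen=True)
class SuccessCriterion:
    metric: str        # name of a conversation metric
    threshold: float   # value the metric is compared against


@dataclass(frozen=True)
class UnitTest:
    id: str
    persona_id: str    # references a persona by ID
    scenario_id: str   # references a scenario by ID
    service_id: str
    version_set: str
    criteria: Tuple[SuccessCriterion, ...]


@dataclass
class UnitTestSet:
    id: str
    # Maps unit test ID to the number of times that test is run.
    run_counts: Dict[str, int] = field(default_factory=dict)

    def total_conversations(self) -> int:
        """Number of simulated conversations one run of this set produces."""
        return sum(self.run_counts.values())


test = UnitTest(
    id="ut-1",
    persona_id="persona-new-user",
    scenario_id="scenario-reset-password",
    service_id="service-support",
    version_set="release",
    criteria=(SuccessCriterion(metric="task_completed", threshold=1.0),),
)
suite = UnitTestSet(id="suite-1", run_counts={test.id: 5, "ut-2": 3})
print(suite.total_conversations())
```

The unit test holds only IDs for its persona and scenario, which is what allows those definitions to change without the test itself being recreated.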
Typical Workflow
Define personas that represent different user archetypes (e.g., "confused new user", "expert power user", "frustrated customer").
Define scenarios that describe what the simulated user is trying to accomplish and how the conversation should start.
Create unit tests that pair a persona with a scenario, target a specific service and version set, and set success criteria based on conversation metrics.
Group unit tests into sets with run counts (e.g., run each test 5 times so that a single unusual conversation does not decide the result).
Execute runs and review artifacts to see whether your agent meets the defined success criteria.
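When reviewing results from repeated runs, it can help to reduce them to a pass rate per unit test. The sketch below assumes that a run passes only when every metric is at or above its threshold, and that the metrics for each run are available as a plain dictionary; the metric names, the comparison direction, and the data shape are all assumptions rather than the platform's artifact format.

```python
from typing import Dict, List


def run_passes(metrics: Dict[str, float], thresholds: Dict[str, float]) -> bool:
    """A run passes if every thresholded metric meets its minimum.

    Assumption: higher is better, and a missing metric counts as a failure.
    """
    return all(
        metrics.get(name, float("-inf")) >= minimum
        for name, minimum in thresholds.items()
    )


def pass_rate(runs: List[Dict[str, float]], thresholds: Dict[str, float]) -> float:
    """Fraction of runs that satisfy all success criteria."""
    if not runs:
        return 0.0
    return sum(run_passes(m, thresholds) for m in runs) / len(runs)


# Hypothetical metric names and values for five runs of one unit test.
thresholds = {"task_completion": 0.8, "politeness": 0.7}
runs = [
    {"task_completion": 0.90, "politeness": 0.80},  # passes
    {"task_completion": 0.85, "politeness": 0.75},  # passes
    {"task_completion": 0.70, "politeness": 0.90},  # fails task_completion
    {"task_completion": 0.95, "politeness": 0.60},  # fails politeness
    {"task_completion": 0.80, "politeness": 0.70},  # passes (at threshold)
]
print(pass_rate(runs, thresholds))
```

Looking at the pass rate across several runs, rather than a single conversation, gives a steadier view of agent behavior because simulated conversations vary from run to run.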
Versioning
Personas and scenarios are versioned independently. When you update a persona's background or a scenario's instructions, you create a new version. Unit tests reference the persona and scenario by ID and always use the latest version at run time. This lets you iterate on test definitions without recreating unit tests.
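The latest-version behavior can be pictured with a small in-memory model. This is only a sketch of the resolution rule described above; the `VersionedStore` class and its methods are hypothetical and do not correspond to any platform API.

```python
from typing import Dict, List


class VersionedStore:
    """Hypothetical append-only store: each update adds a new version."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}

    def save(self, entity_id: str, content: str) -> int:
        """Store a new version and return its 1-based version number."""
        versions = self._versions.setdefault(entity_id, [])
        versions.append(content)
        return len(versions)

    def latest(self, entity_id: str) -> str:
        """Return the most recently saved version."""
        return self._versions[entity_id][-1]


personas = VersionedStore()
personas.save("persona-new-user", "First-time user, unsure where settings live.")

# The unit test stores only the persona ID, never a version number.
unit_test = {"id": "ut-1", "persona_id": "persona-new-user"}

# Updating the persona creates version 2; the unit test is left untouched.
new_version = personas.save(
    "persona-new-user", "First-time user on mobile, unsure where settings live."
)

# At run time the ID resolves to whatever version is latest.
resolved = personas.latest(unit_test["persona_id"])
print(new_version, resolved)
```

One consequence worth keeping in mind: because resolution happens at run time, editing a persona or scenario changes the behavior of every unit test that references it on the next run.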
Tool Execution Modes
During simulations, tools are invoked with invocation_mode: "conversation-simulation" instead of "regular". This lets your tools mock external calls and avoid side effects. See Tools: Execution Modes for implementation details.
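A tool can branch on the invocation mode to skip real side effects. The sketch below is a minimal illustration: the two mode strings come from the text above, but the function name, the way the mode is passed in, and the response shape are assumptions. See Tools: Execution Modes for how the mode is actually delivered to your tool.

```python
from typing import Callable, Dict, List

SIMULATION_MODE = "conversation-simulation"
REGULAR_MODE = "regular"


def book_appointment(
    slot: str,
    invocation_mode: str,
    call_backend: Callable[[str], Dict[str, object]],
) -> Dict[str, object]:
    """Hypothetical tool handler that avoids side effects in simulations.

    `call_backend` stands in for whatever external call the real tool makes.
    """
    if invocation_mode == SIMULATION_MODE:
        # Return a realistic canned response without touching external systems.
        return {"slot": slot, "confirmed": True, "simulated": True}
    return call_backend(slot)


# Stand-in backend that records calls so the behavior is easy to observe.
backend_calls: List[str] = []


def fake_backend(slot: str) -> Dict[str, object]:
    backend_calls.append(slot)
    return {"slot": slot, "confirmed": True, "simulated": False}


simulated = book_appointment("2030-01-15T09:00", SIMULATION_MODE, fake_backend)
real = book_appointment("2030-01-15T10:00", REGULAR_MODE, fake_backend)
print(simulated, real, backend_calls)
```

Keeping the mocked response in the same shape as the real one lets the agent behave the same way in simulation as in production, so the test results stay representative.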
API Categories
Personas
Simulation Personas -- Create, list, search, update, delete, and version simulated user profiles.
Scenarios
Simulation Scenarios -- Create, list, search, update, delete, and version conversation test scenarios.
Unit Tests
Simulation Unit Tests -- Create, list, search, update, and delete individual test cases.
Unit Test Sets
Simulation Unit Test Sets -- Create, list, search, update, and delete grouped test suites.
Unit Test Set Runs
Simulation Unit Test Set Runs -- Execute test suites, monitor progress, cancel runs, and download result artifacts.
Related
Core API --> Services
Core API --> Tools
Data Access --> Simulation Tables
Getting Started --> Authentication

