# Agent Forge CLI

Agent Forge is the CLI tool for managing agent configurations on the Amigo platform. It lets you create, update, version, and promote agent components programmatically rather than through the web interface.

## Installation

Agent Forge ships as a single binary with no runtime dependencies. The installer detects your OS and architecture, downloads the correct binary, verifies the SHA256 checksum, and places it on your PATH.

```bash
# macOS / Linux / WSL
curl -fsSL https://forge.platform.amigo.ai/install.sh | sh

# Windows PowerShell
irm https://forge.platform.amigo.ai/install.ps1 | iex
```

Pre-built binaries are available for macOS (Intel and Apple Silicon), Linux (amd64 and arm64), and Windows (amd64). No Python, no package manager, no dependency resolution required.

After installation, configure credentials for your workspace:

```bash
# Create environment file
cp .env.platform.example .env.platform.<your-env>
# Edit with your Platform API URL, workspace ID, and API key or identity URL

# Verify
forge auth status --platform --env <your-env>
```

## What Agent Forge Does

Agent Forge treats agent configurations as code. You sync configurations to local JSON files, make changes, and push them back to the platform. This gives you version control, reproducibility, and the ability to script deployment workflows.

### Authentication

Agent Forge supports two authentication surfaces that correspond to the two API backends:

* **Legacy backend API** - Uses Firebase device code (Google Sign-In) when configured with a Google tenant ID, or static API key credentials. This is the default when you run `forge auth login`.
* **Platform API** - Uses platform identity device code authentication (OAuth 2.0 Device Authorization Grant, RFC 8628). Activate this path with the `--platform` flag: `forge auth login --platform`.

Both flows follow the same user experience. Forge displays a short user code and opens your browser to an approval page. You verify that the code shown in the browser matches the code in your terminal and approve the request. Forge then receives an access token and refresh token automatically - no manual token management required. These flows also work in headless environments, SSH sessions, and CI pipelines where a browser cannot be opened locally: the verification URL and code can be entered from a browser on any other device.

Tokens for each surface are cached independently in the system keyring. An expired access token is silently refreshed using the stored refresh token without requiring re-authentication.

#### Environment Configuration

Platform API authentication reads from `.env.platform.<env>` (preferred) or falls back to `.env.<env>`. The following variables control the platform auth path:

| Variable                | Required     | Description                                               |
| ----------------------- | ------------ | --------------------------------------------------------- |
| `PLATFORM_API_URL`      | Yes          | Base URL for the Platform API                             |
| `PLATFORM_WORKSPACE_ID` | Yes          | Workspace to authenticate against                         |
| `PLATFORM_API_KEY`      | One of these | Static API key (no login required)                        |
| `IDENTITY_URL`          | One of these | Platform identity service URL (enables device code login) |

If `PLATFORM_API_KEY` is set, Forge uses it as a static bearer token. If `IDENTITY_URL` is set instead, Forge uses the device code flow via `forge auth login --platform`.
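This precedence can be pictured as a plain shell check - an illustration of the rule above, not Forge's actual source; both values below are placeholders:

```bash
# Illustration of the credential-selection rule above (not Forge's source).
# Both values are placeholders for this sketch.
PLATFORM_API_KEY=""                          # unset: no static key configured
IDENTITY_URL="https://identity.example.com"  # identity service configured

if [ -n "$PLATFORM_API_KEY" ]; then
  echo "auth: static bearer token"
elif [ -n "$IDENTITY_URL" ]; then
  echo "auth: device code flow via forge auth login --platform"
else
  echo "auth: not configured"
fi
```

Because the static key wins when both are present, leave `PLATFORM_API_KEY` unset in environments where you want interactive device code login.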

Legacy backend authentication continues to read from `.env.<env>` using `API_KEY`, `API_KEY_ID`, `API_KEY_USER_ID`, or `GOOGLE_TENANT_ID`.

Forge-native configuration fields are automatically translated to platform-native equivalents at deployment time. For example, audio filler phrases defined in Forge tool specs are converted to the platform's progress hint format, so agents configured through Forge work without manual migration.

#### Auth Commands

```bash
# Legacy backend login (Firebase device code)
forge auth login -e myorg

# Platform API login (platform identity device code)
forge auth login --platform -e myorg

# Check auth status
forge auth status -e myorg
forge auth status --platform -e myorg

# Clear cached credentials
forge auth logout -e myorg
forge auth logout --platform -e myorg
```

The `--platform` flag is available on `login`, `logout`, and `status` subcommands. Without it, all auth commands operate on the legacy backend credentials.

### Entity Types

Agent Forge manages the following entity types:

* **Agents**: Persona, background, directives, and communication style
* **Context graphs**: Problem structure, states, transitions, and safety boundaries
* **Dynamic behaviors**: Runtime behaviors with triggers and response logic
* **Metrics**: Evaluation criteria, scoring rubrics, and custom metric definitions
* **Personas**: Synthetic user profiles for simulation testing (the primary way to manage personas)
* **Scenarios**: Test situations for simulation testing
* **Unit test sets**: Groups of tests with success criteria

## Pre-Sync Validation

Agent Forge validates context graphs before syncing to the platform and surfaces warnings for common authoring mistakes. Validation runs automatically during `sync-to-remote` with no additional configuration.

### Canonical Value Lint

The canonical value lint detects phone numbers, email addresses, and URLs hardcoded into context graph state prose. Inline canonical values cause silent data drift - when graphs are cloned or updated, hardcoded digits can be accidentally mutated, and the agent reads incorrect information to callers.

The validator scans prose fields in every state (descriptions, instructions, boundary constraints, exit conditions, and action descriptions) and emits a warning for each match, identifying the state, field, and value. It catches phone numbers in digit form (e.g., `555-010-1234`), phone numbers in spelled-out TTS form (e.g., "five five five zero one zero..."), email addresses, and URLs.

To fix a warning, move the canonical value into structured context - such as a location entity in the world model or a workspace setting - and reference it abstractly in the state prose.
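As a rough sketch of what the digit-form check looks for - the regex here is an assumption for illustration, not Forge's actual detector:

```bash
# Rough illustration of the digit-form phone pattern the lint flags.
# The regex is an assumption for this sketch, not Forge's actual detector.
prose='Call us back at 555-010-1234 to confirm your appointment.'
if printf '%s\n' "$prose" | grep -Eq '[0-9]{3}-[0-9]{3}-[0-9]{4}'; then
  echo "warning: inline phone number in state prose"
fi
```

Prose that references the number abstractly ("the clinic phone from the world model") passes the same check, which is the shape of the recommended fix.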

## Core Operations

### Sync to Local

Pull configurations from the platform to your local file system:

```bash
# Pull all active agents
forge sync-to-local --entity-type agent --active-only

# Pull context graphs with a specific tag
forge sync-to-local --entity-type context_graph --tag emergency

# Pull evaluation framework components
forge sync-to-local --entity-type metric --tag accuracy
forge sync-to-local --entity-type persona --tag emergency_patient
forge sync-to-local --entity-type scenario --tag complex_symptoms
```

### Platform Insights

Query workspace data, explore schema metadata, and get health digests through the platform insights service:

```bash
# Execute a SQL query against the workspace data warehouse
forge platform insights sql "SELECT ..." --env myorg

# Read SQL from a file
forge platform insights sql --sql-file my_query.sql --env myorg --json

# List available tables, columns, and functions
forge platform insights schema --env myorg

# Get a workspace health digest with entity counts and data quality signals
forge platform insights digest --env myorg

# Get suggested starter questions for exploring workspace data
forge platform insights suggestions --env myorg
```

### Call Trace Analysis

Access deep call understanding from the intelligence pipeline, including emotional arcs, key decision moments, coaching recommendations, and signal-response alignment:

```bash
# List recent trace analyses
forge platform trace list --env myorg

# Filter by outcome and lookback window
forge platform trace list --outcome failed --days 7 --env myorg

# Get detailed trace analysis for a specific call
forge platform trace get <call-sid> --env myorg
```

Trace analysis provides:

* **Emotional arc** - How caller sentiment evolved across the conversation
* **Key decision moments** - Critical points with quality assessment and causal attribution
* **Coaching recommendations** - Actionable improvements tied to specific call moments
* **Counterfactuals** - Alternative actions that could have changed the outcome
* **Signal-response alignment** - Whether the agent responded appropriately to caller signals
* **Interaction dynamics** - Turn-taking quality, rapport trajectory, and repair effectiveness

### Simulation Caller ID

The `session-create`, `smoke-test`, and `bridge` simulation commands accept a `--caller-id` flag to set a simulated caller phone number in E.164 format (e.g. `+16479718862`). When set, the agent resolves the number as a known caller so the session starts with full patient context - useful for testing caller-specific behavior like greeting a known patient by name or loading their clinical history. Omit the flag to simulate an unknown caller.

```bash
# Create a session with a known caller
forge platform sim session-create --service-id <uuid> --caller-id +16479718862 --env myorg

# Smoke test as a known caller
forge platform sim smoke-test --service-id <uuid> --caller-id +16479718862 --env myorg

# Bridge scenarios as a known caller
forge platform sim bridge --service-id <uuid> -o "Test known caller flow" --caller-id +16479718862 --env myorg
```

### Sync to Remote

Push local changes back to the platform:

```bash
# Push all changes to staging
forge sync-to-remote --all --apply --env staging

# Push all changes to production
forge sync-to-remote --all --apply --env production
```

Before applying changes, Agent Forge shows exactly what will be modified so you can review before confirming.

### Environment Support

Agent Forge supports separate staging and production environments. Changes are deployed to staging first, validated through testing, and then promoted to production.

```
agent-forge/
  local/
    staging/
      entity_data/
        agent/
        context_graph/
        dynamic_behavior_set/
        metric/
        persona/
        scenario/
        unit_test_set/
    production/
      entity_data/
        (same structure)
```

## Typical Workflow

1. **Pull current configurations** from the platform to your local environment.
2. **Make changes** to the JSON configuration files.
3. **Push to staging** and run your test sets to validate.
4. **Review results** and iterate if tests fail.
5. **Promote to production** after validation passes.

This workflow supports both manual changes and automated optimization. Teams can use Agent Forge directly for planned configuration updates, or set up automated pipelines that use Agent Forge to deploy and test changes as part of a continuous improvement process.

## Analytics

The `forge analyze` command group provides SQL-based exploration of workspace data directly from the CLI, replacing the need for external analytics tools.

### Query Commands

| Command                  | Description                                                                                                   |
| ------------------------ | ------------------------------------------------------------------------------------------------------------- |
| `forge analyze query`    | Execute ad-hoc SQL SELECT queries (inline or from file). Results are capped and queries are time-bounded.     |
| `forge analyze describe` | Preview a query's output schema without executing it - useful for validating JOINs and checking column types. |
| `forge analyze tables`   | List available tables in the workspace schema. Supports SQL LIKE patterns for filtering.                      |
| `forge analyze schema`   | Describe a table's columns: names, data types, and comments.                                                  |
| `forge analyze sample`   | Preview sample rows from a table (default 5, max 20).                                                         |
| `forge analyze detail`   | Rich table metadata: row count, size, partitioning, column nullability, data freshness.                       |
| `forge analyze profile`  | Profile a column's data distribution: cardinality, null rate, min/max values.                                 |
| `forge analyze catalog`  | Display the full data catalog reference offline without a database connection.                                |

### Query Templates

Pre-built analytics query templates for common patterns like conversation volume, tool performance, and metric trends:

```bash
# List templates
forge analyze template list

# Run a template with parameters
forge analyze template run conversation-volume -P days=7
```

All commands support `--json` for structured output, enabling integration with scripts and CI/CD pipelines.
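For instance, a script can post-process query output; the payload and field names below are stand-ins for illustration, not the documented response schema:

```bash
# Post-processing --json output in a script. The payload is a stand-in sample;
# field names are assumptions for illustration, not the documented schema.
payload='{"rows": [{"day": "2024-01-01", "conversations": 42}], "truncated": false}'
rows=$(printf '%s' "$payload" | python3 -c 'import json,sys; print(len(json.load(sys.stdin)["rows"]))')
echo "row count: $rows"
```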

## Voice Simulation

The `forge platform sim` command group provides CLI access to VoiceSim for exploring voice configuration space:

| Command                       | Description                                                     |
| ----------------------------- | --------------------------------------------------------------- |
| `forge platform sim create`   | Create a new simulation run                                     |
| `forge platform sim list`     | List simulation runs for the workspace                          |
| `forge platform sim sample`   | Sample and evaluate N configuration points                      |
| `forge platform sim evaluate` | Evaluate a specific configuration point against a scenario      |
| `forge platform sim get`      | Get run status and best results                                 |
| `forge platform sim summary`  | Aggregated summary with best-per-scenario and penalty frequency |
| `forge platform sim points`   | List scored points (by score or chronologically)                |
| `forge platform sim complete` | Mark a run as finished                                          |

See [Voice Simulation](/testing/testing/voice-simulation.md) for conceptual background.

### Platform Simulation Testing

For testing agent configurations through platform simulation sessions (distinct from VoiceSim configuration tuning):

| Command                           | Description                                                    |
| --------------------------------- | -------------------------------------------------------------- |
| `forge platform sim smoke-test`   | Single-turn sanity check via a tracked platform session        |
| `forge platform sim bridge`       | Multi-scenario AI-driven testing using tracked simulation runs |
| `forge platform sim run-create`   | Create a tracked simulation run                                |
| `forge platform sim run-list`     | List simulation runs with filtering                            |
| `forge platform sim run-complete` | Mark a simulation run as complete                              |

Smoke-test creates a session, sends a test message, and reports the agent's response. Bridge generates diverse scenarios, runs multi-turn conversations with an LLM caller persona, and tracks results. Both create tracked runs that appear in the Agent Performance dashboard.

## Text Conversation Smoke Tests

Agent Forge provides commands for verifying that text conversation endpoints are working correctly. These are useful during initial setup, after configuration changes, or as part of a deployment validation pipeline.

### Send Message (REST)

Send a single user message through the text conversation REST endpoint and display the agent's response:

```bash
# Start a new conversation (omit --conversation-id)
forge platform conversation send-message \
  --service-id <uuid> \
  --message "What appointments are available tomorrow?" \
  --env myorg

# Resume an existing conversation
forge platform conversation send-message \
  --service-id <uuid> \
  --message "How about 2pm?" \
  --conversation-id <uuid> \
  --env myorg
```

Omit `--conversation-id` to start a new durable conversation. Pass the returned conversation ID on subsequent calls to continue the same conversation thread.
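In a script, the conversation ID can be captured from the first `--json` response and threaded through subsequent calls; the response shape below is a stand-in for illustration:

```bash
# Capture a conversation ID from a --json response and reuse it on the next
# call. The response shape is a stand-in for illustration.
first='{"conversation_id": "abc-123", "response": "Sure - what time works for you?"}'
conv_id=$(printf '%s' "$first" | python3 -c 'import json,sys; print(json.load(sys.stdin)["conversation_id"])')
echo "next call: --conversation-id $conv_id"
```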

### WebSocket Smoke Test

Open a WebSocket text-stream connection, wait for session initialization, send one message, and verify the agent responds:

```bash
forge platform conversation text-ws-smoke \
  --service-id <uuid> \
  --env myorg

# With custom message and timeout
forge platform conversation text-ws-smoke \
  --service-id <uuid> \
  --message "Schedule me for Thursday" \
  --timeout 60 \
  --env myorg --json
```

The command reports pass or fail, the conversation ID, and the agent's response text. Use `--json` for machine-readable output in CI pipelines.
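A CI step might gate on the result like this - the JSON shape is an assumption for illustration; substitute the command's actual `--json` output:

```bash
# Gating a CI step on smoke-test output. The JSON shape is assumed for
# illustration; substitute the command's real --json output.
result='{"status": "pass", "conversation_id": "abc-123"}'
status=$(printf '%s' "$result" | python3 -c 'import json,sys; print(json.load(sys.stdin)["status"])')
if [ "$status" != "pass" ]; then
  echo "smoke test failed" >&2
  exit 1
fi
echo "smoke test passed"
```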

## Conversation Quality Check

The `forge quality check` command scans workspace conversations for agent behavioral issues - stuck loops, degenerate output, repetition, and other quality problems. It queries production conversation data directly and runs pattern-based detectors to surface problematic interactions.

```bash
# Scan last 24 hours
forge quality check <workspace-name>

# Wider window with message snippets
forge quality check <workspace-name> --days 7 --verbose

# Structured output for scripting
forge quality check <workspace-name> --json
```

### Detectors

| Detector                   | What It Finds                                                               |
| -------------------------- | --------------------------------------------------------------------------- |
| **Character degeneration** | Repeated characters, low entropy output, stuttering patterns                |
| **Stuck agent loops**      | Agent repeats the same response while the caller changes topics             |
| **Repetitive patterns**    | High similarity across sliding message windows                              |
| **Word salad**             | Incoherent output patterns like or-chains and excessive word repetition     |
| **Phantom success**        | Agent claims a tool call succeeded when the tool actually returned an error |

Results include conversation IDs, timestamps, detector names, and severity. Use `--verbose` to see the actual message excerpts that triggered each finding.
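For example, findings can be triaged by detector in a script; the payload is a sample and the field names are assumptions for illustration:

```bash
# Triage --json findings by detector (sample payload; field names assumed).
findings='[{"detector": "stuck_agent_loops", "severity": "high"},
           {"detector": "phantom_success", "severity": "high"},
           {"detector": "repetitive_patterns", "severity": "low"}]'
printf '%s' "$findings" | python3 -c '
import json, sys
from collections import Counter
counts = Counter(f["detector"] for f in json.load(sys.stdin))
for name, n in sorted(counts.items()):
    print(f"{name}: {n}")
'
```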

See [Voice Simulation](/testing/testing/voice-simulation.md) and [Drift Detection](/testing/testing/drift-detection.md) for related quality monitoring capabilities.

## Tool Testing

The `forge platform tool-test` commands let you test context graph tools without making phone calls:

| Command                            | Description                                                     |
| ---------------------------------- | --------------------------------------------------------------- |
| `forge platform tool-test resolve` | List available tools for a service with input schemas           |
| `forge platform tool-test execute` | Execute a tool with custom parameters and optional dry run mode |

## Text Conversation Testing

The `forge platform conversation` command group tests text conversations through the REST and WebSocket APIs without a phone or browser.

| Command                                     | Description                                                                                                                                                                                                  |
| ------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `forge platform conversation send-message`  | Send a single message via the REST API and print the agent's response. Supports `--conversation-id` to resume an existing conversation.                                                                      |
| `forge platform conversation text-ws-smoke` | Open a WebSocket to the text-stream endpoint, wait for the agent's greeting, send a test message, and print the response. Verifies the full WebSocket lifecycle including auth, session start, and greeting. |

```bash
# REST smoke test - new conversation
forge platform conversation send-message --service-id <uuid> -m "I need an appointment" --env myorg

# REST smoke test - resume existing conversation
forge platform conversation send-message --service-id <uuid> -m "Next Tuesday" -c <conversation-id> --env myorg

# WebSocket smoke test
forge platform conversation text-ws-smoke --service-id <uuid> -m "Hello" --env myorg

# With patient context
forge platform conversation send-message --service-id <uuid> -m "Refill my prescription" --entity-id <uuid> --env myorg
```

Both commands support `--json` for structured output and `--env` for environment selection.

## CLI Updates

Agent Forge includes a built-in update mechanism:

```bash
# Check for and apply updates manually
forge update
```

The CLI also checks for updates automatically in the background (every 30 minutes). When updates are available, the CLI prompts before applying. Uncommitted local changes are stashed during the update and restored afterward.

## Metric Versioning

Metrics support version tracking. Each metric can have multiple versions, with the `latest_version` field tracking the current iteration. This enables teams to evolve evaluation criteria over time while maintaining a history of how metrics were defined at each point. Older metric configurations are automatically migrated to the versioned schema.

## Authentication

Agent Forge supports two authentication methods, selected automatically based on the environment configuration.

### Platform Identity (Recommended)

Device code authentication following RFC 8628. When you run `forge auth login`, the CLI requests a device code from the identity service, opens your browser to an approval page, and polls for authorization. Once approved, the access token and refresh token are cached in your system keyring. Subsequent commands use the cached token and refresh silently when it expires.

```bash
# Log in with Platform Identity
forge auth login --env myorg

# Check current auth status
forge auth status --env myorg

# Log out and clear cached tokens
forge auth logout --env myorg
```

Platform Identity is used when the environment configuration includes an identity URL. It replaces the need for a static API key for interactive CLI use.

### API Key

Static bearer token authentication. Generate an API key from the Developer Console (**Settings > API Keys**) and add it to your environment file. API keys do not expire and are suitable for CI/CD pipelines and automated scripts where interactive login is not possible.

## Platform API Commands

Agent Forge provides full CLI coverage of the Platform API: the `forge platform` command group exposes every platform resource, enabling end-to-end agent building and workspace management without the web interface.

### Setup

Configure authentication using one of the methods above. For API key authentication, add to your environment file:

```bash
PLATFORM_API_URL=https://api.platform.amigo.ai
PLATFORM_WORKSPACE_ID=<workspace-uuid>
PLATFORM_API_KEY=<bearer-token>
```

### E2E Agent Building

Build a complete agent from the CLI in four steps:

1. **Create agent** and agent version with identity, background, and behaviors
2. **Create context graph** and version with states, transitions, and exit conditions
3. **Create service** linking the agent and context graph together
4. **Add skills** (optional) for LLM-backed micro-agent capabilities

```bash
forge platform agent create --name "My Agent" --env myorg
forge platform agent create-version <agent-uuid> --file agent-version.json --env myorg
forge platform context-graph create --name "My Context Graph" --env myorg
forge platform context-graph create-version <context-graph-uuid> --file context-graph-version.json --env myorg
forge platform service create --name "My Service" --agent-id <agent-uuid> --context-graph-id <context-graph-uuid> --env myorg
```

### Resource Management

Full CRUD operations for all platform resources:

| Resource Group   | Commands                                                                                                                                                         |
| ---------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Core**         | `agent`, `context-graph`, `service`, `skill`, `integration`, `persona`                                                                                           |
| **Voice & Text** | `call`, `conversation` (text REST + WebSocket smoke), `recording`, `operator`, `phone-number`, `session` (voice + text), `outbound-trigger`, `escalation-policy` |
| **Data**         | `data-source`, `world`, `fhir`, `crm`, `unification-rule`, `pipeline`, `function`                                                                                |
| **Automation**   | `trigger` (create, list, get, update, delete, pause, resume, fire, runs)                                                                                         |
| **Operations**   | `audit`, `compliance`, `safety`, `monitor-concept`, `review-queue`                                                                                               |
| **Settings**     | `workspace`, `settings`, `voice-settings`, `api-key`, `task`, `billing`, `network`                                                                               |
| **Surfaces**     | `surface` (create, deliver, list, e2e)                                                                                                                           |
| **Analytics**    | `analytics`, `command-center`                                                                                                                                    |
| **Testing**      | `sim` (voice simulation), `coverage` (B\&B exploration), `tool-test`, `metrics` (settings, define, evaluate)                                                     |
| **Intelligence** | `insights` (SQL analytics, schema, digest), `trace` (call trace analysis)                                                                                        |

All commands support `--json` for structured output and `--env` for environment selection.

### Bulk Push

Push local entity configurations to the platform in a single operation:

```bash
forge platform push --all --env myorg --apply
```

Supports selective push by entity type (`-e agent`, `-e context-graph`, `-e service`).

## Trigger Management

The `forge platform trigger` command group manages scheduled action triggers - cron-based automation that dispatches workspace actions on a recurring basis.

| Command                         | Description                                            |
| ------------------------------- | ------------------------------------------------------ |
| `forge platform trigger create` | Create a trigger with cron schedule and action binding |
| `forge platform trigger list`   | List triggers with active/inactive filtering           |
| `forge platform trigger get`    | Get trigger details including next fire time           |
| `forge platform trigger update` | Update trigger configuration                           |
| `forge platform trigger delete` | Delete a trigger                                       |
| `forge platform trigger pause`  | Pause a trigger's schedule                             |
| `forge platform trigger resume` | Resume a paused trigger                                |
| `forge platform trigger fire`   | Manually fire a trigger for testing                    |
| `forge platform trigger runs`   | View trigger execution history                         |

Triggers bind cron schedules to actions. When the schedule fires, the action is dispatched immediately. Each execution is tracked as an event with `AUTOMATION` source provenance. See [Outbound](/channels/outbound.md) for how triggers fit into the platform's automated contact patterns.

## Platform Functions

The `forge platform function` command group manages platform functions - declarative SQL, Python, and AI functions that agents can call mid-conversation.

| Command                            | Description                                                        |
| ---------------------------------- | ------------------------------------------------------------------ |
| `forge platform function register` | Register a new platform function with its definition and metadata  |
| `forge platform function list`     | List all registered functions in the workspace                     |
| `forge platform function test`     | Execute a function with test parameters and inspect the result     |
| `forge platform function delete`   | Remove a function registration                                     |
| `forge platform function query`    | Run an open-scope SQL query against workspace data                 |
| `forge platform function catalog`  | Display the full function catalog with signatures and descriptions |
| `forge platform function sync`     | Sync function definitions between local files and the platform     |

See [Platform Functions](/agent/platform-functions.md) for conceptual background.

## Escalation Policy

The `forge platform service escalation-policy` command configures how a service handles escalation triggers - what happens when the agent determines a call should be escalated to a human operator, forwarded to another number, or ended.

```bash
# View current escalation policy
forge platform service escalation-policy <service-id> --get --env myorg

# Apply a preset (all triggers route to the same action)
forge platform service escalation-policy <service-id> --preset operator-all --env myorg
forge platform service escalation-policy <service-id> --preset forward-all --env myorg
forge platform service escalation-policy <service-id> --preset hangup-all --env myorg

# Configure individual triggers with custom actions
forge platform service escalation-policy <service-id> --body '{"safety": {"type": "operator"}, "user_request": {"type": "forward", "phone_number": "+15551234567"}}' --env myorg
```

Three presets cover the common cases where all triggers should route to the same action type. For fine-grained control, `--body` accepts a partial policy that merges with the current configuration - you can update individual triggers without re-stating the entire policy.
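The merge behaves like a shallow, per-trigger overlay. A minimal sketch of those semantics - not the server's implementation:

```bash
# Sketch of the partial-policy merge described above (shallow, per-trigger).
# This illustrates the semantics, not the server's implementation.
current='{"safety": {"type": "operator"}, "user_request": {"type": "hangup"}}'
patch='{"user_request": {"type": "forward", "phone_number": "+15551234567"}}'
python3 - "$current" "$patch" <<'PY'
import json, sys
# Triggers named in the patch are replaced wholesale; others are untouched.
merged = {**json.loads(sys.argv[1]), **json.loads(sys.argv[2])}
print(json.dumps(merged, sort_keys=True))
PY
```

Here `user_request` is replaced by the patch while `safety` keeps its existing `operator` action.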

See [Operators and Escalation](/channels/operators.md) for conceptual background on escalation triggers and actions.

## Metrics Management

The `forge platform metrics` command group manages workspace metric definitions and runs evaluations from the CLI.

| Command                           | Description                                                |
| --------------------------------- | ---------------------------------------------------------- |
| `forge platform metrics settings` | View or update workspace-level metric configuration        |
| `forge platform metrics define`   | Create or update a metric definition with scoring criteria |
| `forge platform metrics evaluate` | Run metric evaluation against one or more conversations    |

```bash
# View current metric settings
forge platform metrics settings --env myorg

# Define a metric
forge platform metrics define --name "Scheduling Success" --file metric-def.json --env myorg

# Evaluate a metric against recent conversations
forge platform metrics evaluate --metric-id <uuid> --conversation-ids <id1>,<id2> --env myorg
```

## Simulation Coverage

The `forge platform coverage` command group manages branch-and-bound simulation coverage runs that systematically explore context graph state space.

| Command                            | Description                                                                       |
| ---------------------------------- | --------------------------------------------------------------------------------- |
| `forge platform coverage create`   | Create a new coverage run for a service                                           |
| `forge platform coverage session`  | Create a session within a coverage run                                            |
| `forge platform coverage step`     | Step a session forward with a simulated user message                              |
| `forge platform coverage fork`     | Fork a session into N children at a decision point, each with a different message |
| `forge platform coverage score`    | Score a session against configured metrics                                        |
| `forge platform coverage graph`    | Retrieve the coverage knowledge graph with topology overlay and ghost nodes       |
| `forge platform coverage complete` | Complete a run and clean up ephemeral database branches                           |
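
A typical run threads these commands together: create a run, open a session, step or fork it, then complete the run to clean up. The sketch below is illustrative - flag names such as `--service-id`, `--run-id`, `--session-id`, and `--message` are assumptions, not confirmed options; check `forge platform coverage --help` for the actual interface.

```bash
# Illustrative coverage workflow (flag names are assumptions)
forge platform coverage create --service-id <uuid> --env myorg
forge platform coverage session --run-id <run-id> --env myorg
forge platform coverage step --session-id <session-id> --message "I need to reschedule" --env myorg
forge platform coverage graph --run-id <run-id> --env myorg
forge platform coverage complete --run-id <run-id> --env myorg
```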

See [Simulation Coverage](/testing/testing/simulations.md#simulation-coverage) for conceptual background.

## Insights

The `forge platform insights` command group provides conversational data exploration from the CLI, wrapping the platform's [Insights Agent](/intelligence-and-analytics/intelligence.md#insights-agent) capabilities.

| Command                               | Description                                                                    |
| ------------------------------------- | ------------------------------------------------------------------------------ |
| `forge platform insights sql`         | Execute a SQL query against workspace data and return formatted results        |
| `forge platform insights schema`      | Describe available tables and columns in the workspace schema                  |
| `forge platform insights digest`      | Generate an AI-powered digest summarizing recent workspace activity and trends |
| `forge platform insights suggestions` | Get suggested queries based on the workspace's data and recent activity        |
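
A common pattern is to inspect the schema first, then query. The sketch below is illustrative - the `--query` flag and the `conversations` table name are assumptions; use `forge platform insights schema` to discover the actual tables in your workspace.

```bash
# Describe available tables and columns (run this first)
forge platform insights schema --env myorg

# Execute a SQL query (flag name and table are illustrative)
forge platform insights sql --query "SELECT COUNT(*) FROM conversations" --env myorg

# Summarize recent workspace activity
forge platform insights digest --env myorg
```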

## Trace Analysis

The `forge platform trace` command group provides call trace analysis from the CLI, wrapping the platform's [trace analysis](/intelligence-and-analytics/intelligence.md#call-trace-analysis) capabilities.

| Command                     | Description                                                                                                                                     |
| --------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| `forge platform trace list` | List call traces with filters for date range, service, quality score, and direction                                                             |
| `forge platform trace get`  | Get detailed trace analysis for a specific call, including emotional arc, decision moments, component attribution, and coaching recommendations |

Trace output uses rich formatting with colored outcome indicators and structured digest sections for quick scanning of call quality issues.
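As a sketch, a review session might filter the trace list and then drill into one call. The filter flags shown here (`--since`, `--direction`) are illustrative assumptions based on the filters the table describes; consult `forge platform trace list --help` for the real option names.

```bash
# List recent inbound traces (filter flags are illustrative)
forge platform trace list --since 7d --direction inbound --env myorg

# Get the full analysis for a specific call
forge platform trace get <trace-id> --env myorg
```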

## Surface E2E Testing

The `forge platform surface e2e` command tests the full surface lifecycle from the CLI - surface creation, spec retrieval, form rendering, and data submission - in a single command. It validates that branding settings, field rendering, and data flow are working correctly end-to-end.

```bash
forge platform surface e2e --entity-id <uuid> --env myorg
```

This is useful for verifying surface configuration changes (branding, field types, sections) before deploying to production. The command exercises the same API paths and rendering pipeline that patients use.

## Simulation Testing

The `forge simulation` command group provides coverage-optimized simulation testing against context graphs. It automatically steers simulated conversations toward unvisited states, behaviors, and tools to maximize test coverage.

### How It Works

Each simulation turn follows a scoring loop:

1. The platform generates recommended user responses (graph-unaware)
2. An LLM classifier predicts which state each response would transition to
3. A scorer ranks responses by expected coverage value using graph structure
4. The highest-scoring response is sent as the simulated user message
5. Coverage state is updated based on the agent's response

### Commands

| Command                     | Description                                                                                                                                                          |
| --------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `forge simulation run`      | Execute a simulation with configurable sessions, turn budgets, and coverage targets                                                                                  |
| `forge simulation plan`     | Generate a target spec from a natural-language objective (e.g., "test the cancellation flow end-to-end")                                                             |
| `forge simulation bridge`   | Generate scenario variations from a natural-language objective, run multi-turn conversations with LLM-driven personas, and track coverage using interaction insights |
| `forge simulation evaluate` | Compare metric scores across simulation runs, including before/after diff mode                                                                                       |
| `forge simulation cleanup`  | Delete ephemeral test users created by simulation runs                                                                                                               |

### Configuration

Simulations are highly configurable:

| Setting         | Default    | Description                                                                                              |
| --------------- | ---------- | -------------------------------------------------------------------------------------------------------- |
| **Sessions**    | 3          | Number of parallel conversations                                                                         |
| **Max turns**   | 20         | Maximum turns per session                                                                                |
| **Budget**      | 100        | Total turn budget across all sessions                                                                    |
| **Algorithm**   | `frontier` | Scoring algorithm: `frontier`, `heatmap`, or `random`                                                    |
| **Temperament** | `random`   | Simulated user personality: `cooperative`, `neutral`, `frustrated`, `confused`, `skeptical`, or `random` |

Target specs can be generated from natural-language objectives using `forge simulation plan`, which translates goals into structured coverage targets based on the context graph structure.
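Putting plan and run together, a session might look like the sketch below. The `--service` and `--objective` flags mirror the bridge example later in this page; the `forge simulation run` flags (`--sessions`, `--max-turns`, `--budget`, `--algorithm`, `--temperament`) are illustrative names for the settings in the table above, not confirmed options.

```bash
# Generate a target spec from a natural-language objective
forge simulation plan --service "Scheduling" --objective "test the cancellation flow end-to-end" --env staging

# Run with non-default settings (flag names are illustrative)
forge simulation run --sessions 5 --max-turns 30 --budget 150 --algorithm frontier --temperament skeptical --env staging
```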

### Simulation Bridge

The `forge simulation bridge` command combines scenario generation with multi-turn conversation execution. You describe what you want to test in natural language, and the bridge generates diverse scenario variations, runs each as a full conversation with an LLM-driven persona, and collects interaction insights after every turn for coverage tracking.

```bash
# Generate and run 5 scenarios testing cancellation handling
forge simulation bridge --service "Scheduling" --objective "test cancellation edge cases" --scenarios 5 --env staging
```

Each scenario includes a persona background, temperament (cooperative, frustrated, confused, skeptical, or neutral), and instructions that guide the simulated caller's behavior throughout the conversation. The bridge tracks which context graph states, tools, and dynamic behaviors were exercised across all scenarios, giving you coverage visibility without manually designing each test case.

The bridge also pulls interaction insights after each agent turn - the same detailed reasoning audit available for production calls - so you can see which memories were active, what state transitions occurred, and which tools were considered at every step of every scenario.

#### Result Persistence and Reports

Simulation bridge results are persisted locally across runs, enabling trend analysis and regression detection. After a run completes, you can generate summary reports with pass/fail counts, score distributions, and failure breakdowns. Comparing current results against previous runs shows whether a configuration change improved or degraded coverage.

Tag simulation scenarios for selective execution - for example, `forge simulation bridge --tag scheduling` runs only scheduling-related scenarios. Tags let you build a reusable test library that grows over time as you discover edge cases worth preserving.

## Changelog Command

The `forge changelog` command provides cross-entity change traceability - tracking what changed across agents, context graphs, behaviors, and metrics over time. This gives teams visibility into configuration drift without relying on external version control tooling.
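A minimal invocation might look like this; any filtering or date-range flags the command supports are not documented here, so only the base form is shown.

```bash
# Review recent configuration changes across entity types
forge changelog --env myorg
```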

## When to Use Agent Forge

* **Managing configurations across environments**: Keep staging and production in sync with a controlled promotion process.
* **Bulk updates**: Modify multiple agents, behaviors, or evaluation criteria in a single operation.
* **Scripted deployments**: Integrate Agent Forge into CI/CD pipelines for automated testing and deployment.
* **Audit and rollback**: Maintain a complete history of configuration changes with the ability to revert.
* **Data exploration**: Query workspace data, profile tables, and run analytics templates without leaving the CLI.
* **Building agents from scratch**: Use Platform API commands to create agents, context graphs, and services entirely from the CLI.
* **Coverage testing**: Run simulation tests that automatically explore unvisited states and edge cases.
* **Change traceability**: Track configuration changes across all entity types with the changelog command.

{% hint style="info" %}
Use the [Platform API developer guide](https://docs.amigo.ai/developer-guide/platform-api) for setup, authentication, and workspace configuration details. This reference page covers the Agent Forge command surface.
{% endhint %}


---

# Agent Instructions: Querying This Documentation

If you need information that is not directly available on this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.amigo.ai/reference/agent-forge.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
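
For example, from a shell the request can be made with `curl`, URL-encoding the question:

```bash
curl -s "https://docs.amigo.ai/reference/agent-forge.md?ask=Which%20authentication%20flows%20does%20forge%20support"
```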
