# Services

A service binds an **agent**, a **context graph**, and **version sets** into a deployable unit. Services are the entry point for conversations - when a user calls a phone number or starts a chat, the platform resolves the service to determine which agent persona, conversation flow, and model configuration to use.

{% hint style="info" %}
**Classic API vs Platform API**: The Classic API also exposes service endpoints. The Platform API services are workspace-scoped and include richer configuration (tags, tool capacity). Use the Platform API for service setup; the Classic API for runtime conversation creation with the service ID.
{% endhint %}

## Key Concepts

* **Agent + Context Graph Binding**: Every service links to one agent and one context graph
* **Version Sets**: Named configurations that pin specific agent/context graph versions and LLM model preferences
* **Tags**: Key-value metadata (e.g., `channel:phone`, `preset:voice`) used for routing and validation
* **Tool Capacity**: Maximum concurrent tool executions per conversation (1-10, default 3)
* **Safety Filters**: Per-service toggle (`safety_filters_enabled`, default `true`). When disabled, conversation monitoring (monitor concepts, triage, accumulation) is bypassed while independent risk scoring remains active. Useful for non-clinical services or internal testing.
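
As an illustration, a service creation payload might combine these fields. The `safety_filters_enabled` field name matches the toggle above; the exact field names for tags and tool capacity (`tags`, `tool_capacity` here) are assumptions — confirm them against the API reference:

```json
{
  "name": "Intake Line",
  "agent_id": "abc123",
  "context_graph_id": "def456",
  "tags": { "channel": "phone", "preset": "voice" },
  "tool_capacity": 3,
  "safety_filters_enabled": true
}
```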

## Version Sets

Version sets enable safe iteration - like Git branches for your service configuration:

* **`edge`**: Always resolves to the latest versions. Immutable - the set itself cannot be updated or deleted. Use for smoke tests only.
* **`release`**: The default version set used when clients don't specify one. Can be updated but not deleted.
* **Custom sets**: Create named sets (e.g., `personal-dev`, `test`, `preview`) for your promotion workflow.

{% @mermaid/diagram content="%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#D4E2E7', 'primaryTextColor': '#100F0F', 'primaryBorderColor': '#083241', 'lineColor': '#575452', 'textColor': '#100F0F', 'clusterBkg': '#F1EAE7', 'clusterBorder': '#D7D2D0'}}}%%
flowchart TB
A\[personal-dev] -->|validate| B\[test]
B -->|run simulations| C\[preview]
C -->|UAT + promote| D\[release]
E\[edge] -.->|always latest| E" %}

The recommended promotion path: `personal-dev → test → preview → release`

For full details on version set management, CLI commands, and promotion workflows, see [Version Sets & Promotion](https://docs.amigo.ai/developer-guide/operations/devops/version-sets-best-practices).

{% hint style="warning" %}
When upserting a version set, the API validates that pinned agent and context graph versions actually exist. Version set names must match `^[A-Za-z0-9_-]+$` (max 40 characters).
{% endhint %}
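
As a sketch, an upsert body for a custom version set might look like the following. The field names (`agent_version`, `context_graph_version`, `llm_model`) are illustrative assumptions — check the API reference for the exact schema:

```json
{
  "name": "preview",
  "agent_version": 12,
  "context_graph_version": 8,
  "llm_model": "gpt-4o"
}
```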

## Structured Actions

Context Graph action states use a **structured Action model**. Each action and exit condition can carry a `filler_hint` - a semantic direction that guides the voice pipeline's filler generation alongside emotional context and latency state.

```json
{
  "actions": [
    { "description": "Look up available appointment slots", "filler_hint": "Let me check what's available..." },
    { "description": "Confirm the appointment with the user", "filler_hint": null }
  ],
  "exit_conditions": [
    { "description": "Appointment confirmed", "next_state": "wrap_up", "filler_hint": "Great, I've got that booked." }
  ]
}
```

{% hint style="info" %}
**Filler hints** are not literal audio fillers. They are semantic directions - the voice pipeline uses them alongside real-time emotional and latency context to produce natural-sounding filler speech.
{% endhint %}

## Voice Configuration

`voice_config` is an optional field on the service model that controls how the voice pipeline behaves for that service. It covers latency tuning, filler behavior, response length, barge-in sensitivity, and tool access. If omitted, the service uses balanced defaults.

### Fields

**Latency**

| Field                 | Type                           | Default         | Description                                                                                                                 |
| --------------------- | ------------------------------ | --------------- | --------------------------------------------------------------------------------------------------------------------------- |
| `tts_model`           | `"sonic-turbo"` or `"sonic-3"` | `"sonic-turbo"` | TTS model selection. `sonic-turbo` targets 40ms time-to-first-audio; `sonic-3` targets 90ms with higher speech quality.     |
| `max_buffer_delay_ms` | `int` (200 to 1000)            | `500`           | How long the pipeline buffers text before sending to TTS. Lower values feel snappier but may produce choppier speech.       |
| `eager_eot_threshold` | `float` (0.0 to 1.0)           | -               | End-of-turn sensitivity. Higher values make the system more aggressive about detecting when the caller has stopped talking. |
| `eot_timeout_ms`      | `int`                          | -               | Hard timeout for end-of-turn detection in milliseconds.                                                                     |

**Fillers**

| Field                  | Type                                       | Default         | Description                                                                                                                                                          |
| ---------------------- | ------------------------------------------ | --------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `filler_style`         | `"backchannel"`, `"phrase"`, or `"silent"` | `"backchannel"` | `backchannel` produces short acknowledgments like "Mm" and "Yeah". `phrase` produces longer fillers like "Let me check on that". `silent` disables fillers entirely. |
| `filler_vocabulary`    | `list[str]`                                | -               | Custom filler words. Overrides the default vocabulary for the selected filler style.                                                                                 |
| `backchannel_delay_ms` | `int`                                      | `400`           | Delay before emitting a backchannel filler when navigation is skipped. Controls how quickly the agent acknowledges the caller.                                       |

**Response**

| Field                    | Type  | Default | Description                                                                                                          |
| ------------------------ | ----- | ------- | -------------------------------------------------------------------------------------------------------------------- |
| `max_response_sentences` | `int` | -       | Hard cap on response length in sentences. This is mechanically enforced (the pipeline truncates), not just prompted. |
| `max_response_words`     | `int` | -       | Hard cap on response length in words. Same mechanical enforcement.                                                   |
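
The mechanical enforcement can be pictured with a minimal sketch. This is illustrative only — the platform's actual sentence segmentation is internal — but it shows the difference between truncating and merely prompting for brevity:

```python
import re

def truncate_sentences(text: str, max_sentences: int) -> str:
    """Mechanically cap a response at max_sentences sentences."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])

print(truncate_sentences("Sure. I can book that. Anything else?", 1))
```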

**Barge-in**

| Field                   | Type    | Default | Description                                                                                                                                                                            |
| ----------------------- | ------- | ------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `barge_in_min_speech_s` | `float` | `0.8`   | Minimum speech duration before the caller can interrupt the agent. Lower values make barge-in more sensitive. Setting this too low causes false interrupts from breath sounds or "um". |
| `barge_in_cooldown_s`   | `float` | `1.5`   | Cooldown period after a barge-in before another one can trigger. Prevents rapid-fire interruptions.                                                                                    |

**Tools**

| Field                  | Type   | Default | Description                                                                                                        |
| ---------------------- | ------ | ------- | ------------------------------------------------------------------------------------------------------------------ |
| `forward_call_enabled` | `bool` | `false` | Whether the `forward_call` tool is available. Opt-in because call forwarding has cost and compliance implications. |

### Presets

Three named presets cover common configurations:

| Preset              | TTS Model     | Buffer | Filler Style  | Backchannel Delay | Response Cap |
| ------------------- | ------------- | ------ | ------------- | ----------------- | ------------ |
| `ultra_low_latency` | `sonic-turbo` | 200ms  | `backchannel` | 400ms             | 1 sentence   |
| `balanced`          | defaults      | 500ms  | defaults      | defaults          | -            |
| `quality`           | `sonic-3`     | 500ms  | `phrase`      | defaults          | -            |

### Example

Include `voice_config` when creating or updating a service:

```json
{
  "name": "Intake Line",
  "agent_id": "abc123",
  "context_graph_id": "def456",
  "voice_config": {
    "tts_model": "sonic-turbo",
    "max_buffer_delay_ms": 200,
    "filler_style": "backchannel",
    "backchannel_delay_ms": 400,
    "max_response_sentences": 1,
    "barge_in_min_speech_s": 0.8,
    "forward_call_enabled": false
  }
}
```

### CLI

Apply a preset:

```bash
forge platform service voice-config <service_id> --preset ultra_low_latency
```

Set individual fields:

```bash
forge platform service voice-config <service_id> --body '{"filler_style": "silent", "max_response_sentences": 1}'
```

Read current config:

```bash
forge platform service voice-config <service_id> --get
```

### API

`PUT /v1/{workspace_id}/services/{service_id}` with the `voice_config` field in the request body. The field is merged with existing config, so you can update individual fields without resending the full object.
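
Because the merge is field-level, a request body that touches only one voice field leaves the rest of the stored config intact. For example, this body changes only the filler style:

```json
{
  "voice_config": {
    "filler_style": "silent"
  }
}
```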

## Nav-Selected Emotion

The navigation LLM now outputs a structured format that includes an emotion tag alongside the state code and filler:

```
CODE,V,EMOTION,FILLER
```

For example: `a0,0,sympathetic,Mm`

* **CODE** is the state transition code from the context graph.
* **V** is the verbosity flag (0 or 1).
* **EMOTION** is applied to the TTS provider before the response starts streaming. This means the filler and the main response share the same emotional tone, replacing the old approach of inline SSML emotion tags.
* **FILLER** is the filler text (if any) to speak while the engage LLM generates the full response.

Available emotions: `friendly`, `sympathetic`, `calm`, `enthusiastic`, `serious`, `cheerful`, `curious`, `content`.

Because the emotion is set at the TTS context level rather than injected as SSML markup, the entire utterance has consistent prosody. This avoids the uncanny shifts that happened when SSML tags changed emotion mid-sentence.
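
A client consuming this format could parse it with a split capped at three separators, since the filler text may itself contain commas. This is a sketch, not platform code:

```python
def parse_nav_output(line: str) -> dict:
    """Parse the navigation LLM's CODE,V,EMOTION,FILLER output."""
    # maxsplit=3 keeps any commas inside the filler text intact.
    code, verbosity, emotion, filler = line.split(",", 3)
    return {
        "code": code,
        "verbose": verbosity == "1",
        "emotion": emotion,
        "filler": filler or None,  # FILLER may be empty
    }

print(parse_nav_output("a0,0,sympathetic,Mm"))
```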

## API Reference

* [Services](https://docs.amigo.ai/api-reference/readme/platform/services)
