# External Events & Multi-Stream (WebSocket)

Amigo WebSockets support multiple information streams on a single connection, so you can build rich, interactive apps. In addition to user text or audio, you can send external events (for example, device telemetry, UI actions, page or navigation changes) that the agent incorporates into its next response.

This page focuses on how to publish external events alongside user input, and how they are associated with agent interactions.

## Why Multi-Stream?

* Device information streams: battery, network, and orientation updates for mobile assistants.
* UI interaction streams: button clicks, navigation, selection changes.
* Context updates: screen state, cart changes, presence or location signals.
* Real-time apps: send events while a user is speaking (VAD or manual audio streaming).

## Streams on the Connection

On a single WebSocket you can interleave these client → server messages:

* `client.new-text-message` with `message_type: 'user-message'` (user text)
* `client.new-audio-message` (streamed user audio chunks; send one with `audio: null` to signal the end of audio)
* `client.new-text-message` with `message_type: 'external-event'` (external events)
* Control messages: `client.switch-vad-mode`, `client.extend-timeout`, `client.finish-conversation`, `client.close-connection`

And you will receive these server → client streams:

* `server.new-message` (text chunks or base64 audio chunks)
* `server.interaction-complete` (marks the end of a turn)
* `server.current-agent-action` (optional; controlled by query filter)
* VAD events: `server.vad-speech-started`, `server.vad-speech-ended`, `server.vad-speech-reset-zero`, `server.vad-mode-switched`
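
As a sketch, the server → client types above can be routed through a small dispatcher. The handler bodies here are placeholders, not part of the API:

```javascript
// Route each parsed server message to a handler keyed on its type.
// Returns true if a handler existed, false for unrecognized types.
function dispatch(msg, handlers) {
  const handler = handlers[msg.type];
  if (handler) handler(msg);
  return Boolean(handler);
}

const handlers = {
  'server.new-message': (m) => { /* render a text chunk or decode a base64 audio chunk */ },
  'server.interaction-complete': (m) => { /* close out the turn */ },
  'server.current-agent-action': (m) => { /* optional: update UI state */ },
  'server.vad-speech-started': (m) => { /* show a "listening" indicator */ },
  'server.vad-speech-ended': (m) => { /* show a "thinking" indicator */ },
};

// Wire it up: ws.onmessage = (e) => dispatch(JSON.parse(e.data), handlers);
```

Unrecognized types fall through silently, which keeps the client forward-compatible if new server message types appear.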

## Association and Ordering

* External events are timestamped and attached to the next interaction.
* Any external events sent before or during a user's input (text or audio) become context for that interaction.
* After an interaction completes, the external-event buffer is cleared.
* You can also start an interaction using only an external event (no user text or audio).
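
A minimal sketch of these rules, using the message shapes shown later on this page (the event fields and user text are illustrative): an external event sent just before user text becomes context for that turn.

```javascript
// Build an external event followed by the user text it should attach to.
// Because the event arrives first, it becomes context for this interaction.
function buildEventThenUserText(event, userText) {
  return [
    JSON.stringify({
      type: 'client.new-text-message',
      message_type: 'external-event',
      text: JSON.stringify(event)
    }),
    JSON.stringify({
      type: 'client.new-text-message',
      message_type: 'user-message',
      text: userText
    })
  ];
}

const messages = buildEventThenUserText(
  { event: 'ui.navigate', page: '/checkout' },
  'What do I still need to do here?'
);
// messages.forEach((m) => ws.send(m)); // send in order on the open socket
```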

### Sequence Diagram

{% @mermaid/diagram content="%%{init: {"theme": "base", "themeVariables": {"actorBkg": "#083241", "actorTextColor": "#FFFFFF", "actorBorder": "#083241", "signalColor": "#575452", "signalTextColor": "#100F0F", "labelBoxBkgColor": "#F1EAE7", "labelBoxBorderColor": "#D7D2D0", "labelTextColor": "#100F0F", "loopTextColor": "#100F0F", "noteBkgColor": "#F1EAE7", "noteBorderColor": "#D7D2D0", "noteTextColor": "#100F0F", "activationBkgColor": "#E8E2EB", "activationBorderColor": "#083241", "altSectionBkgColor": "#F1EAE7", "altSectionColor": "#100F0F"}}}%%
sequenceDiagram
autonumber
participant C as Client (App)
participant S as Server (WebSocket)

Note over C,S: External events sent before/during input<br/>attach to the next interaction

%% Interleave external events with audio
C->>S: client.new-audio-message<br/>{ audio\_config + first chunk }
C->>S: client.new-text-message<br/>message\_type=external-event<br/>{ event: 'ui.click', target: 'checkout\_button' }
C->>S: client.new-audio-message { next chunk }
C->>S: client.new-text-message<br/>message\_type=external-event<br/>{ event: 'ui.navigate', page: '/checkout' }
C->>S: client.new-audio-message { audio: null }

%% Agent response for that interaction
S-->>C: server.new-message (text/audio chunks)
S-->>C: server.interaction-complete { interaction\_id }

Note over S: External events above are recorded<br/>and associated with this interaction" %}

### VAD Mode Sequence

{% @mermaid/diagram content="%%{init: {"theme": "base", "themeVariables": {"actorBkg": "#083241", "actorTextColor": "#FFFFFF", "actorBorder": "#083241", "signalColor": "#575452", "signalTextColor": "#100F0F", "labelBoxBkgColor": "#F1EAE7", "labelBoxBorderColor": "#D7D2D0", "labelTextColor": "#100F0F", "loopTextColor": "#100F0F", "noteBkgColor": "#F1EAE7", "noteBorderColor": "#D7D2D0", "noteTextColor": "#100F0F", "activationBkgColor": "#E8E2EB", "activationBorderColor": "#083241", "altSectionBkgColor": "#F1EAE7", "altSectionColor": "#100F0F"}}}%%
sequenceDiagram
autonumber
participant C as Client (App)
participant S as Server (WebSocket)

C->>S: client.switch-vad-mode { vad\_mode\_on: true }
S-->>C: server.vad-mode-switched { current\_vad\_mode\_on: true }

rect rgba(0,0,0,0.03)
loop Continuous upstream audio (PCM)
C-->>S: client.new-audio-message { pcm chunk }
end
end

S-->>C: server.vad-speech-started { ts\_start }
Note over C,S: Send external events during speech to attach<br/>to the same interaction
C->>S: client.new-text-message<br/>message\_type=external-event<br/>{ event: 'ui.navigate', page: '/checkout' }
C->>S: client.new-text-message<br/>message\_type=external-event<br/>{ event: 'cart.add', sku: 'SKU-123' }
S-->>C: server.vad-speech-ended { ts\_end }

par Agent generates response
S-->>C: server.new-message (text/audio chunks)
and Turn complete
S-->>C: server.interaction-complete { interaction\_id }
end

Note over S: External events sent between<br/>speech start/end are associated<br/>with this interaction

%% Interruption with external event
Note over C,S: ==== Interruption Scenario ====
C->>S: client.new-text-message<br/>message\_type=external-event<br/>start\_interaction=true<br/>{ event: 'critical.alert' }

alt Agent hasn't detected user speaking
Note over S: Interrupts existing<br/>interaction immediately
S-->>C: server.new-message (response to event)
S-->>C: server.interaction-complete
else User is speaking (agent has detected)
Note over S: Waits until user<br/>finishes speaking
S-->>C: server.vad-speech-ended
S-->>C: server.new-message (response to event)
S-->>C: server.interaction-complete
end" %}

### Minimal VAD Mode Example (messages sent)

```javascript
// 1) Enable VAD mode
ws.send(JSON.stringify({
  type: 'client.switch-vad-mode',
  vad_mode_on: true
}));

// 2) Continuously stream PCM audio chunks (16kHz mono)
function onPcmFrame(pcmBase64) {
  ws.send(JSON.stringify({
    type: 'client.new-audio-message',
    audio: pcmBase64
  }));
}

// 3) Send external events during speech; they attach to the same interaction
ws.send(JSON.stringify({
  type: 'client.new-text-message',
  message_type: 'external-event',
  text: JSON.stringify({ event: 'ui.navigate', page: '/checkout' })
}));

ws.send(JSON.stringify({
  type: 'client.new-text-message',
  message_type: 'external-event',
  text: JSON.stringify({ event: 'cart.add', sku: 'SKU-123' })
}));

// 4) Receive server.new-message chunks and server.interaction-complete as usual
ws.onmessage = (e) => {
  const msg = JSON.parse(e.data);
  if (msg.type === 'server.new-message') {
    // handle text/audio chunk
  } else if (msg.type === 'server.interaction-complete') {
    // end of turn; external events above are associated with this interaction
  }
};
```

## Send External Events

Send external context as structured text alongside the conversation. We recommend JSON-string payloads so your agent can parse event types and data.

```javascript
// Example: send a device telemetry update
ws.send(JSON.stringify({
  type: 'client.new-text-message',
  message_type: 'external-event',
  text: JSON.stringify({
    event: 'device.update',
    battery: 72,
    network: 'wifi',
    orientation: 'landscape'
  })
}));

// Example: send a UI action
ws.send(JSON.stringify({
  type: 'client.new-text-message',
  message_type: 'external-event',
  text: JSON.stringify({
    event: 'ui.click',
    target: 'checkout_button',
    page: '/cart'
  })
}));
```

Notes:

* `text` is a string. Using `JSON.stringify(...)` keeps a consistent, parseable structure.
* The server records each payload as a timestamped `external-event` message associated with the interaction.

## Use With Audio Streaming

You can interleave external events while streaming audio. Events sent between the start and end of a user's audio are attached to that same interaction.

```javascript
// 1) Start audio stream (first chunk includes config)
ws.send(JSON.stringify({
  type: 'client.new-audio-message',
  audio: firstBase64PcmChunk,
  audio_config: { format: 'pcm', sample_rate: 16000, sample_width: 2, n_channels: 1, frame_rate: 16000 }
}));

// 2) Interleave external events while the user is speaking
ws.send(JSON.stringify({
  type: 'client.new-text-message',
  message_type: 'external-event',
  text: JSON.stringify({ event: 'ui.navigate', page: '/checkout' })
}));

// 3) Continue sending audio chunks
ws.send(JSON.stringify({ type: 'client.new-audio-message', audio: nextBase64PcmChunk }));

// 4) Signal end of audio
ws.send(JSON.stringify({ type: 'client.new-audio-message', audio: null }));
```

VAD mode is also supported. When `client.switch-vad-mode` enables VAD, you continuously stream PCM audio. The server detects speech boundaries, and any external events sent during speech are still attached to that interaction.

## Start With Only an External Event

Kick off an interaction without user text or audio by sending an external event as the first message:

```javascript
ws.send(JSON.stringify({
  type: 'client.new-text-message',
  message_type: 'external-event',
  text: 'ALERT: Payment gateway timeout rate spiking above threshold',
  start_interaction: true  // Optional: explicitly start a new interaction
}));
```

### Interrupting Interactions in VAD Mode

When in VAD mode, external events with `start_interaction: true` can interrupt ongoing conversations:

```javascript
// VAD mode enabled, agent is responding...

// Send urgent external event
ws.send(JSON.stringify({
  type: 'client.new-text-message',
  message_type: 'external-event',
  text: JSON.stringify({
    event: 'system.alert',
    severity: 'critical',
    message: 'Database connection lost'
  }),
  start_interaction: true  // Interrupts current interaction
}));

// Behavior depends on user speech state:
// - If agent hasn't detected user speaking: Interrupts any existing interaction, starts new interaction immediately
// - If user is speaking (agent has detected): Agent waits for speech to end, then triggers new interaction
```

**Important Notes:**

* Without `start_interaction: true`, external events are buffered and attached to the next natural interaction.
* With `start_interaction: true` in VAD mode:
  * Existing agent responses are interrupted when the user is not speaking.
  * The agent respects user speech, waiting until the user finishes before triggering the new interaction.
* This lets critical events take precedence while preserving the natural flow of conversation.

## Receive Agent Actions (Optional)

For interactive UIs, you can subscribe to agent action telemetry and filter what you receive. Use the `current_agent_action_type` query parameter when connecting:

```javascript
// Only receive tool-related actions
const filter = encodeURIComponent('^tool\\..*');
const url = `wss://api.amigo.ai/v1/${org}/conversation/converse_realtime?response_format=text&current_agent_action_type=${filter}`;
const ws = new WebSocket(url, [`bearer.authorization.amigo.ai.${token}`]);

ws.onmessage = (e) => {
  const msg = JSON.parse(e.data);
  if (msg.type === 'server.current-agent-action') {
    // e.g., show progress, update UI state
    renderAgentAction(msg.current_agent_action);
  }
};
```

## Limits and Reliability

* Messages per minute: 60 across all message types.
* Connection inactivity timeout: 30s. Send `client.extend-timeout` every \~15s when idle.
* Max message size: 1 MB (for example, for audio chunks).
* One active connection per user/service.
* VAD requires PCM audio. MP3 is not supported in VAD mode.
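
A minimal keepalive sketch for the inactivity timeout above. The 5s polling interval is an arbitrary choice, and `ws` is assumed to be an open connection:

```javascript
const KEEPALIVE_MS = 15000; // half of the 30s inactivity timeout

// Pure helper: is a keepalive due, given when we last sent anything?
function keepaliveDue(lastSentAt, now) {
  return now - lastSentAt >= KEEPALIVE_MS;
}

// Wiring sketch:
// let lastSentAt = Date.now(); // also reset this whenever you send text/audio
// setInterval(() => {
//   if (keepaliveDue(lastSentAt, Date.now())) {
//     ws.send(JSON.stringify({ type: 'client.extend-timeout' }));
//     lastSentAt = Date.now();
//   }
// }, 5000);
```

Resetting the timer on every outbound message keeps keepalives from counting against the per-minute rate limit unnecessarily.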

## Best Practices

* Structure external events as JSON strings with an `event` name and minimal payload.
* Debounce high-frequency updates (for example, device telemetry every 3-5s or on meaningful change).
* Send events before or during the user's turn so they are applied to the next response.
* Use regional endpoints and PCM for the lowest latency in voice flows.
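
For the debouncing advice above, a sketch of a telemetry gate. The 3s minimum interval and 5-point battery threshold are illustrative choices, not API requirements:

```javascript
// Returns a shouldSend(update, now) predicate that passes an update only
// if the battery level changed meaningfully or enough time has elapsed.
function makeTelemetryGate(minIntervalMs = 3000, batteryDelta = 5) {
  let last = null;
  let lastTs = -Infinity;
  return function shouldSend(update, now) {
    const meaningful =
      last === null || Math.abs(update.battery - last.battery) >= batteryDelta;
    if (meaningful || now - lastTs >= minIntervalMs) {
      last = update;
      lastTs = now;
      return true;
    }
    return false;
  };
}

// const gate = makeTelemetryGate();
// if (gate(update, Date.now())) { /* send the device.update event as shown above */ }
```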

<details>

<summary>Troubleshooting: WebSocket close codes</summary>

### Troubleshooting

Common close codes:

| Code | Meaning           | Typical fix                                                                       |
| ---- | ----------------- | --------------------------------------------------------------------------------- |
| 3000 | Unauthorized      | Refresh credentials or check token format (`bearer.authorization.amigo.ai.{JWT}`) |
| 3003 | Forbidden         | Check user permissions                                                            |
| 3008 | Timeout           | Send `client.extend-timeout` periodically when idle                               |
| 4000 | Bad Request       | Validate message structure and types                                              |
| 4004 | Not Found         | Verify organization, service, and conversation IDs                                |
| 4009 | Conflict          | Only one connection per user or service is allowed                                |
| 4015 | Unsupported Media | Use PCM for VAD and check audio config                                            |
| 4029 | Rate Limited      | Back off and reduce message frequency                                             |
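
As a sketch, an `onclose` handler can map these codes to coarse recovery actions. The action names are illustrative:

```javascript
// Map a close code from the table above to a coarse recovery action.
function recoveryAction(code) {
  switch (code) {
    case 3000: return 'refresh-credentials';
    case 3008:
    case 4029: return 'reconnect-with-backoff';
    case 4009: return 'close-duplicate-connection';
    default:   return 'fix-request'; // 3003, 4000, 4004, 4015: client-side fix
  }
}

// Usage sketch: ws.onclose = (e) => handle(recoveryAction(e.code), e.reason);
```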

</details>

## Related

* [Real-time Voice (WebSocket)](https://docs.amigo.ai/developer-guide/classic-api/core-api/conversations/conversations-realtime)
* [Conversation Events (HTTP)](https://docs.amigo.ai/developer-guide/classic-api/core-api/conversations/conversations-events)
