External Events & Multi-Stream (WebSocket)
Amigo WebSockets support multiple information streams on a single connection so you can build rich, interactive apps. In addition to user text or audio, you can send external events (e.g., device telemetry, UI actions, page/navigation changes) that the agent incorporates in its next response.
This page focuses on how to publish external events alongside user input, and how they are associated with agent interactions.
Why Multi‑Stream?
- Device information streams: battery/network/orientation updates for mobile assistants
- UI interaction streams: button clicks, navigation, selection changes
- Context updates: screen state, cart changes, presence/location signals
- Real-time apps: send events while a user is speaking (VAD or manual audio streaming)
Streams on the Connection
On a single WebSocket you can interleave these client → server messages:
- `client.new-text-message` with `message_type: 'user-message'` (user text)
- `client.new-audio-message` (user audio streaming) and `client.new-audio-message` with `audio: null` (end of audio)
- `client.new-text-message` with `message_type: 'external-event'` (external events)
- Control messages: `client.switch-vad-mode`, `client.extend-timeout`, `client.finish-conversation`, `client.close-connection`
And you will receive these server → client streams:
- `server.new-message` (text chunks or base64 audio chunks)
- `server.interaction-complete` (marks the end of a turn)
- `server.current-agent-action` (optional; controlled by query filter)
- VAD events: `server.vad-speech-started`, `server.vad-speech-ended`, `server.vad-speech-reset-zero`, `server.vad-mode-switched`
Association & Ordering
- External events are timestamped and attached to the next interaction.
- Any external events sent before or during a user's input (text or audio) become context for that interaction.
- After an interaction completes, the external-event buffer is cleared.
- You can also start an interaction using only an external event (no user text/audio).
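The buffering rule above can be mirrored client-side for logging or debugging. A minimal sketch, assuming nothing about the server's internals; the class name and shape are illustrative, not part of the API:

```javascript
// Mirrors the server's association rule: external events buffer until the
// next interaction completes, then the buffer is cleared.
class EventAssociationTracker {
  constructor() {
    this.pending = [];      // events awaiting the next interaction
    this.interactions = []; // completed interactions with their events
  }
  recordExternalEvent(event) {
    // Events sent before/during user input attach to the next interaction.
    this.pending.push(event);
  }
  completeInteraction(id) {
    // On server.interaction-complete: attach buffered events, clear buffer.
    this.interactions.push({ id, events: this.pending });
    this.pending = [];
  }
}
```

Calling `recordExternalEvent` on every external event you send and `completeInteraction` on every `server.interaction-complete` gives you a local record of which events were context for which turn.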
Minimal VAD Mode Example (messages sent)
// 1) Enable VAD mode
ws.send(JSON.stringify({
type: 'client.switch-vad-mode',
vad_mode_on: true
}));
// 2) Continuously stream PCM audio chunks (16kHz mono)
function onPcmFrame(pcmBase64) {
ws.send(JSON.stringify({
type: 'client.new-audio-message',
audio: pcmBase64
}));
}
// 3) Send external events during speech; they attach to the same interaction
ws.send(JSON.stringify({
type: 'client.new-text-message',
message_type: 'external-event',
text: JSON.stringify({ event: 'ui.navigate', page: '/checkout' })
}));
ws.send(JSON.stringify({
type: 'client.new-text-message',
message_type: 'external-event',
text: JSON.stringify({ event: 'cart.add', sku: 'SKU-123' })
}));
// 4) Receive server.new-message chunks and server.interaction-complete as usual
ws.onmessage = (e) => {
const msg = JSON.parse(e.data);
if (msg.type === 'server.new-message') {
// handle text/audio chunk
} else if (msg.type === 'server.interaction-complete') {
// end of turn; external events above are associated with this interaction
}
};
Send External Events
Send external context as structured text alongside the conversation. We recommend JSON‑string payloads so your agent can parse event types and data.
// Example: send a device telemetry update
ws.send(JSON.stringify({
type: 'client.new-text-message',
message_type: 'external-event',
text: JSON.stringify({
event: 'device.update',
battery: 72,
network: 'wifi',
orientation: 'landscape'
})
}));
// Example: send a UI action
ws.send(JSON.stringify({
type: 'client.new-text-message',
message_type: 'external-event',
text: JSON.stringify({
event: 'ui.click',
target: 'checkout_button',
page: '/cart'
})
}));
Notes:
- `text` is a string; using `JSON.stringify(...)` keeps a consistent, parseable structure.
- The server records these as `external-event` messages with timestamps for the interaction.
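Since every external event uses the same envelope, a small helper keeps payloads consistent. `buildExternalEvent` below is an illustrative wrapper, not part of any SDK:

```javascript
// Build the wire message for an external event. Note the double encoding:
// the event object becomes a JSON string inside the `text` field, which is
// itself a field of the JSON message sent over the socket.
function buildExternalEvent(event, data = {}) {
  return JSON.stringify({
    type: 'client.new-text-message',
    message_type: 'external-event',
    text: JSON.stringify({ event, ...data })
  });
}

// Usage:
// ws.send(buildExternalEvent('ui.click', { target: 'checkout_button' }));
```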
Use With Audio Streaming
You can interleave external events while streaming audio. Events sent between the start and end of a user’s audio are attached to that same interaction.
// 1) Start audio stream (first chunk includes config)
ws.send(JSON.stringify({
type: 'client.new-audio-message',
audio: firstBase64PcmChunk,
audio_config: { format: 'pcm', sample_rate: 16000, sample_width: 2, n_channels: 1, frame_rate: 16000 }
}));
// 2) Interleave external events while the user is speaking
ws.send(JSON.stringify({
type: 'client.new-text-message',
message_type: 'external-event',
text: JSON.stringify({ event: 'ui.navigate', page: '/checkout' })
}));
// 3) Continue sending audio chunks
ws.send(JSON.stringify({ type: 'client.new-audio-message', audio: nextBase64PcmChunk }));
// 4) Signal end of audio
ws.send(JSON.stringify({ type: 'client.new-audio-message', audio: null }));
VAD mode is also supported. When `client.switch-vad-mode` enables VAD, you continuously stream PCM audio; the server detects speech boundaries, and any external events sent during speech are still attached to that interaction.
Start With Only an External Event
Kick off an interaction without user text/audio by sending an external event as the first message:
ws.send(JSON.stringify({
type: 'client.new-text-message',
message_type: 'external-event',
text: 'ALERT: Payment gateway timeout rate spiking above threshold'
}));
Receive Agent Actions (Optional)
For interactive UIs, you can subscribe to agent action telemetry and filter what you receive. Use the current_agent_action_type
query parameter when connecting.
// Only receive tool-related actions
const filter = encodeURIComponent('^tool\\..*');
const url = `wss://api.amigo.ai/v1/${org}/conversation/converse_realtime?response_format=text&current_agent_action_type=${filter}`;
const ws = new WebSocket(url, [`bearer.authorization.amigo.ai.${token}`]);
ws.onmessage = (e) => {
const msg = JSON.parse(e.data);
if (msg.type === 'server.current-agent-action') {
// e.g., show progress, update UI state
renderAgentAction(msg.current_agent_action);
}
};
Limits & Reliability
- Messages/minute: 60 across all message types
- Connection inactivity timeout: 30s (send `client.extend-timeout` every ~15s when idle)
- Max message size: 1 MB (e.g., for audio chunks)
- One active connection per user/service
- VAD requires PCM audio (MP3 is not supported in VAD mode)
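A keepalive timer is an easy way to stay inside the 30s inactivity timeout. A sketch, assuming the numeric `readyState` value 1 (OPEN); the function name and 15s default are our choices:

```javascript
// Send client.extend-timeout every ~15s while the connection is idle,
// staying well inside the 30s inactivity timeout. Returns a stop function.
function startKeepalive(ws, intervalMs = 15000) {
  const timer = setInterval(() => {
    if (ws.readyState === 1 /* OPEN */) {
      ws.send(JSON.stringify({ type: 'client.extend-timeout' }));
    }
  }, intervalMs);
  return () => clearInterval(timer); // call on close or when traffic resumes
}
```

Stop the timer whenever you are actively sending messages; keepalives count against the 60 messages/minute budget.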
Best Practices
- Structure external events as JSON strings with an `event` name and a minimal payload
- Debounce high-frequency updates (e.g., device telemetry every 3-5s or on meaningful change)
- Send events before or during the user's turn so they're applied to the next response
- Use regional endpoints and PCM for the lowest latency in voice flows
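Debouncing telemetry can be as simple as dropping readings that haven't meaningfully changed since the last send. A sketch; the thresholds and function names are illustrative:

```javascript
// Gate device.update events: send only when the minimum interval has
// elapsed or the battery level moved by a meaningful amount.
// Returns a function that answers "should this reading be sent?".
function makeTelemetryGate({ minIntervalMs = 4000, batteryDelta = 5 } = {}) {
  let lastSent = 0;
  let lastBattery = null;
  return function shouldSend(battery, now = Date.now()) {
    const meaningful = lastBattery === null ||
      Math.abs(battery - lastBattery) >= batteryDelta;
    if (now - lastSent >= minIntervalMs || meaningful) {
      lastSent = now;
      lastBattery = battery;
      return true;
    }
    return false;
  };
}
```

Wire the gate in front of your send call so only readings that pass it become `external-event` messages; this keeps you clear of the 60 messages/minute limit.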
Troubleshooting
Common close codes:
| Close code | Meaning | What to do |
| --- | --- | --- |
| 3000 | Unauthorized | Refresh credentials / token format (`bearer.authorization.amigo.ai.{JWT}`) |
| 3003 | Forbidden | Check user permissions |
| 3008 | Timeout | Send `client.extend-timeout` periodically when idle |
| 4000 | Bad Request | Validate message structure and types |
| 4004 | Not Found | Verify organization/service/conversation IDs |
| 4009 | Conflict | Only one connection per user/service |
| 4015 | Unsupported Media Type | Use PCM for VAD; check audio config |
| 4029 | Rate Limited | Back off and reduce message frequency |
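The close codes above can drive reconnect behavior in an `onclose` handler. A sketch; the mapping of codes to actions follows the table, but the action names and policy are our own:

```javascript
// Map a WebSocket close code from the troubleshooting table to a
// client-side recovery action.
function closeCodeAction(code) {
  switch (code) {
    case 3000: return 'refresh-token';     // Unauthorized
    case 3003: return 'check-permissions'; // Forbidden
    case 3008: return 'reconnect';         // Timeout: resume, add keepalives
    case 4009: return 'close-duplicate';   // Conflict: another connection open
    case 4029: return 'backoff';           // Rate limited: exponential backoff
    default:   return 'inspect';           // 4000/4004/4015: fix the request
  }
}

// Usage: ws.onclose = (e) => handle(closeCodeAction(e.code));
```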