Conversations
Text-based agent conversations over REST and WebSocket - full CRUD lifecycle, persistent freeze-thaw, cross-channel continuity.
The conversations resource provides a unified view across all channels. Voice calls and text conversations are accessible through the same endpoints.
Conversations are first-class REST resources. You create them, send turns, list and inspect them, and close them when you are done. A separate WebSocket endpoint handles real-time streaming chat. Both run the same context graph engine, tools, and safety rules as voice calls, and both share the same conversation persistence - start over REST, resume over WebSocket, or the other way around.
One engine, many transports. REST, WebSocket, SMS, and WhatsApp all feed into the same actor. The agent does not know which transport delivered the message.
Endpoints
POST /v1/{workspace_id}/conversations - Create a conversation (auto-greets by default)
GET /v1/{workspace_id}/conversations - List conversations (paginated, filterable)
GET /v1/{workspace_id}/conversations/{id} - Get conversation detail with turns and plan
POST /v1/{workspace_id}/conversations/{id}/turns - Send a message; returns either a single JSON response or a typed SSE event stream depending on Accept
DELETE /v1/{workspace_id}/conversations/{id} - Close a conversation
POST /v1/{workspace_id}/sessions/start - Materialize an SMS, WhatsApp, or web text session for a known entity
WS /v1/{workspace_id}/sessions/connect - Public bidirectional streaming for text conversations (subprotocol auth)
Architecture
Conversation Lifecycle
Active - Engine running, processing a turn right now.
Frozen - Turn finished, state saved. Ready for the next turn or WebSocket reconnect.
Closed - Terminal: the agent finished or you explicitly closed it. No more turns accepted (409).
After every REST turn, the conversation is automatically frozen. The next POST /{id}/turns thaws it, processes the message, and freezes again. This is invisible to callers - you just keep sending turns.
What Gets Saved
Plan - Natural-language summary of the conversation state, written by an LLM when turns are compressed. Survives platform upgrades without schema migration.
Turns - Last 200 verbatim messages (role, text, timestamp).
Cursor - Channel-specific position marker.
Session Timing (WebSocket)
Idle timeout - 5 minutes
Max duration - 1 hour
Compress on freeze - Yes
Completion Reasons
When a conversation ends, the reason field tells you why:
completed - Agent reached a terminal context graph state
idle_timeout - No messages within the idle window (WebSocket)
max_duration - Hit the one-hour cap (WebSocket)
client_stop - Client sent {"type": "stop"} (WebSocket)
error - Unrecoverable processing error
transport_error - Three consecutive transport failures
REST API
Create a Conversation
Creates a conversation record and returns it. The conversation starts in a usable state - send your first turn immediately after.
Request Body
service_id - string (UUID), required - Which agent to talk to
entity_id - string (UUID), optional - Patient entity ID for world model context
auto_greet - boolean, optional - When true (default), the agent's opening turn is produced as part of conversation creation and returned in turns. Set to false if the caller will send the first user message itself (for example, when replaying a transcript).
Response (201 Created)
201 - Conversation created
503 - Creation timed out (lock contention)
Auto-greet failure is non-fatal. If the agent cannot produce the opening turn (transient error, timeout, or misconfiguration), the conversation row is still created and returned with an empty turns array. Callers can recover by sending the first user message via POST /turns — no retry of the create call is needed.
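The create call and the empty-turns recovery path can be sketched as below. This is a minimal illustration, not the official client: the helper names, the `baseUrl` value, and the Bearer `Authorization` header are assumptions, since this section does not show how REST requests are authenticated.

```javascript
// Hypothetical helper: builds the fetch() arguments for creating a conversation.
// baseUrl and the Bearer Authorization header are assumptions, not documented here.
function buildCreateConversationRequest(baseUrl, workspaceId, apiKey, options) {
  const body = { service_id: options.serviceId };
  if (options.entityId !== undefined) body.entity_id = options.entityId;
  // auto_greet defaults to true on the server, so only send it when overriding.
  if (options.autoGreet === false) body.auto_greet = false;
  return {
    url: `${baseUrl}/v1/${encodeURIComponent(workspaceId)}/conversations`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(body),
    },
  };
}

// Auto-greet failure is non-fatal: an empty turns array means the caller
// should send the first user message instead of retrying the create call.
function needsFirstUserMessage(conversation) {
  return !Array.isArray(conversation.turns) || conversation.turns.length === 0;
}
```

Pass `url` and `init` to `fetch(url, init)`; on a 201 response, `needsFirstUserMessage(await response.json())` indicates whether the greeting was produced.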
Send a Turn
Send a message and get the agent's response. The conversation is thawed, the message runs through the reasoning engine, and the conversation is frozen again before the response returns.
The endpoint negotiates two response shapes by Accept header. The default is the synchronous JSON response below. Pass Accept: text/event-stream to receive token-by-token output and tool-call telemetry as a typed SSE event stream — see Streaming Turns.
Request Body
message - string, required - What the user said (1-10,000 chars)
Response (200 OK, application/json)
The response echoes your input, gives you the agent's output (which can be multiple messages if the agent calls tools), and includes a snapshot of the conversation state.
200 - Turn processed
404 - Conversation not found
409 - Conversation is closed, or another turn is already being processed (see Concurrency below)
503 - Agent service unavailable
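As a rough sketch of the request side, the helper below validates the message length and selects the response shape through the Accept header. The function names and `baseUrl` are hypothetical, authentication headers are left to the caller, and the length check counts JavaScript string units, which is an assumption about how the server counts characters.

```javascript
// Hypothetical helper: validates the message and picks the response shape via Accept.
function buildTurnRequest(baseUrl, workspaceId, conversationId, message, { stream = false } = {}) {
  if (typeof message !== "string" || message.length < 1 || message.length > 10000) {
    throw new RangeError("message must be 1-10,000 characters");
  }
  return {
    url: `${baseUrl}/v1/${workspaceId}/conversations/${conversationId}/turns`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // JSON is the default shape; text/event-stream requests the SSE stream.
        Accept: stream ? "text/event-stream" : "application/json",
      },
      body: JSON.stringify({ message }),
    },
  };
}

// Maps the documented status codes to a coarse outcome for the caller.
function turnOutcome(status) {
  switch (status) {
    case 200: return "ok";
    case 404: return "not_found";
    case 409: return "conflict"; // closed, or another turn is in flight
    case 503: return "unavailable";
    default: return "unexpected";
  }
}
```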
Streaming Turns (SSE)
For low-latency UIs that want to display tokens as they are generated and surface tool-call activity inline, request a Server-Sent Event stream on the same endpoint:
The server keeps the connection open and emits a typed sequence of events (event: lines) terminated by a done event. Concurrency, lock semantics, and persistence behavior are identical to the JSON path — partial responses are persisted on client disconnect and the conversation freezes normally.
Event Schema
Every event carries an event discriminator that maps to a typed payload. SDK consumers using openapi-typescript (or any spec-driven generator) receive a discriminated union (TurnStreamEvent) for compile-time exhaustiveness.
token - One agent response token - { "text": string }
tool_call_started - Agent invoked a tool - { "tool_name": string, "call_id": string, "input": string }
tool_call_completed - Tool returned - { "tool_name": string, "call_id": string, "result": string, "succeeded": boolean }
thinking - Reasoning-tier classification - { "tier": int, "tier_name": string }
message - Final assembled response - { "role": string, "text": string }
done - Terminal event; turn complete - { "conversation_id": string, "status": string, "turn_count": int }
error - Error mid-stream; connection ends - { "message": string }
Example wire format
Streaming behavior
The stream completes when a done event is received. Always treat done as the signal to close the reader; do not rely on connection-close alone.
A mid-stream error event is terminal: no done follows. Inspect message for the failure reason and retry with backoff.
If the client disconnects before done, the server still finishes the turn and persists the partial agent response. The next GET /{id} returns the completed turn.
Standard HTTP status codes (404, 409, 503) are returned before the stream begins. Once the response body starts, the status is 200 and failures arrive as error events.
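The rules above can be sketched as a small incremental parser plus a reducer. This is an illustration under assumptions: it presumes each event's `data:` line carries the JSON payload listed in the event schema, and the names `createSseParser` and `applyTurnEvent` are hypothetical.

```javascript
// Minimal SSE parser sketch: feed() accepts text chunks and returns parsed events.
// Only `event:` and `data:` fields are handled; comments, ids, and retry are ignored.
function createSseParser() {
  let buffer = "";
  return {
    feed(chunk) {
      buffer = (buffer + chunk).replace(/\r\n/g, "\n");
      const events = [];
      let sep;
      // A blank line terminates each event block.
      while ((sep = buffer.indexOf("\n\n")) !== -1) {
        const block = buffer.slice(0, sep);
        buffer = buffer.slice(sep + 2);
        let event = "message";
        const data = [];
        for (const line of block.split("\n")) {
          if (line.startsWith("event:")) event = line.slice(6).trim();
          else if (line.startsWith("data:")) data.push(line.slice(5).trimStart());
        }
        if (data.length > 0) events.push({ event, payload: JSON.parse(data.join("\n")) });
      }
      return events;
    },
  };
}

// Folds events into a turn state. Both done and error are terminal;
// after error no done follows, so stop reading once finished is true.
function applyTurnEvent(state, { event, payload }) {
  switch (event) {
    case "token": return { ...state, text: state.text + payload.text };
    case "message": return { ...state, text: payload.text };
    case "done": return { ...state, finished: true, status: payload.status };
    case "error": return { ...state, finished: true, error: payload.message };
    default: return state; // tool_call_started, tool_call_completed, thinking
  }
}
```

A reader loop would decode each chunk of the response body to text, pass it to `feed()`, apply the returned events, and cancel the reader as soon as `finished` is true.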
Concurrency
REST turns are serialized per conversation. Only one turn can be in flight at a time for a given conversation. Different conversations can process turns simultaneously — the lock is per-conversation, not global.
Send a turn while the previous turn is still processing - 409 Conflict ("Conversation is already active").
Send a turn while a WebSocket session owns the conversation - 409 Conflict ("Conversation is already active").
Send a turn after the previous turn returned - Works normally: thaw, process, freeze.
Send a turn to a closed conversation - 409 Conflict ("Conversation is closed").
GET the conversation while a turn is processing - Works; read endpoints are never blocked.
Distinguishing "busy" from "closed"
Both return 409, but the response body differs:
Check the detail string to decide whether to retry or stop.
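A minimal sketch of that check, assuming the 409 body parses as JSON with a `detail` string holding one of the two messages quoted in the table above (the function name is hypothetical):

```javascript
// Classifies a parsed 409 response body by its detail string.
function classifyConflict(body) {
  const detail = body && typeof body.detail === "string" ? body.detail : "";
  if (detail === "Conversation is closed") return "closed"; // terminal: stop
  if (detail === "Conversation is already active") return "busy"; // retry later
  return "unknown";
}
```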
Retry strategy
Poll GET /v1/{workspace_id}/conversations/{id} and check status:
"active": turn still processing. Wait and retry.
"frozen": previous turn finished. Safe to send the next turn.
"closed": conversation ended. Do not retry.
A simple approach: retry the turn with exponential backoff (1s, 2s, 4s) up to the 60-second turn timeout. Turns typically complete in 2–15 seconds depending on tool calls.
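One way to sketch that backoff loop is shown below. The helper names are hypothetical, `sendTurn` is a caller-supplied function that resolves to `{ status, body }`, and doubling the delay until the 60-second budget is spent is one reading of the schedule above.

```javascript
// Delays of 1s, 2s, 4s, ... whose running total stays within the budget (60s by default).
function backoffDelays(budgetMs = 60000, firstMs = 1000) {
  const delays = [];
  let total = 0;
  for (let d = firstMs; total + d <= budgetMs; d *= 2) {
    delays.push(d);
    total += d;
  }
  return delays;
}

// Retries only the "busy" 409; a closed conversation or any other result
// is returned to the caller immediately. sleep is injectable for testing.
async function sendTurnWithRetry(sendTurn, {
  delays = backoffDelays(),
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
} = {}) {
  for (let attempt = 0; ; attempt++) {
    const res = await sendTurn();
    const busy = res.status === 409 && res.body?.detail === "Conversation is already active";
    if (!busy || attempt >= delays.length) return res;
    await sleep(delays[attempt]);
  }
}
```

With the defaults, `backoffDelays()` yields five waits (1, 2, 4, 8, and 16 seconds), since a sixth wait of 32 seconds would exceed the 60-second budget.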
What if your HTTP client times out?
If your client gives up but the server is still processing, the turn runs to completion. The agent's response is generated, the conversation state is saved, and the conversation freezes normally. Your response is lost, but the state is not — the next GET /{id} will show the completed turn and the agent's reply in the turns array.
Lock release timing
The lock is released the instant the turn response is sent. If the server crashes mid-turn, the lock expires after ~120 seconds, after which new turns can be sent. There is no way to cancel an in-progress turn from the client side.
Do not fire-and-forget concurrent turns. Each turn must complete before the next one is sent. If you need real-time back-and-forth where multiple messages can be sent without waiting, use the WebSocket API instead — it queues messages and processes them in order.
List Conversations
status - string, default: (all) - Filter: active, frozen, or closed
limit - int, default: 20 - Results per page (1-100)
offset - int, default: 0 - Pagination offset
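A small sketch of building the list URL from these parameters, with client-side validation of the documented ranges. `buildListUrl` and `baseUrl` are hypothetical names, not part of the API.

```javascript
// Builds the list URL; rejects values outside the documented ranges before sending.
function buildListUrl(baseUrl, workspaceId, { status, limit = 20, offset = 0 } = {}) {
  if (status !== undefined && !["active", "frozen", "closed"].includes(status)) {
    throw new RangeError("status must be active, frozen, or closed");
  }
  if (!Number.isInteger(limit) || limit < 1 || limit > 100) {
    throw new RangeError("limit must be 1-100");
  }
  if (!Number.isInteger(offset) || offset < 0) {
    throw new RangeError("offset must be >= 0");
  }
  const params = new URLSearchParams({ limit: String(limit), offset: String(offset) });
  if (status !== undefined) params.set("status", status);
  return `${baseUrl}/v1/${workspaceId}/conversations?${params}`;
}
```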
Response
Get Conversation Detail
Returns the full conversation including turns and the compressed plan (if frozen).
200 - Found
404 - Not found
Close a Conversation
Closes the conversation permanently. Subsequent turns return 409.
204 - Closed
404 - Not found or already closed
Example: Full REST Flow
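The following is a hedged sketch of the create, turn, inspect, and close sequence, not the original example. `runConversation`, `baseUrl`, and the caller-supplied `headers` are hypothetical, and reading the identifier from an `id` field of the create response is an assumption, because the response body is not shown in this section.

```javascript
// Sketch of the create -> turn -> inspect -> close flow.
// fetchImpl is injectable (defaults to the global fetch); pass auth via headers.
async function runConversation({ baseUrl, workspaceId, serviceId, message, headers = {}, fetchImpl = fetch }) {
  const root = `${baseUrl}/v1/${workspaceId}/conversations`;
  const jsonHeaders = { "Content-Type": "application/json", ...headers };

  // 1. Create the conversation (auto-greets by default).
  const created = await fetchImpl(root, {
    method: "POST",
    headers: jsonHeaders,
    body: JSON.stringify({ service_id: serviceId }),
  });
  if (created.status !== 201) throw new Error(`create failed: ${created.status}`);
  const conversation = await created.json();
  const id = conversation.id; // assumed field name

  // 2. Send one turn and wait for the synchronous JSON response.
  const turn = await fetchImpl(`${root}/${id}/turns`, {
    method: "POST",
    headers: jsonHeaders,
    body: JSON.stringify({ message }),
  });
  if (turn.status !== 200) throw new Error(`turn failed: ${turn.status}`);
  const reply = await turn.json();

  // 3. Inspect the conversation (turns and plan).
  const detail = await (await fetchImpl(`${root}/${id}`, { headers })).json();

  // 4. Close it; 204 means closed.
  const closed = await fetchImpl(`${root}/${id}`, { method: "DELETE", headers });
  return { id, reply, detail, closedStatus: closed.status };
}
```

Each turn is awaited before the next request is issued, which matches the one-turn-in-flight rule described under Concurrency.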
WebSocket API
Two WebSocket endpoints expose the same wire protocol over different surfaces:
WS /v1/{workspace_id}/sessions/connect - Auth: Sec-WebSocket-Protocol: auth, <token> subprotocol header - Public workspace-scoped streaming for first-party clients (web/mobile apps, the Developer Console playground). Recommended for new integrations.
WS /agent/text-stream - Auth: ?token=... query parameter - Internal/legacy entry point that powers the same engine. Functionally equivalent; kept for backward compatibility.
Both transports share the conversation store and lifecycle. A turn started on /sessions/connect is visible to subsequent REST GET /conversations/{id} calls and can be resumed on either endpoint with the same conversation_id.
Connect (public)
service_id - query, required - Which agent (UUID)
entity_id - query, required - Patient entity for world model context (UUID)
conversation_id - query, optional - Resume an existing conversation (UUID). Omit for a fresh session.
tool_events - query, optional - Emit tool_call_started / tool_call_completed frames. Default true.
credential - Sec-WebSocket-Protocol header, required - Two comma-separated values: the literal auth followed by an API key or JWT. The server echoes auth as the negotiated subprotocol on success.
Credentials never go in the URL. The /sessions/connect endpoint rejects tokens passed as query parameters to avoid leakage in proxies, browser history, and request logs. Browsers can deliver the subprotocol header by passing the token as the protocols argument to the WebSocket constructor:
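A minimal browser-side sketch, assuming a hypothetical `wsBaseUrl` such as a `wss://` origin for your deployment. The helper names are illustrative; only the query parameters and the `["auth", token]` protocols pair come from the description above.

```javascript
// Builds the public connect URL. wsBaseUrl is an assumption; no credential is placed in it.
function buildConnectUrl(wsBaseUrl, workspaceId, { serviceId, entityId, conversationId, toolEvents }) {
  const params = new URLSearchParams({ service_id: serviceId, entity_id: entityId });
  if (conversationId) params.set("conversation_id", conversationId);
  if (toolEvents === false) params.set("tool_events", "false");
  return `${wsBaseUrl}/v1/${workspaceId}/sessions/connect?${params}`;
}

// The credential travels in the Sec-WebSocket-Protocol header via the protocols argument.
function connect(wsBaseUrl, workspaceId, token, options) {
  const socket = new WebSocket(buildConnectUrl(wsBaseUrl, workspaceId, options), ["auth", token]);
  socket.addEventListener("open", () => {
    // The server echoes "auth" as the negotiated subprotocol on success.
    if (socket.protocol !== "auth") socket.close(1000);
  });
  return socket;
}
```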
Close codes
1000 - Normal close
4001 - Missing or malformed parameters (subprotocol header, query params, UUID format, token charset, token length)
4403 - Authentication failed, service not found in workspace, or upstream rejected the upgrade. Same code is returned for all auth-class failures to prevent tenant enumeration.
4503 - Authentication backend (JWKS) temporarily unreachable; retry with backoff
The server enforces a 7,200-second hard cap on a single WebSocket lifetime and a 120-second client-side idle timeout (no inbound frames). Reconnect with the same conversation_id to continue the session.
Connect (legacy)
Persistent bidirectional text chat. The server authenticates, boots the engine, and sends session_started. Then you talk.
token - string, required - JWT or API key
workspace_id - string, required - Workspace ID
service_id - string, required - Which agent
conversation_id - string, optional - Resume a frozen conversation
entity_id - string, optional - Patient entity ID for context
Wire Protocol
Client sends:
{"type": "message", "text": "..."} - Empty text is ignored
{"type": "stop"} - Server sends session_ended and closes
Server sends:
{"type": "session_started", "session_id": "...", "conversation_id": "..."} - First frame. Save conversation_id.
{"type": "typing"} - Show a typing indicator
{"type": "tool_call_started", "tool_name": "...", "call_id": "...", "input": {...}} - Agent started a tool call. Requires tool_events=true.
{"type": "tool_call_completed", "tool_name": "...", "call_id": "...", "result": "...", "succeeded": true} - Tool call finished. Requires tool_events=true.
{"type": "message", "text": "..."} - Agent's response
{"type": "error", "message": "..."} - Something broke. Connection stays open.
{"type": "session_ended", "reason": "..."} - Done. Socket closes after this.
Bad JSON gets {"type": "error", "message": "Invalid JSON"} without dropping the connection.
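One way to consume the server frames is a reducer that folds each frame into a small view state. The state shape and the names `reduceFrame` and `initialState` are hypothetical; the frame types and fields are the ones listed above.

```javascript
// Starting view state for a chat UI.
const initialState = { messages: [], typing: false, ended: false };

// Folds one raw server frame (a JSON string) into the view state.
function reduceFrame(state, raw) {
  let frame;
  try {
    frame = JSON.parse(raw);
  } catch {
    return { ...state, lastError: "unparseable frame" };
  }
  switch (frame.type) {
    case "session_started":
      // First frame: keep conversation_id so the session can be resumed later.
      return { ...state, sessionId: frame.session_id, conversationId: frame.conversation_id };
    case "typing":
      return { ...state, typing: true };
    case "message":
      return { ...state, typing: false, messages: [...state.messages, frame.text] };
    case "error":
      // The connection stays open after an error frame.
      return { ...state, typing: false, lastError: frame.message };
    case "session_ended":
      return { ...state, typing: false, ended: true, reason: frame.reason };
    default:
      return state; // tool events and unknown types are ignored here
  }
}
```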
Message Queuing and Coalescing
Messages sent while the agent is still processing a previous message are queued and processed in order. There is no barge-in — the agent finishes its current response before starting the next one.
SMS and WhatsApp coalesce rapid-fire messages automatically. When multiple messages arrive before the agent starts its next turn, they are joined into a single turn (newline-separated) and the agent responds once to all of them. This prevents awkward split responses when a patient sends "Hi" then "I need to schedule an appointment" in quick succession — the agent sees both as one message and responds coherently.
WebSocket does not coalesce by default. Each message produces its own typing → message response cycle. If you send 3 messages while the agent is busy, you will receive 3 separate responses, in order.
Send a message while agent is responding - Queued. Processed after the current response completes.
Send multiple messages rapidly (SMS/WhatsApp) - Coalesced into one turn. One combined response.
Send multiple messages rapidly (WebSocket) - Each processed as its own turn, in order. One response per message.
Send a message during a tool call - Queued. Processed after the tool call and response complete.
Agent reaches terminal state while messages are queued - Session ends. Queued messages are discarded. You receive session_ended.
Disconnect while messages are queued - Unprocessed messages are discarded. The conversation freezes with whatever turns completed.
This is different from voice calls, where the patient's speech interrupts (barge-in) the agent. In text mode, every message waits its turn.
No delivery acknowledgment. The server does not send a confirmation when your message is received. You know a message was processed when you receive the agent's response. If you need guaranteed delivery tracking, use the REST API — its synchronous request-response model gives you an explicit success/failure for every turn.
Per-connection rate limit: 30 messages per 10 seconds. Exceeding the limit returns {"type": "error", "message": "Rate limit exceeded"} without dropping the connection.
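A client can stay under the limit with a local sliding-window check before each send. This sketch is an assumption about a sensible client policy; the server's exact windowing scheme is not specified here, and `createSendLimiter` is a hypothetical name.

```javascript
// Sliding-window limiter: at most `limit` sends in any `windowMs` span.
// `now` is injectable so the limiter can be tested with a fake clock.
function createSendLimiter({ limit = 30, windowMs = 10000, now = Date.now } = {}) {
  const sent = [];
  return {
    // Returns 0 and records the send if allowed, else the milliseconds to wait.
    tryAcquire() {
      const t = now();
      while (sent.length > 0 && t - sent[0] >= windowMs) sent.shift();
      if (sent.length < limit) {
        sent.push(t);
        return 0;
      }
      return sent[0] + windowMs - t;
    },
  };
}
```

Call `tryAcquire()` before `socket.send(...)`; a non-zero result is the delay after which the oldest send leaves the window.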
Tool Call Events
When tool_events=true is passed as a query parameter on the WebSocket connection, the server emits tool_call_started and tool_call_completed frames when the agent invokes tools (e.g., searching appointments, checking insurance, looking up medications). These events arrive between typing and message frames.
tool_call_started
type - "tool_call_started" - Event type
tool_name - string - Name of the tool being called
call_id - string - Unique identifier for this invocation
input - object - Arguments passed to the tool
tool_call_completed
type - "tool_call_completed" - Event type
tool_name - string - Name of the tool
call_id - string - Same identifier as the corresponding started event
result - string - Tool output (JSON string or plain text)
succeeded - boolean - Whether the tool call succeeded
Example sequence
Multiple tool calls in a single turn produce paired started/completed events for each tool.
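Because each completed frame carries the same call_id as its started frame, a client can pair them with a map. The tracker below is a sketch with hypothetical names; it only relies on the documented frame fields.

```javascript
// Pairs tool_call_started / tool_call_completed frames by call_id.
function createToolCallTracker() {
  const pending = new Map();
  const finished = [];
  return {
    onFrame(frame) {
      if (frame.type === "tool_call_started") {
        pending.set(frame.call_id, { toolName: frame.tool_name, input: frame.input });
      } else if (frame.type === "tool_call_completed") {
        const started = pending.get(frame.call_id);
        pending.delete(frame.call_id);
        finished.push({
          callId: frame.call_id,
          toolName: frame.tool_name,
          input: started ? started.input : undefined,
          result: frame.result,
          succeeded: frame.succeeded,
        });
      }
    },
    // Number of tools still running (started without a matching completed frame).
    pendingCount: () => pending.size,
    // Copy of the completed calls, in completion order.
    finished: () => finished.slice(),
  };
}
```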
REST equivalent
For REST turns, pass ?include_tool_calls=true on POST /{id}/turns. Tool call details are returned in the tool_calls array of the response:
Close Codes
1000 - Normal close - Nothing; session ended cleanly
4001 - Missing params or bad token - Check token, workspace_id, service_id query params
4003 - Token workspace mismatch - The token's workspace does not match the workspace_id param
4200 - Engine init failed - Agent version not published, service misconfigured, or transient error. Retry once, then investigate.
4202 - Reactivation response invalid - Server-side schema drift. Contact support.
4203 - Conversation service unavailable - Materialization or reactivation backend is down. Retry with backoff.
4400 - Invalid conversation_id format - Must be a valid UUID
4403 - Conversation materialization forbidden - Workspace does not allow text conversations
4404 - Conversation not found - conversation_id does not exist, belongs to a different workspace, or has a channel/entity mismatch
4409 - Conversation already active - Another WebSocket or REST client owns this conversation. Wait for it to disconnect or freeze, then reconnect.
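A reconnect policy can be driven from a lookup over these codes. The action labels below are hypothetical names for the guidance in the table, 4409 is treated as a wait-and-retry case, and 4503 is taken from the public endpoint's close codes.

```javascript
// Maps documented close codes to a coarse client action.
const CLOSE_CODE_ACTIONS = {
  1000: "done",
  4001: "fix_request",
  4003: "fix_request",
  4200: "retry_once",
  4202: "contact_support",
  4203: "retry_backoff",
  4400: "fix_request",
  4403: "forbidden",
  4404: "fix_request",
  4409: "retry_backoff", // another client owns the conversation; wait, then reconnect
  4503: "retry_backoff", // public endpoint: auth backend temporarily unreachable
};

function actionForCloseCode(code) {
  return CLOSE_CODE_ACTIONS[code] ?? "unknown";
}
```

In a `close` event handler, `actionForCloseCode(event.code)` then decides whether to reconnect with the same conversation_id or surface the failure.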
Greeting Behavior
New conversation - Agent generates and sends a greeting
Resumed (thawed) - No greeting; agent waits for you to speak
Example: JavaScript
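The sketch below wires an already-opened socket to a few callbacks. It is an illustration rather than the original example: `attachChat` and its handler names are hypothetical, and the socket can be anything that exposes `send()` and `addEventListener("message", ...)`, such as a browser WebSocket opened against either endpoint.

```javascript
// Wires a WebSocket-like object to simple callbacks using the documented frames.
function attachChat(socket, { onReply, onEnd = () => {}, onError = () => {} }) {
  let conversationId = null;
  socket.addEventListener("message", (event) => {
    const frame = JSON.parse(event.data);
    if (frame.type === "session_started") conversationId = frame.conversation_id;
    else if (frame.type === "message") onReply(frame.text);
    else if (frame.type === "error") onError(frame.message); // connection stays open
    else if (frame.type === "session_ended") onEnd(frame.reason);
  });
  return {
    // Sends a user message; empty text is skipped because the server ignores it.
    say(text) {
      if (text.length === 0) return false;
      socket.send(JSON.stringify({ type: "message", text }));
      return true;
    },
    // Asks the server to end the session; it replies with session_ended and closes.
    stop() {
      socket.send(JSON.stringify({ type: "stop" }));
    },
    getConversationId: () => conversationId,
  };
}
```

For a new conversation the first `onReply` call is the agent's greeting, so a client usually waits for it before calling `say()`.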
Example: Resume
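Resuming amounts to reconnecting with the conversation_id from the earlier session_started frame. The helper below is a hypothetical sketch for the public endpoint; `wsBaseUrl` is an assumed value and the credential is still supplied separately through the subprotocol header.

```javascript
// Remembers conversation_id across connections and rebuilds the connect URL.
function createResumableUrlBuilder(wsBaseUrl, workspaceId, serviceId, entityId) {
  let conversationId = null;
  return {
    // Feed every parsed server frame here; only session_started is used.
    remember(frame) {
      if (frame.type === "session_started") conversationId = frame.conversation_id;
    },
    // Fresh session until a conversation_id is known, resume afterwards.
    nextUrl() {
      const params = new URLSearchParams({ service_id: serviceId, entity_id: entityId });
      if (conversationId) params.set("conversation_id", conversationId);
      return `${wsBaseUrl}/v1/${workspaceId}/sessions/connect?${params}`;
    },
  };
}
```

A resumed conversation sends no greeting, so after reconnecting the client speaks first.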
Outbound Text (SMS)
Start an outbound SMS conversation.
phone_to - string (E.164), required - Patient phone number
phone_from - string (E.164), required - Agent phone number (provisioned in workspace)
workspace_id - string, required - Workspace ID
service_id - string, required - Which agent
entity_id - string, optional - Patient entity ID
surface_id - string, optional - Surface to deliver inline
idempotency_key - string, optional - Dedup key (5-minute cache)
Returns session_id, status (created or already_active), and conversation_id. Rate limited to 20 per workspace per 60 seconds. Returns 403 if the patient opted out of SMS.
Inbound SMS and WhatsApp sessions are automatic - no API call needed.
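For the outbound request, a body builder with an E.164 check might look like the sketch below. `buildOutboundSmsBody` is a hypothetical helper, the regular expression reflects the general E.164 shape (a plus sign followed by up to 15 digits), and the endpoint path is deliberately left to the caller because it is not shown in this section.

```javascript
// E.164: "+" followed by 2 to 15 digits, the first of which is non-zero.
const E164 = /^\+[1-9]\d{1,14}$/;

// Builds the outbound SMS request body from the documented fields.
function buildOutboundSmsBody({ phoneTo, phoneFrom, workspaceId, serviceId, entityId, surfaceId, idempotencyKey }) {
  for (const [name, value] of [["phone_to", phoneTo], ["phone_from", phoneFrom]]) {
    if (!E164.test(value)) throw new RangeError(`${name} must be an E.164 number`);
  }
  const body = {
    phone_to: phoneTo,
    phone_from: phoneFrom,
    workspace_id: workspaceId,
    service_id: serviceId,
  };
  if (entityId) body.entity_id = entityId;
  if (surfaceId) body.surface_id = surfaceId;
  // Reusing the same key within the 5-minute cache window deduplicates the request.
  if (idempotencyKey) body.idempotency_key = idempotencyKey;
  return body;
}
```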
World Model Integration
When entity_id is provided, the agent starts with the patient's full world model projection: demographics, medications, allergies, conditions, appointments, insurance. The same clinical tools available during voice calls work here.
Signal Processing
Every turn goes through the same three-step pipeline:
message.inbound
Yes
Full reasoning engine pass
surface.submitted
Yes
Clears wait_for condition
review.approved
Yes
Clears wait_for
timeout.idle / timeout.max
Yes
Session ends
delivery.status
No
Logged only
surface.opened
No
Logged only
Transport failures on one turn do not kill the session. Three consecutive failures do.
Intelligence
Every conversation produces an intelligence record:
quality_score - 0-100, penalty-based
turn_count - Total turns
completion_reason - Why it ended
final_state - Last context graph state
Workspace events: text.started (session created) and text.completed (session ended, with duration, turn count, reason, final state).
Switching Between REST and WebSocket
You can start a conversation over REST and resume it over WebSocket, or vice versa. The conversation ID is the same across both transports. A few things to know:
Can I switch mid-conversation?
Yes. Close the WebSocket (or let the REST turn complete), then use the other transport with the same conversation_id.
How quickly after a WebSocket disconnect can I send a REST turn?
Immediately in most cases. The lock is released during cleanup. If the server crashed, wait up to ~120 seconds for the lock to expire.
Do I need to re-authenticate?
Yes. Each connection or request authenticates independently.
Is the conversation state the same?
Yes. Both transports read from and write to the same conversation store. Turns from one transport are visible to the other.
Graceful Degradation
One transport send fails - Logged, session continues
Three consecutive failures - Session ends (transport_error)
Navigation timeout (>60s) - Turn skipped, session continues
State load fails - Fresh start, state not saved on end
Compression fails on freeze - Saved without plan; raw turns preserved
Engine init fails - WS: close 4200. REST: 503.
Server crashes mid-turn - Conversation freezes with last saved state. Lock expires in ~120s. Resume with next turn or reconnect.

