Conversations

Text-based agent conversations over REST and WebSocket - full CRUD lifecycle, persistent freeze-thaw, cross-channel continuity.

The conversations resource provides a unified view across all channels. Voice calls and text conversations are accessible through the same endpoints.

Conversations are first-class REST resources. You create them, send turns, list and inspect them, and close them when you are done. A separate WebSocket endpoint handles real-time streaming chat. Both run the same context graph engine, tools, and safety rules as voice calls, and both share the same conversation persistence - start over REST, resume over WebSocket, or the other way around.

One engine, many transports. REST, WebSocket, SMS, and WhatsApp all feed into the same actor. The agent does not know which transport delivered the message.

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /v1/{workspace_id}/conversations | Create a conversation (auto-greets by default) |
| GET | /v1/{workspace_id}/conversations | List conversations (paginated, filterable) |
| GET | /v1/{workspace_id}/conversations/{id} | Get conversation detail with turns and plan |
| POST | /v1/{workspace_id}/conversations/{id}/turns | Send a message; returns either a single JSON response or a typed SSE event stream depending on Accept |
| DELETE | /v1/{workspace_id}/conversations/{id} | Close a conversation |
| POST | /v1/{workspace_id}/sessions/start | Materialize an SMS, WhatsApp, or web text session for a known entity |
| WS | /v1/{workspace_id}/sessions/connect | Public bidirectional streaming for text conversations (subprotocol auth) |

Architecture


Conversation Lifecycle

| State | What it means |
| --- | --- |
| Active | Engine running, processing a turn right now |
| Frozen | Turn finished, state saved. Ready for the next turn or WebSocket reconnect. |
| Closed | Terminal: the agent finished or you explicitly closed it. No more turns accepted (409). |

After every REST turn, the conversation is automatically frozen. The next POST /{id}/turns thaws it, processes the message, and freezes again. This is invisible to callers - you just keep sending turns.

What Gets Saved

| Field | What it contains |
| --- | --- |
| Plan | Natural-language summary of the conversation state, written by an LLM when turns are compressed. Survives platform upgrades without schema migration. |
| Turns | Last 200 verbatim messages (role, text, timestamp). |
| Cursor | Channel-specific position marker. |

Session Timing (WebSocket)

| Setting | Value |
| --- | --- |
| Idle timeout | 5 minutes |
| Max duration | 1 hour |
| Compress on freeze | Yes |

Completion Reasons

When a conversation ends, the reason field tells you why:

| Reason | What happened |
| --- | --- |
| completed | Agent reached a terminal context graph state |
| idle_timeout | No messages within the idle window (WebSocket) |
| max_duration | Hit the one-hour cap (WebSocket) |
| client_stop | Client sent {"type": "stop"} (WebSocket) |
| error | Unrecoverable processing error |
| transport_error | Three consecutive transport failures |


REST API

Create a Conversation

Creates a conversation record and returns it. The conversation starts in a usable state - send your first turn immediately after.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| service_id | string (UUID) | Yes | Which agent to talk to |
| entity_id | string (UUID) | No | Patient entity ID for world model context |
| auto_greet | boolean | No | When true (default), the agent's opening turn is produced as part of conversation creation and returned in turns. Set to false if the caller will send the first user message itself (for example, when replaying a transcript). |

Response (201 Created)

| Status | When |
| --- | --- |
| 201 | Conversation created |
| 503 | Creation timed out (lock contention) |

Auto-greet failure is non-fatal. If the agent cannot produce the opening turn (transient error, timeout, or misconfiguration), the conversation row is still created and returned with an empty turns array. Callers can recover by sending the first user message via POST /turns — no retry of the create call is needed.
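
The create call and the auto-greet fallback described above might be wired up roughly as follows. This is a minimal sketch, not the official client: the host `api.example.com`, the bearer-token `Authorization` header, and the helper names are assumptions made for illustration. Only the path and the body fields come from this page.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: placeholder host


def build_create_request(workspace_id, service_id, api_key,
                         entity_id=None, auto_greet=True):
    """Build (but do not send) a POST /v1/{workspace_id}/conversations request."""
    body = {"service_id": service_id, "auto_greet": auto_greet}
    if entity_id is not None:
        body["entity_id"] = entity_id
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/{workspace_id}/conversations",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # assumption: bearer auth; use whatever scheme your workspace requires
            "Authorization": f"Bearer {api_key}",
        },
    )


def needs_first_message(created):
    """True when the create response carries no opening turn.

    That happens with auto_greet=false or when auto-greet failed; in both
    cases the caller sends the first user message instead of retrying create.
    """
    return not created.get("turns")
```

Passing the built request to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected offline.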


Send a Turn

Send a message and get the agent's response. The conversation is thawed, the message runs through the reasoning engine, and the conversation is frozen again before the response returns.

The endpoint negotiates two response shapes by Accept header. The default is the synchronous JSON response below. Pass Accept: text/event-stream to receive token-by-token output and tool-call telemetry as a typed SSE event stream — see Streaming Turns.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| message | string | Yes | What the user said (1-10,000 chars) |

Response (200 OK, application/json)

The response echoes your input, gives you the agent's output (can be multiple messages if the agent calls tools), and includes a snapshot of the conversation state.

| Status | When |
| --- | --- |
| 200 | Turn processed |
| 404 | Conversation not found |
| 409 | Conversation is closed, or another turn is already being processed (see Concurrency below) |
| 503 | Agent service unavailable |
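
A turn request, including the Accept-based choice between the JSON and SSE response shapes, could be assembled like this. It is a sketch under assumptions: the host and the bearer-token header are placeholders, and `build_turn_request` is a hypothetical helper name. The length check mirrors the documented 1-10,000 character limit so obviously invalid messages fail before a request is made.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: placeholder host
MAX_MESSAGE_CHARS = 10_000            # documented upper bound for `message`


def build_turn_request(workspace_id, conversation_id, message, api_key,
                       stream=False):
    """Build (but do not send) a POST .../conversations/{id}/turns request."""
    if not 1 <= len(message) <= MAX_MESSAGE_CHARS:
        raise ValueError("message must be 1-10,000 characters")
    # The response shape is negotiated purely by the Accept header.
    accept = "text/event-stream" if stream else "application/json"
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/{workspace_id}/conversations/{conversation_id}/turns",
        data=json.dumps({"message": message}).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": accept,
            "Authorization": f"Bearer {api_key}",  # assumption: bearer auth
        },
    )
```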


Streaming Turns (SSE)

For low-latency UIs that want to display tokens as they are generated and surface tool-call activity inline, request a Server-Sent Event stream on the same endpoint:

The server keeps the connection open and emits a typed sequence of events (event: lines) terminated by a done event. Concurrency, lock semantics, and persistence behavior are identical to the JSON path — partial responses are persisted on client disconnect and the conversation freezes normally.

Event Schema

Every event carries an event discriminator that maps to a typed payload. SDK consumers using openapi-typescript (or any spec-driven generator) receive a discriminated union (TurnStreamEvent) for compile-time exhaustiveness.

| Event | When | Payload |
| --- | --- | --- |
| token | One agent response token | { "text": string } |
| tool_call_started | Agent invoked a tool | { "tool_name": string, "call_id": string, "input": string } |
| tool_call_completed | Tool returned | { "tool_name": string, "call_id": string, "result": string, "succeeded": boolean } |
| thinking | Reasoning-tier classification | { "tier": int, "tier_name": string } |
| message | Final assembled response | { "role": string, "text": string } |
| done | Terminal event; turn complete | { "conversation_id": string, "status": string, "turn_count": int } |
| error | Error mid-stream; connection ends | { "message": string } |

Example wire format

Streaming behavior

  • The stream completes when a done event is received. Always treat done as the signal to close the reader; do not rely on connection-close alone.

  • A mid-stream error event is terminal — no done follows. Inspect message for the failure reason and retry with backoff.

  • If the client disconnects before done, the server still finishes the turn and persists the partial agent response. The next GET /{id} returns the completed turn.

  • Standard HTTP status codes (404, 409, 503) are returned before the stream begins. Once the response body starts, status is 200 and failures arrive as error events.
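
The rules above translate into a small reader loop. The sketch below parses standard SSE framing (`event:` and `data:` fields, blank line as the dispatch boundary) and stops at the first `done` or `error` event rather than waiting for the connection to close. It assumes each event's `data` is a single JSON object matching the payloads in the event schema; the function names are illustrative.

```python
import json

TERMINAL_EVENTS = {"done", "error"}


def parse_sse(lines):
    """Yield (event, payload) pairs from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line dispatches the buffered event
            if event is not None:
                yield event, (json.loads("\n".join(data)) if data else {})
                if event in TERMINAL_EVENTS:
                    return  # stop reading; do not wait for connection close
            event, data = None, []
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # SSE strips one leading space from the value
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


def collect_turn(lines):
    """Assemble the agent reply from a stream; raise on a mid-stream error."""
    tokens, final, status = [], None, None
    for event, payload in parse_sse(lines):
        if event == "token":
            tokens.append(payload["text"])
        elif event == "message":
            final = payload["text"]
        elif event == "done":
            status = payload["status"]
        elif event == "error":
            raise RuntimeError(payload["message"])
    # Prefer the final assembled message; fall back to the joined tokens.
    text = final if final is not None else "".join(tokens)
    return {"text": text, "status": status}
```

In a real client, `lines` would be the decoded lines of the HTTP response body. Callers that catch the `RuntimeError` can retry with backoff, as recommended above.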


Concurrency

REST turns are serialized per conversation. Only one turn can be in flight at a time for a given conversation. Different conversations can process turns simultaneously — the lock is per-conversation, not global.

| Scenario | What happens |
| --- | --- |
| Send a turn while the previous turn is still processing | 409 Conflict: "Conversation is already active". |
| Send a turn while a WebSocket session owns the conversation | 409 Conflict: "Conversation is already active". |
| Send a turn after the previous turn returned | Works normally: thaw, process, freeze. |
| Send a turn to a closed conversation | 409 Conflict: "Conversation is closed". |
| GET the conversation while a turn is processing | Works. Read endpoints are never blocked. |

Distinguishing "busy" from "closed"

Both return 409, but the response body differs:

Check the detail string to decide whether to retry or stop.
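
One way to branch on the two 409 cases is sketched below. It assumes the error body is a JSON object with a top-level `detail` string carrying the messages from the concurrency table; that shape is an assumption, so check it against a real response before relying on it.

```python
import json

BUSY_DETAIL = "Conversation is already active"
CLOSED_DETAIL = "Conversation is closed"


def classify_conflict(status_code, body_text):
    """Return 'busy', 'closed', 'unknown', or 'not_conflict' for a response."""
    if status_code != 409:
        return "not_conflict"
    try:
        # assumption: the body looks like {"detail": "..."}
        detail = json.loads(body_text).get("detail", "")
    except (ValueError, AttributeError):
        detail = ""  # body was not JSON, or not a JSON object
    if detail == CLOSED_DETAIL:
        return "closed"   # terminal: stop sending turns
    if detail == BUSY_DETAIL:
        return "busy"     # transient: retry later
    return "unknown"
```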

Retry strategy

Poll GET /v1/{workspace_id}/conversations/{id} and check status:

  • "active" — turn still processing. Wait and retry.

  • "frozen" — previous turn finished. Safe to send the next turn.

  • "closed" — conversation ended. Do not retry.

A simple approach: retry the turn with exponential backoff (1s, 2s, 4s) up to the 60-second turn timeout. Turns typically complete in 2–15 seconds depending on tool calls.
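
The backoff approach can be sketched as a delay generator plus a retry loop. `send_turn` and `sleep` are hypothetical callables supplied by the caller: `send_turn` is assumed to return "busy" when the server answered 409 with the already-active message and any other value otherwise. The 60-second budget mirrors the turn timeout mentioned above.

```python
def backoff_delays(first=1.0, factor=2.0, budget=60.0):
    """Yield doubling delays while their running total stays within budget."""
    delay, total = first, 0.0
    while total + delay <= budget:
        yield delay
        total += delay
        delay *= factor


def send_with_retry(send_turn, sleep, budget=60.0):
    """Retry a busy turn with exponential backoff; return the last outcome."""
    outcome = send_turn()
    for delay in backoff_delays(budget=budget):
        if outcome != "busy":
            break  # success, closed, or another error: stop retrying
        sleep(delay)
        outcome = send_turn()
    return outcome
```

With the defaults, the waits are 1, 2, 4, 8, and 16 seconds (31 seconds in total); a further 32-second wait would exceed the budget, so the loop gives up and returns "busy".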

What if your HTTP client times out?

If your client gives up but the server is still processing, the turn runs to completion. The agent's response is generated, the conversation state is saved, and the conversation freezes normally. Your response is lost, but the state is not — the next GET /{id} will show the completed turn and the agent's reply in the turns array.

Lock release timing

The lock is released the instant the turn response is sent. If the server crashes mid-turn, the lock expires after ~120 seconds, after which new turns can be sent. There is no way to cancel an in-progress turn from the client side.

List Conversations

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| status | string | (all) | Filter: active, frozen, or closed |
| limit | int | 20 | Results per page (1-100) |
| offset | int | 0 | Pagination offset |
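
As an illustration of the query parameters, the helper below builds a list URL and rejects values outside the documented ranges before any request is made. The host is a placeholder and `list_conversations_url` is a hypothetical name; only the path and parameter names come from this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # assumption: placeholder host
VALID_STATUSES = {"active", "frozen", "closed"}


def list_conversations_url(workspace_id, status=None, limit=20, offset=0):
    """Build the GET /v1/{workspace_id}/conversations URL with validated params."""
    if status is not None and status not in VALID_STATUSES:
        raise ValueError("status must be active, frozen, or closed")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    params = {"limit": limit, "offset": offset}
    if status is not None:
        params["status"] = status  # omit the filter to list all statuses
    return f"{BASE_URL}/v1/{workspace_id}/conversations?{urlencode(params)}"
```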

Response


Get Conversation Detail

Returns the full conversation including turns and the compressed plan (if frozen).

| Status | When |
| --- | --- |
| 200 | Found |
| 404 | Not found |


Close a Conversation

Closes the conversation permanently. Subsequent turns return 409.

| Status | When |
| --- | --- |
| 204 | Closed |
| 404 | Not found or already closed |


Example: Full REST Flow


WebSocket API

Two WebSocket endpoints expose the same wire protocol over different surfaces:

| Endpoint | Auth | When to use |
| --- | --- | --- |
| WS /v1/{workspace_id}/sessions/connect | Sec-WebSocket-Protocol: auth, <token> subprotocol header | Public workspace-scoped streaming for first-party clients (web/mobile apps, the Developer Console playground). Recommended for new integrations. |
| WS /agent/text-stream | ?token=... query parameter | Internal/legacy entry point that powers the same engine. Functionally equivalent; kept for backward compatibility. |

Both transports share the conversation store and lifecycle. A turn started on /sessions/connect is visible to subsequent REST GET /conversations/{id} calls and can be resumed on either endpoint with the same conversation_id.

Connect (public)

| Parameter | Location | Required | Description |
| --- | --- | --- | --- |
| service_id | query | Yes | Which agent (UUID) |
| entity_id | query | Yes | Patient entity for world model context (UUID) |
| conversation_id | query | No | Resume an existing conversation (UUID). Omit for a fresh session. |
| tool_events | query | No | Emit tool_call_started / tool_call_completed frames. Default true. |
| credential | Sec-WebSocket-Protocol header | Yes | Two comma-separated values: the literal auth followed by an API key or JWT. The server echoes auth as the negotiated subprotocol on success. |

Close codes

| Code | Meaning |
| --- | --- |
| 1000 | Normal close |
| 4001 | Missing or malformed parameters (subprotocol header, query params, UUID format, token charset, token length) |
| 4403 | Authentication failed, service not found in workspace, or upstream rejected the upgrade. Same code is returned for all auth-class failures to prevent tenant enumeration. |
| 4503 | Authentication backend (JWKS) temporarily unreachable; retry with backoff |

The server enforces a 7,200-second hard cap on a single WebSocket lifetime and a 120-second client-side idle timeout (no inbound frames). Reconnect with the same conversation_id to continue the session.
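
To make the public connect handshake concrete, the sketch below builds the connection URL and the subprotocol list that most WebSocket client libraries turn into the `Sec-WebSocket-Protocol: auth, <token>` header. The `wss://api.example.com` host and the function name are assumptions. The UUID check is a lenient client-side guard against the malformed-parameter close code, not a copy of the server's validation.

```python
import uuid
from urllib.parse import urlencode

WS_BASE = "wss://api.example.com"  # assumption: placeholder host


def build_connect(workspace_id, service_id, entity_id, token,
                  conversation_id=None, tool_events=True):
    """Return (url, subprotocols) for WS /v1/{workspace_id}/sessions/connect."""
    for value in (service_id, entity_id, conversation_id):
        if value is not None:
            uuid.UUID(value)  # raises ValueError on a malformed UUID
    params = {
        "service_id": service_id,
        "entity_id": entity_id,
        "tool_events": "true" if tool_events else "false",
    }
    if conversation_id is not None:
        params["conversation_id"] = conversation_id  # resume instead of starting fresh
    url = f"{WS_BASE}/v1/{workspace_id}/sessions/connect?{urlencode(params)}"
    # The literal "auth" first, then the API key or JWT.
    return url, ["auth", token]
```

To continue a session after an idle timeout or the lifetime cap, call the helper again with the `conversation_id` received earlier.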

Connect (legacy)

Persistent bidirectional text chat. The server authenticates, boots the engine, and sends session_started. Then you talk.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| token | string | Yes | JWT or API key |
| workspace_id | string | Yes | Workspace ID |
| service_id | string | Yes | Which agent |
| conversation_id | string | No | Resume a frozen conversation |
| entity_id | string | No | Patient entity ID for context |

Wire Protocol

Client sends:

| Frame | Notes |
| --- | --- |
| {"type": "message", "text": "..."} | Empty text is ignored |
| {"type": "stop"} | Server sends session_ended and closes |

Server sends:

| Frame | When |
| --- | --- |
| {"type": "session_started", "session_id": "...", "conversation_id": "..."} | First frame. Save conversation_id. |
| {"type": "typing"} | Show a typing indicator |
| {"type": "tool_call_started", "tool_name": "...", "call_id": "...", "input": {...}} | Agent started a tool call. Requires tool_events=true. |
| {"type": "tool_call_completed", "tool_name": "...", "call_id": "...", "result": "...", "succeeded": true} | Tool call finished. Requires tool_events=true. |
| {"type": "message", "text": "..."} | Agent's response |
| {"type": "error", "message": "..."} | Something broke. Connection stays open. |
| {"type": "session_ended", "reason": "..."} | Done. Socket closes after this. |

Bad JSON gets {"type": "error", "message": "Invalid JSON"} without dropping the connection.
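
A client-side dispatcher for these frames might look like the sketch below. It is transport-agnostic: `handle_frame` takes the raw text of one server frame and updates a plain dictionary, so it can sit behind any WebSocket library. The state layout and the function names are illustrative, not part of the protocol.

```python
import json


def new_state():
    """Fresh client-side view of a text session."""
    return {"conversation_id": None, "typing": False,
            "replies": [], "errors": [], "ended": None}


def encode_message(text):
    """Serialize a client message frame."""
    return json.dumps({"type": "message", "text": text})


def handle_frame(raw, state):
    """Apply one server frame to the state and return the state."""
    frame = json.loads(raw)
    kind = frame.get("type")
    if kind == "session_started":
        state["conversation_id"] = frame["conversation_id"]  # save for resume
    elif kind == "typing":
        state["typing"] = True
    elif kind == "message":
        state["typing"] = False
        state["replies"].append(frame["text"])
    elif kind == "error":
        state["errors"].append(frame["message"])  # connection stays open
    elif kind == "session_ended":
        state["ended"] = frame["reason"]  # the socket closes after this frame
    return state
```

Unknown frame types are ignored, which keeps the client tolerant of frames it does not render, such as tool call events.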

Message Queuing and Coalescing

Messages sent while the agent is still processing a previous message are queued and processed in order. There is no barge-in — the agent finishes its current response before starting the next one.

SMS and WhatsApp coalesce rapid-fire messages automatically. When multiple messages arrive before the agent starts its next turn, they are joined into a single turn (newline-separated) and the agent responds once to all of them. This prevents awkward split responses when a patient sends "Hi" then "I need to schedule an appointment" in quick succession — the agent sees both as one message and responds coherently.

WebSocket does not coalesce by default. Each message produces its own typing → message response cycle. If you send 3 messages while the agent is busy, you will receive 3 separate responses, in order.

| Scenario | What happens |
| --- | --- |
| Send a message while agent is responding | Queued. Processed after the current response completes. |
| Send multiple messages rapidly (SMS/WhatsApp) | Coalesced into one turn. One combined response. |
| Send multiple messages rapidly (WebSocket) | Each processed as its own turn, in order. One response per message. |
| Send a message during a tool call | Queued. Processed after the tool call and response complete. |
| Agent reaches terminal state while messages are queued | Session ends. Queued messages are discarded. You receive session_ended. |
| Disconnect while messages are queued | Unprocessed messages are discarded. The conversation freezes with whatever turns completed. |

This is different from voice calls, where the patient's speech interrupts (barge-in) the agent. In text mode, every message waits its turn.
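
The difference between the two queuing policies can be shown with a toy model. This is not the server's implementation; it only illustrates the documented behavior that SMS and WhatsApp join pending messages with newlines into one turn, while WebSocket keeps one turn per message. The channel names used as keys are assumptions.

```python
COALESCING_CHANNELS = {"sms", "whatsapp"}  # assumption: illustrative channel keys


def turns_for(channel, pending):
    """Return the turns the agent would see for messages queued while busy."""
    if not pending:
        return []
    if channel in COALESCING_CHANNELS:
        # All pending messages become a single newline-separated turn.
        return ["\n".join(pending)]
    # WebSocket: each message stays its own turn, in arrival order.
    return list(pending)
```

For the "Hi" then "I need to schedule an appointment" case described earlier, the SMS path yields one turn and therefore one response, while the WebSocket path yields two.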

Per-connection rate limit: 30 messages per 10 seconds. Exceeding the limit returns {"type": "error", "message": "Rate limit exceeded"} without dropping the connection.
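
Clients that send programmatically may want to stay under the limit rather than handle the error frame. The sketch below is a client-side sliding-window limiter sized to the documented 30 messages per 10 seconds. Whether the server counts with a sliding or a fixed window is not stated here, so treat the window semantics as an assumption.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_messages sends within any window_seconds span."""

    def __init__(self, max_messages=30, window_seconds=10.0):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps of recent sends, oldest first

    def try_send(self, now):
        """Record a send at time `now` (seconds) if allowed; return True or False."""
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.max_messages:
            return False  # caller should wait before sending
        self._sent.append(now)
        return True
```

In practice `now` would come from `time.monotonic()`; it is a parameter here so the behavior is easy to test.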

Tool Call Events

When tool_events=true is passed as a query parameter on the WebSocket connection, the server emits tool_call_started and tool_call_completed frames when the agent invokes tools (e.g., searching appointments, checking insurance, looking up medications). These events arrive between typing and message frames.

tool_call_started

| Field | Type | Description |
| --- | --- | --- |
| type | "tool_call_started" | Event type |
| tool_name | string | Name of the tool being called |
| call_id | string | Unique identifier for this invocation |
| input | object | Arguments passed to the tool |

tool_call_completed

| Field | Type | Description |
| --- | --- | --- |
| type | "tool_call_completed" | Event type |
| tool_name | string | Name of the tool |
| call_id | string | Same identifier as the corresponding started event |
| result | string | Tool output (JSON string or plain text) |
| succeeded | boolean | Whether the tool call succeeded |

Example sequence

Multiple tool calls in a single turn produce paired started/completed events for each tool.
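
Because started and completed frames share a `call_id`, a client can join them into one record per invocation. The sketch below does that over already-decoded frames. The record layout is an illustrative choice, and any tool names in test data are made up.

```python
def pair_tool_calls(frames):
    """Join started/completed frames by call_id.

    Returns (calls, unfinished): one record per completed call in completion
    order, plus the call_ids that started but never completed.
    """
    started, calls = {}, []
    for frame in frames:
        kind = frame.get("type")
        if kind == "tool_call_started":
            started[frame["call_id"]] = frame
        elif kind == "tool_call_completed":
            begin = started.pop(frame["call_id"], None)
            calls.append({
                "call_id": frame["call_id"],
                "tool_name": frame["tool_name"],
                # input is only known if the matching started frame was seen
                "input": begin["input"] if begin else None,
                "result": frame["result"],
                "succeeded": frame["succeeded"],
            })
    return calls, list(started)
```

Frames of other types, such as typing and message, pass through untouched, so the whole frame log of a turn can be fed in as-is.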

REST equivalent

For REST turns, pass ?include_tool_calls=true on POST /{id}/turns. Tool call details are returned in the tool_calls array of the response:

Close Codes

| Code | Meaning | What to do |
| --- | --- | --- |
| 1000 | Normal close | Nothing; the session ended cleanly |
| 4001 | Missing params or bad token | Check token, workspace_id, service_id query params |
| 4003 | Token workspace mismatch | The token's workspace does not match the workspace_id param |
| 4200 | Engine init failed | Agent version not published, service misconfigured, or transient error. Retry once, then investigate. |
| 4202 | Reactivation response invalid | Server-side schema drift. Contact support. |
| 4203 | Conversation service unavailable | Materialization or reactivation backend is down. Retry with backoff. |
| 4400 | Invalid conversation_id format | Must be a valid UUID |
| 4403 | Conversation materialization forbidden | Workspace does not allow text conversations |
| 4404 | Conversation not found | conversation_id does not exist, belongs to a different workspace, or has a channel/entity mismatch |
| 4409 | Conversation already active | Another WebSocket or REST client owns this conversation. Wait for it to disconnect or freeze, then reconnect. |
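
A reconnect policy can be driven directly from this table. The mapping below condenses the "What to do" column into a handful of action labels; the labels themselves are invented for illustration, and unknown codes fall through to a conservative default.

```python
CLOSE_ACTIONS = {
    1000: "done",                 # clean end, nothing to do
    4001: "fix_request",          # missing params or bad token
    4003: "fix_request",          # token workspace mismatch
    4200: "retry_once",           # engine init failed, possibly transient
    4202: "contact_support",      # server-side schema drift
    4203: "retry_with_backoff",   # conversation service unavailable
    4400: "fix_request",          # conversation_id is not a valid UUID
    4403: "forbidden",            # workspace does not allow text conversations
    4404: "not_found",            # unknown or mismatched conversation
    4409: "wait_then_reconnect",  # another client owns the conversation
}


def close_action(code):
    """Map a WebSocket close code to a client action label."""
    return CLOSE_ACTIONS.get(code, "investigate")


def should_reconnect(code):
    """True for codes where an automatic retry is reasonable."""
    return close_action(code) in {
        "retry_once", "retry_with_backoff", "wait_then_reconnect"}
```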

Greeting Behavior

| Situation | What happens |
| --- | --- |
| New conversation | Agent generates and sends a greeting |
| Resumed (thawed) | No greeting; the agent waits for you to speak |

Example: JavaScript

Example: Resume


Outbound Text (SMS)

Start an outbound SMS conversation.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| phone_to | string (E.164) | Yes | Patient phone number |
| phone_from | string (E.164) | Yes | Agent phone number (provisioned in workspace) |
| workspace_id | string | Yes | Workspace ID |
| service_id | string | Yes | Which agent |
| entity_id | string | No | Patient entity ID |
| surface_id | string | No | Surface to deliver inline |
| idempotency_key | string | No | Dedup key (5-minute cache) |

Returns session_id, status (created or already_active), and conversation_id. Rate limited to 20 per workspace per 60 seconds. Returns 403 if the patient opted out of SMS.
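
A request body for the outbound call could be prepared as in the sketch below, which validates both phone numbers against the E.164 shape (a plus sign followed by up to 15 digits, not starting with zero) and includes optional fields only when set. The function name is hypothetical, and the sketch builds the payload only; it does not assume an endpoint path.

```python
import re

E164 = re.compile(r"\+[1-9]\d{1,14}")  # plus sign, then 2 to 15 digits


def build_outbound_sms(phone_to, phone_from, workspace_id, service_id,
                       entity_id=None, surface_id=None, idempotency_key=None):
    """Return the JSON-ready body for starting an outbound SMS conversation."""
    for label, number in (("phone_to", phone_to), ("phone_from", phone_from)):
        if not E164.fullmatch(number):
            raise ValueError(f"{label} must be in E.164 format")
    payload = {
        "phone_to": phone_to,
        "phone_from": phone_from,
        "workspace_id": workspace_id,
        "service_id": service_id,
    }
    optional = {"entity_id": entity_id, "surface_id": surface_id,
                "idempotency_key": idempotency_key}
    # Send optional fields only when the caller provided them.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Reusing the same `idempotency_key` for a retried request within the 5-minute cache window is what lets the server deduplicate it.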

Inbound SMS and WhatsApp sessions are automatic - no API call needed.


World Model Integration

When entity_id is provided, the agent starts with the patient's full world model projection: demographics, medications, allergies, conditions, appointments, insurance. The same clinical tools available during voice calls work here.


Signal Processing

Every turn goes through the same three-step pipeline:

| Signal | Turn? | What happens |
| --- | --- | --- |
| message.inbound | Yes | Full reasoning engine pass |
| surface.submitted | Yes | Clears wait_for condition |
| review.approved | Yes | Clears wait_for |
| timeout.idle / timeout.max | Yes | Session ends |
| delivery.status | No | Logged only |
| surface.opened | No | Logged only |

Transport failures on one turn do not kill the session. Three consecutive failures do.
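
The "three consecutive failures" rule amounts to a counter that resets on any success. The toy model below shows that logic; it is not the platform's code, and the class name is invented.

```python
class TransportHealth:
    """Track consecutive transport failures for one session."""

    def __init__(self, limit=3):
        self.limit = limit
        self.consecutive = 0

    def record(self, ok):
        """Record one send result; return True while the session should continue."""
        # Any successful send resets the streak.
        self.consecutive = 0 if ok else self.consecutive + 1
        return self.consecutive < self.limit
```

A session that alternates failure and success indefinitely never ends under this rule; only an unbroken run of three failures produces the transport_error completion reason.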


Intelligence

Every conversation produces an intelligence record:

| Field | Description |
| --- | --- |
| quality_score | 0-100, penalty-based |
| turn_count | Total turns |
| completion_reason | Why it ended |
| final_state | Last context graph state |

Workspace events: text.started (session created) and text.completed (session ended, with duration, turn count, reason, final state).


Switching Between REST and WebSocket

You can start a conversation over REST and resume it over WebSocket, or vice versa. The conversation ID is the same across both transports. A few things to know:

| Question | Answer |
| --- | --- |
| Can I switch mid-conversation? | Yes. Close the WebSocket (or let the REST turn complete), then use the other transport with the same conversation_id. |
| How quickly after a WebSocket disconnect can I send a REST turn? | Immediately in most cases. The lock is released during cleanup. If the server crashed, wait up to ~120 seconds for the lock to expire. |
| Do I need to re-authenticate? | Yes. Each connection or request authenticates independently. |
| Is the conversation state the same? | Yes. Both transports read from and write to the same conversation store. Turns from one transport are visible to the other. |

Graceful Degradation

| Failure | What happens |
| --- | --- |
| One transport send fails | Logged, session continues |
| Three consecutive failures | Session ends (transport_error) |
| Navigation timeout (>60s) | Turn skipped, session continues |
| State load fails | Fresh start, state not saved on end |
| Compression fails on freeze | Saved without plan; raw turns preserved |
| Engine init fails | WS: close 4200. REST: 503. |
| Server crashes mid-turn | Conversation freezes with last saved state. Lock expires in ~120s. Resume with next turn or reconnect. |
