Clinical Copilot
Real-time clinical copilot WebSocket for streaming encounter documentation, SOAP notes, ICD-10 coding, and clinical alerts from browser audio.
The clinical copilot WebSocket streams browser microphone audio to the platform, which transcribes it in real time, updates encounter documentation, and pushes structured results back over the same connection. The result is a live encounter documentation system: SOAP notes build incrementally, ICD-10 codes are suggested as diagnoses emerge, and safety alerts surface during the encounter.
You can connect directly to the WebSocket to build a custom clinical documentation frontend. Browser applications should create sessions through their own backend, pass only the needed connection values to the browser, then stream raw PCM16 audio.
Conceptual overview. For how clinical documentation fits into the platform's channel architecture, encounter lifecycle, and post-encounter automation, see Clinical Copilot in the conceptual docs.
Architecture
The clinical copilot is a documentation stream, not a conversational voice agent. It observes encounter audio and emits structured documentation events; it does not generate spoken responses to the patient.
The flow for each spoken utterance is:
Browser sends PCM16 audio over WebSocket
Platform emits interim and finalized transcript events
Platform updates structured encounter documentation
Structured results are pushed back to the client as JSON messages
Encounter state remains queryable through the Data & World Model API
Speaker attribution can identify clinician and patient transcript segments without manual tagging. Speaker identity helps interpret clinical content:
Clinician (prescriber): medication mentions are treated as prescription intent, triggering allergy and interaction checks
Patient (reporter): medication mentions are treated as current medication reports, updating the medication list
Staff (observer): measurements and observations are documented as objective data
Connection
Current endpoint:
Query parameters:
workspace_id (required): Workspace UUID
service_id (required): Service ID for the workspace's configured clinical copilot service
patient_entity_id (optional): World model entity ID for the patient. When provided, loads patient context and enables clinical cross-referencing.
practitioner_entity_id (optional): Entity ID for the practitioner conducting the encounter.
token (required): JWT or API key. Can also be passed via the Sec-WebSocket-Protocol subprotocol header as auth, <token>.
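As a rough illustration of how these parameters combine, the sketch below assembles a connection URL on the backend. The endpoint itself is not given in this section, so the base URL is passed in by the caller and the one used in the usage note is a placeholder. The helper names build_copilot_url and auth_subprotocols are hypothetical and not part of the platform.

```python
from urllib.parse import urlencode


def build_copilot_url(base_url, workspace_id, service_id, token=None,
                      patient_entity_id=None, practitioner_entity_id=None):
    """Return the WebSocket URL with the documented query parameters.

    base_url is supplied by the caller; this section does not state it.
    Pass token=None when the token travels in the subprotocol header.
    """
    params = {"workspace_id": workspace_id, "service_id": service_id}
    # Optional parameters are only included when a value is provided.
    if patient_entity_id:
        params["patient_entity_id"] = patient_entity_id
    if practitioner_entity_id:
        params["practitioner_entity_id"] = practitioner_entity_id
    if token:
        params["token"] = token
    return f"{base_url}?{urlencode(params)}"


def auth_subprotocols(token):
    """Subprotocol values that produce 'Sec-WebSocket-Protocol: auth, <token>'."""
    return ["auth", token]
```

For example, build_copilot_url("wss://copilot.example.invalid/stream", "ws-1", "svc-1", token="t") yields a URL ending in ?workspace_id=ws-1&service_id=svc-1&token=t. Keeping this on the backend matches the advice to hand the browser only the values it needs.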
Browser Integration Pattern
For browser-based clinical documentation, keep workspace credentials and service discovery on your backend. The backend should validate the signed-in clinician, choose the workspace's clinical copilot service, and return only the short-lived WebSocket URL or connection values needed by the browser.
Documentation behavior and medical vocabulary are not sent as WebSocket query parameters. Workspace-level behavior is configured through Copilot Settings; runtime vocabulary is resolved from the configured workspace and encounter context.
Authentication
Two auth methods are supported:
Query parameter: pass the token as ?token=<jwt_or_api_key>
Subprotocol header: set Sec-WebSocket-Protocol: auth, <token> in the WebSocket handshake
The token must be scoped to the specified workspace_id. Workspace API keys and identity JWTs are both accepted.
Close Codes
4000: Missing workspace_id or service_id, or invalid workspace_id format
4001: Missing or invalid authentication token
4003: Token not authorized for the specified workspace
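A client might translate these codes into user-facing messages along the following lines. The retry flag is an assumption of this sketch rather than documented behavior: the three listed codes all point at bad parameters or credentials, so retrying with unchanged values seems unlikely to succeed. The function name describe_close is hypothetical.

```python
# Close codes and meanings as listed above.
CLOSE_CODES = {
    4000: "Missing workspace_id or service_id, or invalid workspace_id format",
    4001: "Missing or invalid authentication token",
    4003: "Token not authorized for the specified workspace",
}


def describe_close(code):
    """Return (message, retry_with_same_values) for a WebSocket close code.

    Assumption: the documented 4xxx codes are not worth retrying unchanged;
    any other code is treated as a possibly transient disconnect.
    """
    if code in CLOSE_CODES:
        return CLOSE_CODES[code], False
    return f"Connection closed with code {code}", True
```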
Message Flow
Client Messages
The client sends two types of messages: binary audio frames and JSON control messages.
Audio Frames
Send raw PCM16 mono audio as binary WebSocket frames. Default sample rate is 16kHz. To use a different sample rate, send an audio_config message first.
Browser clients typically capture microphone audio through an AudioWorklet, emit short PCM16 frames, and send an audio_config message if the browser audio context is not running at 16kHz.
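To make the frame format concrete, here is a small sketch that converts floating-point samples (the form most audio capture APIs produce) into PCM16 bytes. Little-endian byte order is an assumption, since this section only says "raw PCM16 mono"; the function names are hypothetical.

```python
import struct


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to signed 16-bit PCM bytes.

    Assumption: little-endian byte order ('<h'); the section does not state it.
    """
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clamp out-of-range samples
        # Scale negatives by 32768 and positives by 32767 to fill the int16 range.
        ints.append(int(s * 32768) if s < 0 else int(s * 32767))
    return struct.pack(f"<{len(ints)}h", *ints)


def frame_duration_ms(frame, sample_rate=16000):
    """Duration of a mono PCM16 frame in milliseconds (2 bytes per sample)."""
    return len(frame) * 1000 / (2 * sample_rate)
```

A 640-byte frame at the default 16kHz rate holds 320 samples, or 20 ms of audio.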
Control Messages (JSON)
audio_config (payload: sample_rate, integer): Set the input audio sample rate. Send before streaming audio if not using 16kHz.
mute (no payload): Pause audio processing. Audio frames are still accepted but not transcribed.
unmute (no payload): Resume audio processing.
stop (no payload): End the session gracefully. Triggers cleanup and session_end event.
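The control messages above could be serialized roughly as follows. This section does not show the exact client JSON shape, so the sketch assumes that client messages carry the message name in a type field, mirroring server events, and that sample_rate sits at the top level. The helper name control_message is hypothetical.

```python
import json

CONTROL_TYPES = {"audio_config", "mute", "unmute", "stop"}


def control_message(kind, sample_rate=None):
    """Serialize a control message as JSON text.

    Assumption: the message name goes in a "type" field and sample_rate is a
    top-level key; the exact shape is not shown in this section.
    """
    if kind not in CONTROL_TYPES:
        raise ValueError(f"unknown control message: {kind}")
    message = {"type": kind}
    if kind == "audio_config":
        if not isinstance(sample_rate, int) or sample_rate <= 0:
            raise ValueError("audio_config needs a positive integer sample_rate")
        message["sample_rate"] = sample_rate
    return json.dumps(message)
```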
Server Events
All server messages are JSON objects with a type field.
session_start
Sent immediately after connection is established and the encounter is created.
session_id (string): Unique session identifier for this copilot stream
encounter_entity_id (string, UUID): World model entity ID for the encounter. Use this to query encounter state via the Data & World Model API.
transcript
A finalized transcript segment.
text (string): Transcribed speech
is_final (boolean): Always true for finalized transcripts
timestamp (float): Unix timestamp
speaker (string): Speaker attribution: "clinician" or "patient"
interim_transcript
Partial transcript while the speaker is still talking. Useful for showing live text before a final transcript segment is emitted.
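One way a UI could combine the two transcript events is sketched below: interim text is overwritten on each update, and a finalized segment replaces it. The fields of interim_transcript are not listed in this section, so the sketch assumes it carries a text field like transcript does. TranscriptView is a hypothetical name.

```python
class TranscriptView:
    """Keeps finalized segments plus the latest interim text for display."""

    def __init__(self):
        self.final_segments = []  # list of (speaker, text) tuples
        self.interim_text = ""

    def handle(self, event):
        kind = event.get("type")
        if kind == "interim_transcript":
            # Assumption: interim events expose their text under "text".
            self.interim_text = event.get("text", "")
        elif kind == "transcript":
            speaker = event.get("speaker", "unknown")
            self.final_segments.append((speaker, event["text"]))
            self.interim_text = ""  # the final segment supersedes the interim

    def render(self):
        lines = [f"{speaker}: {text}" for speaker, text in self.final_segments]
        if self.interim_text:
            lines.append(f"... {self.interim_text}")
        return "\n".join(lines)
```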
scribe_result
Clinical documentation results after processing a transcript segment.
soap_updates (array): SOAP note updates. Each has section (subjective, objective, assessment, plan), content, and append (true to add, false to replace).
icd10_suggestions (array): ICD-10 code suggestions. Each entry has a codes array whose items carry code, description, and confidence (0-1).
alerts (array): Clinical safety alerts. Each has severity (info, warning, critical), title, and message.
entities_discovered (array): Clinical entities extracted. Each has label, type (Symptom, Diagnosis, Medication, Allergy, Procedure, Lab, Vital), and optional status (confirmed, flagged, conflict).
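The append and replace semantics of soap_updates can be illustrated with a small client-side reducer. This is a sketch under assumptions: appended content is joined with a newline (the section does not say how the platform joins it), and updates for unrecognized sections are skipped. The name apply_soap_updates is hypothetical.

```python
SECTIONS = ("subjective", "objective", "assessment", "plan")


def apply_soap_updates(note, soap_updates):
    """Return a new note dict with the updates applied in order.

    Assumption: append=true joins new content to existing text with a newline.
    """
    updated = {section: note.get(section, "") for section in SECTIONS}
    for update in soap_updates:
        section = update["section"]
        if section not in updated:
            continue  # ignore sections this client does not know about
        if update.get("append") and updated[section]:
            updated[section] = updated[section] + "\n" + update["content"]
        else:
            # append=false replaces; appending to an empty section also lands here
            updated[section] = update["content"]
    return updated
```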
warning
Non-fatal issue that the client should surface to the user.
error
An error occurred during processing. The session may continue after non-fatal errors.
audio_processing_failed (fatal): Audio could not be processed. Session will end.
analysis_timeout (non-fatal): Documentation analysis timed out for this transcript segment. Next segment will retry.
processing_error (non-fatal): Unexpected error during analysis. Session continues.
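A client could use the fatal and non-fatal distinction to decide whether to tear down its UI, as in the sketch below. The name of the field that carries the error code is not given in this section; "code" is an assumption, exposed as a parameter so it can be changed. Treating unlisted codes as non-fatal is also a choice of this sketch.

```python
FATAL_ERROR_CODES = {"audio_processing_failed"}
NON_FATAL_ERROR_CODES = {"analysis_timeout", "processing_error"}


def should_end_session(event, code_field="code"):
    """Return True when an error event means the session will end.

    Assumption: the error code is carried in event[code_field]; the field
    name "code" is a guess, not taken from this section.
    """
    if event.get("type") != "error":
        return False
    return event.get(code_field) in FATAL_ERROR_CODES
```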
heartbeat
Lightweight application-level keepalive emitted during long sessions.
Clients can ignore heartbeats, but they are useful for detecting stalled connections in browser UIs and monitoring.
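Stall detection based on heartbeats might look like the following sketch. The heartbeat interval is not stated here, so the timeout is left to the caller. StallDetector is a hypothetical name, and the clock is injectable so the logic can be exercised without waiting.

```python
import time


class StallDetector:
    """Flags a connection as stalled when no message arrives within a timeout."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._last_seen = clock()

    def on_message(self):
        # Call for every server message, heartbeats included.
        self._last_seen = self._clock()

    def is_stalled(self):
        return self._clock() - self._last_seen > self._timeout
```

Calling on_message for every server event, not only heartbeats, avoids false positives while transcripts are flowing.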
session_end
Sent when the session ends (client sent stop, or connection closed).
Structured Outputs
The copilot emits structured documentation updates based on each finalized transcript segment. Workspace settings control which documentation and decision-support capabilities are active.
SOAP updates: Updates Subjective, Objective, Assessment, and Plan sections
ICD-10 suggestions: Suggests diagnosis codes with confidence scores
Clinical alerts: Surfaces safety and decision-support alerts
Clinical entities: Extracts medications, symptoms, diagnoses, vitals, procedures, and related findings
Post-encounter documentation: Supports note polishing, order preparation, and final review workflows when enabled
Encounter Entity
Each copilot session creates an encounter entity in the world model. The encounter accumulates all clinical intelligence from the session:
SOAP notes (incremental updates, then polished prose)
ICD-10 codes (suggested, approved, rejected)
Clinical alerts and safety flags
Clinical entities (medications, symptoms, diagnoses, vitals, procedures)
Encounter metadata (provider, patient, timestamps, duration)
Query the encounter entity via the Data & World Model API using the encounter_entity_id from the session_start event.
Patient Context
When patient_entity_id is provided, the platform loads the patient's full context from the world model before starting analysis:
Demographics, active conditions, current medications, allergies
Recent lab results, insurance details, encounter history
Preferred language (used to optimize transcription accuracy)
Patient context enables cross-referencing: mentioned medications are checked against the allergy list, new prescriptions against the current medication set, and care gaps are surfaced based on clinical history.
Without a patient entity, the copilot still transcribes and generates SOAP notes, but clinical cross-referencing and safety alerts are limited to what appears in the current encounter.
Reconnection
If the WebSocket disconnects unexpectedly, the encounter entity persists in the world model. To continue documenting after a disconnect:
Open a new WebSocket connection with the same workspace_id and service_id
The previous encounter's data is preserved; query it via the encounter entity ID
A new session creates a new encounter entity
The platform does not support resuming an in-progress session on the same encounter. Each connection creates a fresh encounter. For long encounters, treat the encounter entity as the durable state and the WebSocket as a transient audio transport.
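Because each connection creates a new encounter entity, a client that reconnects during one visit may want to remember every encounter ID it has seen so that all of them can be queried later. The sketch below is one hypothetical way to do that; EncounterLog is not a platform concept.

```python
class EncounterLog:
    """Remembers the encounter entity IDs created across reconnections."""

    def __init__(self):
        self.encounter_ids = []

    def on_session_start(self, event):
        # encounter_entity_id comes from the session_start event.
        encounter_id = event["encounter_entity_id"]
        if encounter_id not in self.encounter_ids:
            self.encounter_ids.append(encounter_id)

    def latest(self):
        """Most recent encounter ID, or None before the first session_start."""
        return self.encounter_ids[-1] if self.encounter_ids else None
```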
Example: Minimal Client
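The following is a minimal, transport-agnostic sketch of the message flow described on this page, not an official client. It assumes a hypothetical transport object with send(data) and recv() methods, which a real WebSocket library would supply, and it assumes control messages carry their name in a type field. For brevity it sends all audio before reading events; a real-time client would read and write concurrently.

```python
import json


def run_session(transport, audio_frames, sample_rate=16000):
    """Stream PCM16 frames, send stop, and collect events until session_end.

    transport is hypothetical: send(data) takes bytes or str, and recv()
    returns a JSON string, or None once the connection has closed.
    """
    events = []
    if sample_rate != 16000:
        # Announce a non-default rate before any audio is sent.
        transport.send(json.dumps({"type": "audio_config", "sample_rate": sample_rate}))
    for frame in audio_frames:
        transport.send(frame)  # bytes are sent as binary audio frames
    transport.send(json.dumps({"type": "stop"}))
    while True:
        raw = transport.recv()
        if raw is None:
            break
        event = json.loads(raw)
        events.append(event)
        if event.get("type") == "session_end":
            break
    return events


class FakeTransport:
    """In-memory stand-in for exercising run_session without a network."""

    def __init__(self, scripted_events):
        self.sent = []
        self._incoming = [json.dumps(e) for e in scripted_events]

    def send(self, data):
        self.sent.append(data)

    def recv(self):
        return self._incoming.pop(0) if self._incoming else None
```

Running run_session against FakeTransport with a scripted session_start, transcript, and session_end returns those three events in order, which makes the control flow easy to check before wiring in a real connection.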
Permissions
The WebSocket accepts a workspace-scoped API key or identity JWT. The token must be valid for the workspace_id in the connection URL. Patient context loading and encounter writes happen server-side under the workspace authorization attached to the token.
Use the workspace's configured clinical copilot service ID when opening a stream.
API Reference
Clinical Copilot (conceptual docs)
Data & World Model (encounter entity queries)
Scribe (recording transcription and physician review)

