Clinical Copilot

Real-time clinical copilot WebSocket for streaming encounter documentation, SOAP notes, ICD-10 coding, and clinical alerts from browser audio.

The clinical copilot WebSocket streams browser microphone audio to the platform, which transcribes it in real time, updates encounter documentation, and pushes structured results back over the same connection. The result is a live encounter documentation system: SOAP notes build incrementally, ICD-10 codes are suggested as diagnoses emerge, and safety alerts surface during the encounter.

You can connect directly to the WebSocket for a custom clinical documentation frontend. Browser applications should create sessions through their own backend, pass only the needed connection values to the browser, then stream raw PCM16 audio.

Conceptual overview. For how clinical documentation fits into the platform's channel architecture, encounter lifecycle, and post-encounter automation, see Clinical Copilot in the conceptual docs.

Architecture

The clinical copilot is a documentation stream, not a conversational voice agent. It observes encounter audio and emits structured documentation events; it does not generate spoken responses to the patient.

The flow for each spoken utterance is:

  1. Browser sends PCM16 audio over WebSocket

  2. Platform emits interim and finalized transcript events

  3. Platform updates structured encounter documentation

  4. Structured results are pushed back to the client as JSON messages

  5. Encounter state remains queryable through the Data & World Model API

Speaker attribution can identify clinician and patient transcript segments without manual tagging. Speaker identity helps interpret clinical content:

  • Clinician (prescriber): medication mentions are treated as prescription intent, triggering allergy and interaction checks

  • Patient (reporter): medication mentions are treated as current medication reports, updating the medication list

  • Staff (observer): measurements and observations are documented as objective data

Connection

Current endpoint:

| Parameter | Required | Description |
| --- | --- | --- |
| workspace_id | Yes | Workspace UUID |
| service_id | Yes | Service ID for the workspace's configured clinical copilot service |
| patient_entity_id | No | World model entity ID for the patient. When provided, loads patient context and enables clinical cross-referencing. |
| practitioner_entity_id | No | Entity ID for the practitioner conducting the encounter. |
| token | Yes | JWT or API key. Can also be passed via the Sec-WebSocket-Protocol subprotocol header as auth, <token>. |
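As an illustration of how the parameters above combine, here is a minimal sketch that assembles a connection URL on the backend. The endpoint is not reproduced on this page, so `base_url` is a caller-supplied placeholder, and the function name `build_copilot_url` is hypothetical. The token is optional here because it can be sent through the subprotocol header instead.

```python
from typing import Optional
from urllib.parse import urlencode


def build_copilot_url(
    base_url: str,  # assumption: the documented endpoint, supplied by the caller
    workspace_id: str,
    service_id: str,
    token: Optional[str] = None,
    patient_entity_id: Optional[str] = None,
    practitioner_entity_id: Optional[str] = None,
) -> str:
    """Assemble a WebSocket URL from the documented query parameters."""
    # Required parameters always appear in the query string.
    params = {"workspace_id": workspace_id, "service_id": service_id}
    # Optional parameters are added only when a value is provided.
    if patient_entity_id:
        params["patient_entity_id"] = patient_entity_id
    if practitioner_entity_id:
        params["practitioner_entity_id"] = practitioner_entity_id
    if token:
        params["token"] = token
    return f"{base_url}?{urlencode(params)}"
```

Using `urlencode` keeps IDs and tokens correctly escaped, which matters if a token contains characters that are not URL-safe.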

Browser Integration Pattern

For browser-based clinical documentation, keep workspace credentials and service discovery on your backend. The backend should validate the signed-in clinician, choose the workspace's clinical copilot service, and return only the short-lived WebSocket URL or connection values needed by the browser.

Documentation behavior and medical vocabulary are not sent as WebSocket query parameters. Workspace-level behavior is configured through Copilot Settings; runtime vocabulary is resolved from the configured workspace and encounter context.

Authentication

Two auth methods are supported:

  1. Query parameter: pass the token as ?token=<jwt_or_api_key>

  2. Subprotocol header: set Sec-WebSocket-Protocol: auth, <token> in the WebSocket handshake

The token must be scoped to the specified workspace_id. Workspace API keys and identity JWTs are both accepted.
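The two methods can be sketched as a pair of helpers that each return a URL plus the list of subprotocols to offer during the handshake. The helper names are hypothetical. Most WebSocket clients, including the browser `WebSocket` constructor, accept a list of subprotocols and serialize it into the `Sec-WebSocket-Protocol` header.

```python
from typing import List, Tuple
from urllib.parse import urlencode


def auth_via_query(url: str, token: str) -> Tuple[str, List[str]]:
    """Method 1: append the token as a query parameter."""
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}{urlencode({'token': token})}", []


def auth_via_subprotocol(url: str, token: str) -> Tuple[str, List[str]]:
    """Method 2: leave the URL untouched and offer two subprotocols.

    Offering "auth" and the token makes the client send
    "Sec-WebSocket-Protocol: auth, <token>" in the handshake.
    """
    return url, ["auth", token]
```

One reason to prefer the subprotocol form is that URLs, including their query strings, are commonly recorded in proxy and server logs.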

Close Codes

| Code | Reason |
| --- | --- |
| 4000 | Missing workspace_id or service_id, or invalid workspace_id format |
| 4001 | Missing or invalid authentication token |
| 4003 | Token not authorized for the specified workspace |
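A client can map these codes to user-facing messages and decide whether reconnecting makes sense. The sketch below is one possible policy, not platform guidance: it assumes that the documented codes indicate a problem with the request itself, so retrying with the same URL and token would fail again, while any other code is treated as possibly transient. The function names are hypothetical.

```python
# Reasons copied from the close code table.
CLOSE_REASONS = {
    4000: "Missing workspace_id or service_id, or invalid workspace_id format",
    4001: "Missing or invalid authentication token",
    4003: "Token not authorized for the specified workspace",
}


def describe_close(code: int) -> str:
    """Return the documented reason, or a generic message for other codes."""
    return CLOSE_REASONS.get(code, f"Connection closed with code {code}")


def should_retry_unchanged(code: int) -> bool:
    """Assumption: documented codes need a corrected request, not a retry."""
    return code not in CLOSE_REASONS
```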

Message Flow

Client Messages

The client sends two types of messages: binary audio frames and JSON control messages.

Audio Frames

Send raw PCM16 mono audio as binary WebSocket frames. The default sample rate is 16 kHz. To use a different sample rate, send an audio_config message before the first audio frame.

Browser clients typically capture microphone audio through an AudioWorklet, emit short PCM16 frames, and send an audio_config message if the browser audio context is not running at 16 kHz.
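For testing from a script rather than a browser, the same audio format can be produced with the standard library. This sketch converts float samples to PCM16 and splits the result into short frames. Two details are assumptions, since the page does not specify them: little-endian byte order and a 20 ms frame length.

```python
import struct
from typing import Iterator, Sequence


def float_to_pcm16(samples: Sequence[float]) -> bytes:
    """Convert samples in [-1.0, 1.0] to signed 16-bit PCM.

    Out-of-range samples are clamped. Little-endian is an assumption.
    """
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


def frames(pcm: bytes, sample_rate: int = 16000, frame_ms: int = 20) -> Iterator[bytes]:
    """Split mono PCM16 bytes into frames of frame_ms milliseconds."""
    frame_bytes = sample_rate * frame_ms // 1000 * 2  # 2 bytes per sample
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]
```

At 16 kHz, a 20 ms frame is 320 samples, or 640 bytes. The final frame may be shorter than the rest.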

Control Messages (JSON)

| Type | Fields | Description |
| --- | --- | --- |
| audio_config | sample_rate (integer) | Set the input audio sample rate. Send before streaming audio if not using 16 kHz. |
| mute | - | Pause audio processing. Audio frames are still accepted but not transcribed. |
| unmute | - | Resume audio processing. |
| stop | - | End the session gracefully. Triggers cleanup and a session_end event. |
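The control messages in the table can be serialized with small helpers. This sketch assumes that client messages carry the message name in a `type` field, mirroring the server events; the table lists the message types and their fields but does not show a full payload, so treat the exact shape as unconfirmed.

```python
import json

# Control messages that take no fields, per the table above.
FIELDLESS_TYPES = ("mute", "unmute", "stop")


def audio_config(sample_rate: int) -> str:
    """Serialize an audio_config message (assumed shape: type + sample_rate)."""
    return json.dumps({"type": "audio_config", "sample_rate": sample_rate})


def control(kind: str) -> str:
    """Serialize a mute, unmute, or stop message."""
    if kind not in FIELDLESS_TYPES:
        raise ValueError(f"Unknown control message: {kind}")
    return json.dumps({"type": kind})
```

These strings would be sent as text frames, while audio goes out as binary frames on the same connection.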

Server Events

All server messages are JSON objects with a type field.

session_start

Sent immediately after connection is established and the encounter is created.

| Field | Type | Description |
| --- | --- | --- |
| session_id | string | Unique session identifier for this copilot stream |
| encounter_entity_id | string (UUID) | World model entity ID for the encounter. Use this to query encounter state via the Data & World Model API. |

transcript

A finalized transcript segment.

| Field | Type | Description |
| --- | --- | --- |
| text | string | Transcribed speech |
| is_final | boolean | Always true for finalized transcripts |
| timestamp | float | Unix timestamp |
| speaker | string | Speaker attribution: "clinician" or "patient" |

interim_transcript

Partial transcript while the speaker is still talking. Useful for showing live text before a final transcript segment is emitted.

scribe_result

Clinical documentation results after processing a transcript segment.

| Field | Type | Description |
| --- | --- | --- |
| soap_updates | array | SOAP note updates. Each has section (subjective, objective, assessment, plan), content, and append (true to add, false to replace). |
| icd10_suggestions | array | ICD-10 code suggestions. Each entry has a codes array whose items have code, description, and confidence (0-1). |
| alerts | array | Clinical safety alerts. Each has severity (info, warning, critical), title, and message. |
| entities_discovered | array | Clinical entities extracted. Each has label, type (Symptom, Diagnosis, Medication, Allergy, Procedure, Lab, Vital), and optional status (confirmed, flagged, conflict). |
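A client that renders a live note needs to fold each scribe_result into local state. The sketch below applies soap_updates with the documented append/replace semantics and filters ICD-10 suggestions by confidence. Several choices are assumptions: appended content is joined with a newline, a missing `append` flag is treated as true, and the 0.7 threshold is arbitrary.

```python
from typing import Any, Dict, List

SECTIONS = ("subjective", "objective", "assessment", "plan")


def apply_soap_updates(note: Dict[str, str], updates: List[Dict[str, Any]]) -> Dict[str, str]:
    """Fold soap_updates into a local note keyed by section name."""
    for update in updates:
        section = update["section"]
        if section not in SECTIONS:
            continue  # ignore sections this client does not know about
        if update.get("append", True) and note.get(section):
            # append=true: add to the existing text (newline join is an assumption)
            note[section] = f"{note[section]}\n{update['content']}"
        else:
            # append=false, or nothing to append to: set the section text
            note[section] = update["content"]
    return note


def confident_codes(suggestions: List[Dict[str, Any]], threshold: float = 0.7) -> List[Dict[str, Any]]:
    """Flatten icd10_suggestions and keep codes at or above the threshold."""
    return [
        code
        for suggestion in suggestions
        for code in suggestion.get("codes", [])
        if code.get("confidence", 0.0) >= threshold
    ]
```

Keeping the note as a plain dictionary makes it straightforward to re-render a single section when an update arrives.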

warning

Non-fatal issue that the client should surface to the user.

error

An error occurred during processing. The session may continue after non-fatal errors.

| Code | Severity | Description |
| --- | --- | --- |
| audio_processing_failed | Fatal | Audio could not be processed. Session will end. |
| analysis_timeout | Non-fatal | Documentation analysis timed out for this transcript segment. Next segment will retry. |
| processing_error | Non-fatal | Unexpected error during analysis. Session continues. |
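A client can use the severity column to decide whether to keep streaming after an error event. This sketch assumes the error code arrives in a `code` field, which the page does not confirm, and it conservatively treats any unrecognized code as fatal.

```python
from typing import Any, Dict

# Codes listed as non-fatal in the table above.
NON_FATAL_CODES = {"analysis_timeout", "processing_error"}


def session_continues(error_event: Dict[str, Any]) -> bool:
    """Return True when the session is expected to continue after this error.

    Assumptions: the code is carried in a "code" field, and unknown or
    missing codes are treated as fatal.
    """
    return error_event.get("code") in NON_FATAL_CODES
```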

heartbeat

Lightweight application-level keepalive emitted during long sessions.

Clients can ignore heartbeats, but they are useful for detecting stalled connections in browser UIs and monitoring.
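One way to detect a stalled connection is to record the time of the last received message and compare it against a timeout. The heartbeat interval is not stated on this page, so the 30-second default below is a placeholder assumption, and the class name is hypothetical. The clock is injectable so the logic can be tested without waiting.

```python
import time
from typing import Callable


class StallDetector:
    """Flags a connection as stalled when no message arrives within timeout_s."""

    def __init__(self, timeout_s: float = 30.0, clock: Callable[[], float] = time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_seen = clock()

    def mark_activity(self) -> None:
        """Call on every server message, including heartbeat events."""
        self._last_seen = self._clock()

    def is_stalled(self) -> bool:
        """True when the time since the last message exceeds the timeout."""
        return self._clock() - self._last_seen > self._timeout_s
```

A UI could poll `is_stalled()` on a timer and show a reconnect prompt when it returns true.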

session_end

Sent when the session ends (client sent stop, or connection closed).
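Because every server message is a JSON object with a type field, a client can route events through a small dispatch table. The following is a minimal sketch; the `dispatch` function and the handler mapping are hypothetical names, and unknown event types are ignored rather than treated as errors.

```python
import json
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], None]


def dispatch(raw: str, handlers: Dict[str, Handler]) -> str:
    """Parse one server message and call the handler registered for its type.

    Returns the event type so callers can react to session_end.
    """
    event = json.loads(raw)
    kind = event.get("type", "")
    handler = handlers.get(kind)
    if handler is not None:
        handler(event)
    return kind
```

For example, a client might register handlers for transcript, interim_transcript, and scribe_result, and stop reading when `dispatch` returns "session_end".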

Structured Outputs

The copilot emits structured documentation updates based on each finalized transcript segment. Workspace settings control which documentation and decision-support capabilities are active.

| Output | What it does |
| --- | --- |
| SOAP updates | Updates Subjective, Objective, Assessment, and Plan sections |
| ICD-10 suggestions | Suggests diagnosis codes with confidence scores |
| Clinical alerts | Surfaces safety and decision-support alerts |
| Clinical entities | Extracts medications, symptoms, diagnoses, vitals, procedures, and related findings |
| Post-encounter documentation | Supports note polishing, order preparation, and final review workflows when enabled |

Encounter Entity

Each copilot session creates an encounter entity in the world model. The encounter accumulates all clinical intelligence from the session:

  • SOAP notes (incremental updates, then polished prose)

  • ICD-10 codes (suggested, approved, rejected)

  • Clinical alerts and safety flags

  • Clinical entities (medications, symptoms, diagnoses, vitals, procedures)

  • Encounter metadata (provider, patient, timestamps, duration)

Query the encounter entity via the Data & World Model API using the encounter_entity_id from the session_start event.

Patient Context

When patient_entity_id is provided, the platform loads the patient's full context from the world model before starting analysis:

  • Demographics, active conditions, current medications, allergies

  • Recent lab results, insurance details, encounter history

  • Preferred language (used to optimize transcription accuracy)

Patient context enables cross-referencing: mentioned medications are checked against the allergy list, new prescriptions against the current medication set, and care gaps are surfaced based on clinical history.

Without a patient entity, the copilot still transcribes and generates SOAP notes, but clinical cross-referencing and safety alerts are limited to what appears in the current encounter.

Reconnection

If the WebSocket disconnects unexpectedly, the encounter entity persists in the world model. To continue after a disconnect:

  1. Open a new WebSocket connection with the same workspace_id and service_id

  2. Query the previous encounter's data, which is preserved, via its encounter entity ID

  3. Expect a new encounter entity, because each new session creates one

The platform does not support resuming an in-progress session on the same encounter. Each connection creates a fresh encounter. For long encounters, treat the encounter entity as the durable state and the WebSocket as a transient audio transport.
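Since each connection produces its own encounter entity, a client that reconnects during one visit ends up with several encounter IDs. A minimal sketch of tracking them from session_start events follows; the `EncounterLog` class is a hypothetical client-side helper, and how the resulting encounters are reconciled afterwards is left to the application.

```python
from typing import Any, Dict, List, Optional


class EncounterLog:
    """Collects the encounter_entity_id from each session_start event."""

    def __init__(self) -> None:
        self.encounter_ids: List[str] = []

    def on_session_start(self, event: Dict[str, Any]) -> None:
        """Record the encounter created by a new connection."""
        encounter_id = event["encounter_entity_id"]
        if encounter_id not in self.encounter_ids:
            self.encounter_ids.append(encounter_id)

    @property
    def current(self) -> Optional[str]:
        """The encounter for the most recent connection, if any."""
        return self.encounter_ids[-1] if self.encounter_ids else None
```

Earlier IDs in the list remain queryable through the Data & World Model API, so nothing documented before the disconnect is lost.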

Example: Minimal Client
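The following is a hedged sketch of a script-based client, not the platform's reference example. It takes an already-open connection object `ws` that supports `await ws.send(...)` and `async for message in ws`, as connections from the third-party `websockets` package do. The function name `run_session`, the `type` field on client messages, and the 20 ms pacing are assumptions.

```python
import asyncio
import json
from typing import Any, Dict, Iterable, List


async def run_session(
    ws: Any,
    pcm_frames: Iterable[bytes],
    sample_rate: int = 16000,
    frame_delay_s: float = 0.02,  # assumption: pace 20 ms frames in real time
) -> List[Dict[str, Any]]:
    """Stream PCM16 frames, then stop, collecting server events until session_end."""
    if sample_rate != 16000:
        # Announce a non-default sample rate before any audio is sent.
        await ws.send(json.dumps({"type": "audio_config", "sample_rate": sample_rate}))

    async def send_audio() -> None:
        for frame in pcm_frames:
            await ws.send(frame)  # bytes go out as binary frames
            await asyncio.sleep(frame_delay_s)
        await ws.send(json.dumps({"type": "stop"}))  # request a graceful end

    events: List[Dict[str, Any]] = []
    sender = asyncio.create_task(send_audio())
    try:
        async for message in ws:
            event = json.loads(message)
            events.append(event)
            if event.get("type") == "session_end":
                break
    finally:
        sender.cancel()  # no effect if the sender already finished
    return events


# Usage sketch, assuming the `websockets` package and a URL from your backend:
#     async with websockets.connect(url) as ws:
#         events = await run_session(ws, frames)
```

Sending and receiving run concurrently, so transcript and scribe_result events arrive while audio is still being streamed.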

Permissions

The WebSocket accepts a workspace-scoped API key or identity JWT. The token must be valid for the workspace_id in the connection URL. Patient context loading and encounter writes happen server-side under the workspace authorization attached to the token.

Use the workspace's configured clinical copilot service ID when opening a stream.

API Reference
