Connectors and EHR
Bidirectional data pipeline that syncs healthcare systems, FHIR stores, CRMs, and other external sources with the world model.
Data Sources Dashboard
The Developer Console Data Sources page provides an operational overview of all external systems feeding data into a workspace. The page opens with a visual pipeline summary showing three stages: source systems (clinical and operational connectors), the normalize-and-map layer (FHIR resources, file uploads, and unification rules), and the world model output that agents and operators consume.
Summary metrics display the number of connected sources, healthy connector count, and last sync event volume. The connected sources table shows individual connector status, sync cadence, and recent event throughput.
Facility Location Mapping
When location, facility, or place entities are synced from clinical systems, the Data Insights dashboard renders their geographic distribution on a live map. Entities with direct coordinates (latitude and longitude) are plotted immediately. Entities that carry street addresses but no coordinates are geocoded on demand. This gives operations teams a visual picture of facility coverage without requiring manual coordinate entry.
Integration
The connector system is the bidirectional data pipeline between external systems and the workspace's world model. It polls EHR platforms, FHIR stores, CRMs, and other sources for new data, resolves entities across systems, reviews data quality, and syncs verified changes back to every configured destination.
The connector system has no direct API. It operates as a background service. Data sources and sync configuration are managed through the Platform API's Data Sources and Connector Settings endpoints. Connector definitions are stored in workspace settings with typed configuration models that enforce valid sync strategies and connection parameters. The data source response includes a computed staleness indicator based on the connector's last successful sync and its configured cadence. The system supports multiple connector types - each producing raw records that pass through a unification engine before entering the world model as events.
How Connectors Work
External healthcare systems are unreliable in ways that are difficult to predict. EHR APIs go down for maintenance without warning. FHIR stores enforce rate limits that change between environments. Some systems only accept inbound writes during business hours. A write that succeeds today may be silently rejected tomorrow.
The connector system uses multiple concurrent background loops because each one backs up the others. If the poll loop misses an update because an external API timed out, a reconciliation loop catches it shortly after. If an outbound write fails, it is retried on the next dispatch cycle. If a real-time notification is lost due to a transient network issue, the reconciliation poll picks up the unsent event and re-queues it.
Inbound data arrives through two mechanisms depending on the source system's capabilities:
Polling - The poll loop checks external systems on a configurable cadence. Incoming data is deduplicated by content hash - if the same record is returned twice, no duplicate event is created. Each data source connection can specify per-resource polling frequencies.
Real-time webhooks - For systems that support FHIR Subscriptions or event notifications, the connector receives push notifications as changes happen. Incoming webhooks are verified, deduplicated, and the full resource is fetched and mapped before writing to the world model.
Both paths feed into the same entity resolution and enrichment pipeline. The poll loop supports system-specific adapters that handle the quirks of individual external systems, including business-hour gating, alternating between full and incremental sync cycles, reference data caching, and rate limit management.
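A minimal sketch of content-hash deduplication, assuming records are JSON-serializable dictionaries and that SHA-256 over a key-sorted JSON encoding is an acceptable hash. The `content_hash` and `Deduplicator` names are illustrative and not part of the platform.

```python
import hashlib
import json


def content_hash(record: dict) -> str:
    """Hash a record's canonical JSON form so key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Deduplicator:
    """Remembers hashes of records that already produced an event."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def is_new(self, record: dict) -> bool:
        digest = content_hash(record)
        if digest in self._seen:
            return False  # same content was already ingested
        self._seen.add(digest)
        return True


dedup = Deduplicator()
first = dedup.is_new({"id": "p1", "name": "Jane Smith"})
repeat = dedup.is_new({"name": "Jane Smith", "id": "p1"})  # same content, different key order
changed = dedup.is_new({"id": "p1", "name": "Jane A. Smith"})
```

Only the first and the changed record would produce events here; the repeated record hashes to the same digest and is skipped.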
Outbound write-back does not send raw event data. It reconstructs the complete current state of each entity by running the projection function across every event from every source, at every confidence level, with conflicts resolved through the world model's standard resolution rules. The outbound payload is this authoritative projected state, translated into whatever format the target system expects. A workspace can write back to multiple external systems simultaneously - the EHR receives patient and appointment data while the CRM receives contact and engagement data. Sync progress is tracked per-sink, so a failure writing to one system does not block writes to others.
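The projection step can be sketched as a fold over an entity's events. The resolution rule used here (highest confidence wins, ties go to the newest event) is an assumption for illustration; the world model's actual rules are not described in this section, and the `Event` shape is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    entity_id: str
    field: str
    value: str
    confidence: float
    timestamp: int  # seconds since epoch
    source: str


def project(events: list[Event], entity_id: str) -> dict[str, str]:
    """Fold every event for one entity into a single current-state dict.

    Assumed rule: highest confidence wins; ties go to the newest event.
    """
    winners: dict[str, Event] = {}
    for ev in events:
        if ev.entity_id != entity_id:
            continue
        best = winners.get(ev.field)
        if best is None or (ev.confidence, ev.timestamp) > (best.confidence, best.timestamp):
            winners[ev.field] = ev
    return {name: ev.value for name, ev in winners.items()}


events = [
    Event("pat-1", "phone", "555-0100", 1.0, 100, "ehr"),
    Event("pat-1", "phone", "555-0199", 0.5, 200, "voice"),
    Event("pat-1", "email", "jane@example.com", 0.9, 150, "crm"),
]
state = project(events, "pat-1")
```

In this example the newer voice-extracted phone number does not displace the EHR value, because its confidence is lower. An outbound payload would be built from `state`, not from the individual events.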
There is also a throughput problem. The agent engine can handle dozens of concurrent interactions, each generating scheduling requests, insurance checks, and record updates. The EHR on the other end might accept a fraction of that volume. The world model absorbs this mismatch: agents write events at the speed of conversation, and the connector system drains the outbound queue at whatever rate the external system can handle. Patients get immediate confirmations. EHRs get the writes when they can process them.
Connector Types
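| Type | Inbound | Outbound | Real-time | Polling | Authentication |
| --- | --- | --- | --- | --- | --- |
| EHR | Yes | Yes | Where supported | Fallback | Per adapter (SMART, OAuth, API key, token exchange) |
| FHIR Store | Yes | Yes | Via Subscriptions | Primary | API key or service credentials |
| SMART FHIR | Yes | Yes | Via Subscriptions | Primary | SMART Backend Services |
| CRM | Yes | Yes | Via webhooks | Primary | OAuth 2.0 |
| REST API | Yes | No | No | Primary | Configurable |
| File Drop | Yes | No | No | Primary | Cloud IAM |
| Webhook | Yes | No | Yes | No | Signature verification |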
EHR connectors use dedicated, EHR-specific adapters. Each adapter handles the target system's authentication, FHIR resource mapping, rate limits, and data format translation. Where the EHR supports real-time event notifications, the adapter receives change events as they happen instead of polling.
FHIR Store connectors connect directly to FHIR R4 stores with typed configuration and per-resource poll cadences. Outbound write-back uses optimistic locking to prevent lost updates.
SMART FHIR connectors use the same FHIR R4 capabilities as FHIR Store connectors, but authenticate via SMART Backend Services - the standard machine-to-machine auth flow for healthcare APIs. Use this type when the EHR system requires SMART Backend Services authentication, as defined in the SMART App Launch specification.
CRM connectors support bidirectional contact and engagement sync with incremental per-object polling cadences. CRM objects (contacts, companies, deals) map to world model entity types, enabling a unified view of patients across clinical and engagement systems.
REST connectors poll HTTP endpoints on a schedule with multiple pagination strategies, configurable authentication, and circuit breaker protection. Content-hash deduplication prevents duplicate events from unchanged data.
File Drop connectors watch a cloud storage location for new files. They parse CSV, NDJSON, FHIR Bundles, and raw JSON - useful for bulk data imports where a partner drops a file on a schedule.
Webhook connectors receive inbound HTTP webhooks from external systems. Events are deduplicated by content hash to handle retries from the sender.
All connector types share common resilience patterns: retries with exponential backoff, circuit breakers that temporarily stop requests to failing sources, dead letter logging for records that cannot be processed, and content-hash deduplication across retries.
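Two of these patterns, exponential backoff and a circuit breaker, can be sketched as follows. The thresholds, delays, and class shape are assumptions chosen for illustration; the platform's actual values are not stated here.

```python
from typing import Optional


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Delays of base, 2*base, 4*base, ... seconds, capped at `cap`."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]


class CircuitBreaker:
    """Stops calls to a failing source until a cooldown has elapsed."""

    def __init__(self, failure_threshold: int = 3, cooldown: float = 60.0) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self, now: float) -> bool:
        if self.opened_at is None:
            return True  # closed: requests flow normally
        # open: block until the cooldown passes, then let a probe through
        return now - self.opened_at >= self.cooldown

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now


delays = backoff_delays(5, base=1.0, cap=10.0)
breaker = CircuitBreaker(failure_threshold=2, cooldown=10.0)
breaker.record_failure(now=0.0)
breaker.record_failure(now=1.0)  # threshold reached, breaker opens
blocked = breaker.allow(now=5.0)
probe = breaker.allow(now=11.0)
```

Time is passed in explicitly so the behavior is easy to test; a real loop would typically read a monotonic clock and add jitter to the delays.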
Unification Engine
The unification engine is not a connector itself. It is the transformation layer that all connectors feed into. Raw records from any connector type are mapped to world model events using configurable rules with dot-path field extraction to pull values from arbitrarily nested source data into the event schema. This architecture decouples the transport layer (how data arrives) from the transformation layer (how data is mapped). Adding a new data source requires connector configuration and mapping rules - no custom integration code.
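Dot-path field extraction can be sketched in a few lines. The rule format below (target field name mapped to a dot path, with numeric segments indexing into lists) is an assumed shape for illustration, not the platform's actual rule schema.

```python
from typing import Any


def extract(record: Any, path: str, default: Any = None) -> Any:
    """Follow a dot-separated path through nested dicts and lists."""
    current = record
    for part in path.split("."):
        if isinstance(current, dict) and part in current:
            current = current[part]
        elif isinstance(current, list) and part.isdigit() and int(part) < len(current):
            current = current[int(part)]
        else:
            return default  # path does not exist in this record
    return current


def apply_rules(record: dict, rules: dict[str, str]) -> dict[str, Any]:
    """Build an event payload by applying each target -> path rule."""
    return {target: extract(record, path) for target, path in rules.items()}


raw = {
    "resourceType": "Patient",
    "name": [{"family": "Smith", "given": ["Jane"]}],
    "telecom": [{"system": "phone", "value": "555-0100"}],
}
rules = {
    "last_name": "name.0.family",
    "first_name": "name.0.given.0",
    "phone": "telecom.0.value",
    "fax": "telecom.1.value",  # absent in this record
}
event = apply_rules(raw, rules)
```

Because the rules are plain data, supporting a new source shape means writing new paths, which is the decoupling the paragraph above describes.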
Data Freshness
How quickly data appears in the world model depends on the connector type and configuration:
Webhook and push-based EHR connectors - Near real-time. Data arrives within seconds of the external system's event.
Polling connectors (REST, FHIR Store, CRM) - Determined by poll interval. Default intervals vary by connector and resource type.
File drop - Depends on when the file is deposited. The connector checks for new files on each poll cycle.
All connectors feed through the same confidence gates before data reaches the world model. Inbound data enters at source-appropriate confidence (1.0 for authoritative system integrations, 0.7 for scraped data) and passes through automated review before becoming available to agents.
EHR Integration
Clinical data flows in both directions: from the EHR into the world model (so the agent has context during interactions), and from the world model back to the EHR (so information captured during conversations reaches the clinical record after verification). The world model sits between the agent and the EHR as a buffer that absorbs throughput mismatches, quality differences, and availability gaps. The agent reads from and writes to the world model's clean projected state. The connector system handles the messy reality of getting data in and out of external systems at whatever rate they can handle.
FHIR R4 and SMART Authentication
Both the fhir_store and smart_fhir connector types share the same underlying FHIR R4 capabilities - the difference is the authentication method. For EHR systems that implement the SMART App Launch specification, the connector supports SMART Backend Services authentication: the standard machine-to-machine auth flow for healthcare APIs.
The platform handles token acquisition, caching, and renewal automatically. Per-resource scoping (e.g., system/*.read, system/Patient.write) controls which resources the connector can access.
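For orientation, the sketch below shows the shape of a SMART Backend Services token request as the public specification describes it: a short-lived signed JWT presented in a `client_credentials` grant. Signing the JWT (RS384 or ES384) is left to a JWT library and is not shown; the client ID, token URL, and helper names are hypothetical, and this is not the platform's internal code.

```python
import uuid
from urllib.parse import parse_qs, urlencode

ASSERTION_TYPE = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def assertion_claims(client_id: str, token_url: str, now: int, lifetime: int = 300) -> dict:
    """Claims for the client assertion JWT; expiry stays within five minutes."""
    if lifetime > 300:
        raise ValueError("assertion lifetime must not exceed five minutes")
    return {
        "iss": client_id,
        "sub": client_id,
        "aud": token_url,
        "exp": now + lifetime,
        "jti": str(uuid.uuid4()),  # unique per assertion to prevent replay
    }


def token_request_body(signed_jwt: str, scopes: list[str]) -> str:
    """Form-encoded body for the POST to the token endpoint."""
    return urlencode({
        "grant_type": "client_credentials",
        "client_assertion_type": ASSERTION_TYPE,
        "client_assertion": signed_jwt,
        "scope": " ".join(scopes),
    })


claims = assertion_claims("my-client-id", "https://ehr.example.com/token", now=1_700_000_000)
body = token_request_body("<signed-jwt>", ["system/*.read", "system/Patient.write"])
fields = parse_qs(body)
```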
Bearer Token Exchange
Some APIs use a non-standard token-exchange flow instead of OAuth or static keys. The API requires a workspace-level secret and one or more dynamic, per-request parameters (such as a user identifier) to mint a short-lived bearer token for each call. This pattern is common in healthcare platforms where each API call must be scoped to a specific end user or tenant, but the token issuance mechanism does not follow the OAuth specification.
The bearer_token_exchange auth type handles this automatically. At configuration time, the operator provides the exchange endpoint URL, a workspace secret, and a mapping of which request parameters feed which exchange-call headers. At runtime, the connector mints a fresh bearer token for each distinct combination of dynamic parameters, caches it until expiry, and uses per-key locking to prevent concurrent token storms when many requests share the same identity. Cached tokens are keyed by a one-way hash of the dynamic parameters so no identifiers are stored in plaintext.
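The caching behavior described above can be sketched as follows, assuming a `mint` callable that performs the exchange call and returns a token with its lifetime in seconds. The `TokenCache` class and its parameters are hypothetical names used only for illustration.

```python
import hashlib
import threading
import time
from typing import Callable


class TokenCache:
    """Caches one bearer token per distinct set of dynamic parameters."""

    def __init__(
        self,
        mint: Callable[[dict], tuple[str, float]],  # returns (token, lifetime_seconds)
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._mint = mint
        self._clock = clock
        self._tokens: dict[str, tuple[str, float]] = {}
        self._locks: dict[str, threading.Lock] = {}
        self._guard = threading.Lock()

    @staticmethod
    def _key(params: dict) -> str:
        # One-way hash, so identifiers are never stored in plaintext.
        canonical = "&".join(f"{k}={params[k]}" for k in sorted(params))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, params: dict) -> str:
        key = self._key(params)
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:  # only one thread mints per identity at a time
            cached = self._tokens.get(key)
            if cached and cached[1] > self._clock():
                return cached[0]
            token, lifetime = self._mint(params)
            self._tokens[key] = (token, self._clock() + lifetime)
            return token


calls: list[str] = []


def fake_mint(params: dict) -> tuple[str, float]:
    calls.append(params["user_id"])
    return f"tok-{len(calls)}", 60.0


cache = TokenCache(fake_mint)
a = cache.get({"user_id": "u1"})
b = cache.get({"user_id": "u1"})  # served from cache, no second mint
c = cache.get({"user_id": "u2"})  # different identity, separate token
```

The per-key lock means concurrent requests for the same identity wait for a single mint, while requests for other identities are not blocked.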
The exchange URL is checked at both configuration time and runtime and is rejected if it points into a private network range, which prevents server-side request forgery (SSRF). Format templates for the secret header are restricted to a single placeholder to prevent injection.
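A check of this kind could look like the sketch below, which uses Python's standard `ipaddress` module. Requiring `https` and the exact set of rejected ranges are assumptions; the resolver is injectable so the same check can run at configuration time and again just before each call.

```python
import ipaddress
import socket
from typing import Callable, Iterable
from urllib.parse import urlparse


def is_public_address(ip: str) -> bool:
    """True only for addresses outside private and special-purpose ranges."""
    addr = ipaddress.ip_address(ip)
    return not (
        addr.is_private or addr.is_loopback or addr.is_link_local
        or addr.is_multicast or addr.is_reserved or addr.is_unspecified
    )


def resolve_host(hostname: str) -> list[str]:
    """Default resolver: every address the hostname currently resolves to."""
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})


def validate_exchange_url(
    url: str, resolve: Callable[[str], Iterable[str]] = resolve_host
) -> None:
    """Raise ValueError if the URL could reach an internal address."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError("exchange URL must be an absolute https URL")
    addresses = list(resolve(parsed.hostname))
    if not addresses:
        raise ValueError("exchange URL host did not resolve")
    for ip in addresses:
        if not is_public_address(ip):
            raise ValueError(f"exchange URL resolves to a non-public address: {ip}")
```

Re-validating at runtime matters because a hostname that resolved to a public address at configuration time can later be pointed at an internal one.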
SMART FHIR data sources can be created and managed entirely through the Developer Console or the Platform API. When you provide a private key during setup, the platform automatically provisions it to secure storage and stores only a reference - the key value is never persisted in the data source configuration or returned in API responses.
For all FHIR-capable systems, the connector provides patient search, resource CRUD with entity cross-referencing, resource history with field-level change tracking, bundle import for bulk data loading, and scoped sync failure investigation. The FHIR API serves a broad set of clinical resource types - including Observation, MedicationStatement, FamilyMemberHistory, and QuestionnaireResponse - with patient-scoped filtering so queries return only resources associated with a specific patient.
Vendor-Specific Adapters
For EHR systems with non-standard FHIR implementations or proprietary authentication requirements, the connector includes purpose-built adapters that handle vendor-specific quirks transparently. These adapters manage authentication flows, non-standard pagination schemes, adaptive rate limiting, and resource type tiering by change frequency - while feeding into the same world model pipeline as all standard FHIR connectors.
For systems with no API at all, browser-tier tools can automate portal interactions. The content-hash deduplication, entity resolution, and confidence gates work the same way regardless of the underlying connection method.
Handling External System Limitations
Real-world EHR integrations face problems that no API specification can solve. Browser-only systems with no API are handled through automated browser sessions that navigate the portal and extract confirmation data while the agent stays on the call. Throughput mismatches are absorbed by the world model - the agent writes intent as events, and the connector drains the outbound queue at whatever rate the external system accepts. Downtime and crashes are handled by the reconciliation loop, which catches unsynced events and retries them. Stale or conflicting data is resolved through the world model's confidence-based resolution. Partial failures in multi-step workflows are tracked per-step so completed steps are not re-executed. Inconsistent data formats across EHR systems are normalized by entity resolution and projection functions into a consistent representation.
Outbound Write-Back
When the connector writes data back to an external system, it reads from the high-quality projected entity state, not from the raw event stream. Individual events may contain errors, low-confidence extractions, or partial information. The projection function filters all of that into a coherent, resolved entity. External systems only see the clean result.
For systems with complex mapping requirements, the connector uses an LLM to translate field values - for example, mapping a patient's description of their insurance to the target system's specific carrier codes. If the LLM is unavailable, a deterministic mapper handles common cases so writes continue during outages.
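The fallback behavior can be sketched as a wrapper that tries the LLM first and drops to a keyword table when it is unavailable. The carrier codes, the `LLMUnavailable` exception, and the function names are made up for illustration; real target systems define their own code sets.

```python
from typing import Callable, Optional


class LLMUnavailable(Exception):
    """Raised by the (hypothetical) LLM client when it cannot be reached."""


# Illustrative keyword -> carrier code table, not real codes.
CARRIER_CODES = {"blue cross": "BCBS", "aetna": "AETNA", "united": "UHC"}


def deterministic_map(description: str) -> Optional[str]:
    """Handle common cases by keyword lookup; None means no confident match."""
    text = description.lower()
    for keyword, code in CARRIER_CODES.items():
        if keyword in text:
            return code
    return None


def map_carrier(description: str, llm_map: Callable[[str], str]) -> Optional[str]:
    """Prefer the LLM translation; fall back so writes continue during outages."""
    try:
        return llm_map(description)
    except LLMUnavailable:
        return deterministic_map(description)


def llm_down(_: str) -> str:
    raise LLMUnavailable("model endpoint timed out")


fallback_code = map_carrier("I have Blue Cross through work", llm_down)
unknown_code = map_carrier("some small regional plan", llm_down)
```

Returning `None` for unmatched input lets the caller hold the write for review instead of sending a guessed code.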
Three-Layer Confidence Gates
Before any data is written back to an EHR or external system, it must pass through verification layers that prevent unverified information from contaminating systems of record.
Source eligibility - Only events from eligible sources are considered. Events originating from external system syncs are excluded to prevent echo loops.
Confidence threshold - Is the event's confidence level high enough for outbound writes? Voice-extracted data (0.5) does not pass by default; it must be upgraded through review.
Automated review - An automated judge evaluates the data against the original transcript or source material for accuracy.
Human review (when required) - For data flagged as uncertain or categories requiring human sign-off, an operator reviews and approves or rejects the write.
When outbound writes have dependencies (for example, creating an appointment requires the patient to exist in the EHR first), the dependency check is confidence-aware. If the dependency entity's confidence is below the outbound threshold, the dependency is treated as failed rather than pending.
Data captured from a phone call never writes directly to the EHR. It always passes through the confidence gates above.
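The gates can be sketched as a sequence of checks that each outbound candidate must clear in order. The threshold value of 0.8, the source names, and the status strings are assumptions for illustration; only the 0.5 voice-extraction confidence comes from the text above.

```python
from dataclasses import dataclass

# Hypothetical names for sources whose events came from external syncs.
EXTERNAL_SYNC_SOURCES = frozenset({"ehr_sync", "crm_sync"})


@dataclass
class OutboundCandidate:
    source: str
    confidence: float
    judge_passed: bool
    needs_human: bool = False
    human_approved: bool = False


def evaluate(candidate: OutboundCandidate, threshold: float = 0.8) -> str:
    """Return the first gate that stops the write, or 'approved'."""
    if candidate.source in EXTERNAL_SYNC_SOURCES:
        return "excluded"        # source eligibility: avoid echo loops
    if candidate.confidence < threshold:
        return "held"            # confidence threshold: must be upgraded first
    if not candidate.judge_passed:
        return "flagged"         # automated review did not confirm the data
    if candidate.needs_human and not candidate.human_approved:
        return "awaiting_human"  # human sign-off still outstanding
    return "approved"


echo = evaluate(OutboundCandidate("ehr_sync", 1.0, True))
raw_voice = evaluate(OutboundCandidate("voice", 0.5, False))
upgraded = evaluate(OutboundCandidate("voice", 0.9, True))
sensitive = evaluate(OutboundCandidate("voice", 0.9, True, needs_human=True))
```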
Reconciliation Safety Net
The outbound path uses a dual-delivery design. A real-time subscriber listens for new verified events and routes them through the handler registry immediately. A separate reconciliation loop periodically scans for events that should have been synced but were missed by the real-time mechanism (due to transient failures, network issues, or timing gaps). Any missed events are re-queued for dispatch. This means no event is permanently lost, even in the face of infrastructure failures. The real-time path handles the common case with low latency. The reconciliation path handles the failure case with guaranteed delivery.
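A minimal in-memory sketch of the dual-delivery idea, assuming a `send` callable that raises `ConnectionError` on transient failure. The class and method names are hypothetical; a real implementation would persist sync state instead of holding it in memory.

```python
from typing import Callable


class OutboundSync:
    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send
        self.synced: set[str] = set()
        self.queue: list[str] = []

    def _try_send(self, event_id: str) -> bool:
        try:
            self._send(event_id)
        except ConnectionError:
            return False
        self.synced.add(event_id)
        return True

    def on_verified_event(self, event_id: str) -> None:
        """Real-time path: attempt delivery as soon as an event is verified."""
        self._try_send(event_id)

    def reconcile(self, verified_ids: list[str]) -> list[str]:
        """Safety net: re-queue verified events that were never synced."""
        missed = [e for e in verified_ids if e not in self.synced and e not in self.queue]
        self.queue.extend(missed)
        return missed

    def drain(self) -> None:
        """Dispatch queued events; anything that fails again stays queued."""
        self.queue = [e for e in self.queue if not self._try_send(e)]


failed_once: set[str] = set()


def flaky_send(event_id: str) -> None:
    if event_id == "e2" and event_id not in failed_once:
        failed_once.add(event_id)
        raise ConnectionError("transient network failure")


sync = OutboundSync(flaky_send)
for event_id in ["e1", "e2", "e3"]:
    sync.on_verified_event(event_id)
after_realtime = set(sync.synced)
missed = sync.reconcile(["e1", "e2", "e3"])
sync.drain()
```

Because the reconciliation pass can re-send an event whose first attempt actually succeeded but was not recorded, this design works best when the receiving side treats writes as idempotent.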
Autonomous Record Creation
When the agent identifies a new patient during an interaction who does not exist in the external system, the connector can autonomously create the record. It assembles the required fields from world model events and submits the creation request to the appropriate adapter, handling field validation, format requirements, and error recovery.
Review Loop
Events that were written at lower confidence levels (voice extraction, agent inference) pass through an automated review pipeline that evaluates accuracy and completeness. The review loop is event-driven - when the world model writer commits a low-confidence event, it publishes a notification that the review loop picks up and processes within seconds. A periodic safety-net poll catches any events missed by the real-time path. The loop deduplicates at the entity level to prevent queue bloat when a single entity generates many events in rapid succession. Events that pass review have their confidence upgraded. Events that fail are flagged for human review through the review queue.
Entity Resolution
When data arrives from different systems, it often refers to the same real-world person, location, or appointment using different identifiers. Entity resolution matches these records together using a two-tier approach.
Tier 1 (deterministic) matches on exact identifiers - canonical FHIR IDs, phone numbers, email addresses, name + date of birth for patients, NPI or name + specialty for practitioners.
Tier 2 (fuzzy) applies when exact matching finds nothing. It uses approximate strategies: token-level name similarity, partial phone matching (last seven digits), and name + zip code combinations. Fuzzy matches require high similarity thresholds to prevent false merges.
For example: a patient might appear as "Jane Smith, DOB 03/15/1982" in the EHR and as a phone number in the voice system. Entity resolution determines these refer to the same person and links their events to a single patient entity. If the EHR record has a slight name variation ("J. Smith") or a different phone format, the fuzzy tier catches the match that exact comparison would miss.
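The two tiers can be sketched as below. The record fields, the Jaccard token similarity, and the 0.85 threshold are assumptions chosen to illustrate the approach; the platform's actual matching strategies and thresholds may differ.

```python
import re
from typing import Optional


def digits(phone: str) -> str:
    return re.sub(r"\D", "", phone)


def name_similarity(a: str, b: str) -> float:
    """Jaccard similarity over lowercase alphabetic name tokens."""
    ta = set(re.findall(r"[a-z]+", a.lower()))
    tb = set(re.findall(r"[a-z]+", b.lower()))
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def match_tier(a: dict, b: dict, threshold: float = 0.85) -> Optional[str]:
    """Return 'deterministic', 'fuzzy', or None for a pair of records."""
    # Tier 1: exact identifiers.
    for key in ("fhir_id", "email"):
        if a.get(key) and a.get(key) == b.get(key):
            return "deterministic"
    pa, pb = digits(a.get("phone", "")), digits(b.get("phone", ""))
    if pa and pa == pb:
        return "deterministic"
    same_name = bool(a.get("name")) and a["name"].lower() == b.get("name", "").lower()
    if same_name and a.get("dob") and a.get("dob") == b.get("dob"):
        return "deterministic"
    # Tier 2: approximate strategies, tried only when tier 1 finds nothing.
    if len(pa) >= 7 and len(pb) >= 7 and pa[-7:] == pb[-7:]:
        return "fuzzy"
    same_zip = bool(a.get("zip")) and a.get("zip") == b.get("zip")
    if same_zip and name_similarity(a.get("name", ""), b.get("name", "")) >= threshold:
        return "fuzzy"
    return None


ehr = {"name": "Jane Smith", "dob": "1982-03-15", "phone": "+1 (415) 555-0100", "zip": "94110"}
voice = {"phone": "555-0100"}
crm = {"name": "Smith, Jane", "zip": "94110"}
other = {"name": "John Smith", "zip": "94110"}

voice_match = match_tier(ehr, voice)
crm_match = match_tier(ehr, crm)
other_match = match_tier(ehr, other)
```

Here the voice record matches on the last seven phone digits, the CRM record on reordered name tokens plus zip code, and "John Smith" is rejected because one shared token out of three falls well below the threshold.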
When a workspace has multiple data sources, entity resolution also performs cross-source merge detection. If a patient entity from the EHR matches a patient entity from the CRM, the system creates a reversible link between them and unifies their projected state. This means a single patient view reflects data from every connected system, not just one.
Outbound Dispatch
The connector system also handles scheduled outbound interactions. When the system needs to contact a patient (appointment reminders, follow-ups, outreach campaigns), a dispatch loop checks the schedule and initiates voice calls or text sessions through the agent engine. Each outbound task carries the patient context, interaction purpose, and priority. The dispatch loop evaluates which tasks are due, checks that the agent engine has capacity, and initiates interactions with the full patient context pre-loaded.
Outbound tasks are stored as entities in the world model. They are created by scheduling rules, follow-up workflows, or manual triggers, and their projections track status, priority, attempt count, retry timing, and call outcome.
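One tick of the dispatch loop can be sketched as a selection over pending tasks. The `OutboundTask` fields, the convention that a lower priority number is more urgent, and the `select_due` helper are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class OutboundTask:
    task_id: str
    due_at: int       # seconds since epoch
    priority: int     # assumed convention: lower number = more urgent
    status: str = "pending"


def select_due(tasks: list[OutboundTask], now: int, capacity: int) -> list[OutboundTask]:
    """Pick due, pending tasks in priority order, limited by free agent capacity."""
    due = [t for t in tasks if t.status == "pending" and t.due_at <= now]
    due.sort(key=lambda t: (t.priority, t.due_at))
    return due[: max(capacity, 0)]


tasks = [
    OutboundTask("reminder-1", due_at=100, priority=2),
    OutboundTask("followup-1", due_at=90, priority=1),
    OutboundTask("campaign-1", due_at=500, priority=1),   # not due yet
    OutboundTask("reminder-2", due_at=50, priority=2, status="completed"),
    OutboundTask("reminder-3", due_at=80, priority=2),
]
selected = [t.task_id for t in select_due(tasks, now=200, capacity=2)]
```

Tasks that are due but exceed the available capacity simply remain pending and are considered again on the next tick.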
Gap Scanner
The gap scanner proactively identifies missing data across entities and creates surfaces to collect it. It checks entity state against configurable requirements (for example, "patients with upcoming appointments must have insurance information") and generates data collection forms for any gaps found.
Appointment detection uses a batch query that joins entity relationships with appointment events at scan time, running once per workspace per tick rather than per entity. This approach reads current event state directly, so appointment updates (cancellations, rescheduling) are reflected immediately without waiting for a stale projection to catch up.
Each entity is scanned at most once per cooldown period (default 7 days) to prevent notification fatigue. The scanner is disabled by default and configured per workspace through the Platform API. See Surfaces - Automated Gap Detection for details.
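A rough sketch of one scanner pass with the cooldown applied, assuming entities are plain dictionaries with a `has_upcoming_appointment` flag and that requirements are a list of field names. These shapes are hypothetical; only the 7-day default cooldown comes from the text.

```python
COOLDOWN_SECONDS = 7 * 24 * 60 * 60  # default cooldown: 7 days


def missing_fields(entity: dict, required: list[str]) -> list[str]:
    return [name for name in required if not entity.get(name)]


def scan(
    entities: dict[str, dict],
    required: list[str],
    last_scanned: dict[str, int],
    now: int,
) -> dict[str, list[str]]:
    """Return gaps per entity, skipping entities still inside their cooldown."""
    gaps: dict[str, list[str]] = {}
    for entity_id, entity in entities.items():
        if not entity.get("has_upcoming_appointment"):
            continue  # the requirement only applies to upcoming appointments
        last = last_scanned.get(entity_id)
        if last is not None and now - last < COOLDOWN_SECONDS:
            continue  # scanned recently: avoid notification fatigue
        last_scanned[entity_id] = now
        missing = missing_fields(entity, required)
        if missing:
            gaps[entity_id] = missing
    return gaps


entities = {
    "pat-1": {"has_upcoming_appointment": True, "insurance": None},
    "pat-2": {"has_upcoming_appointment": True, "insurance": "on file"},
    "pat-3": {"has_upcoming_appointment": False},
}
last_scanned: dict[str, int] = {}
first_pass = scan(entities, ["insurance"], last_scanned, now=0)
second_pass = scan(entities, ["insurance"], last_scanned, now=3600)
third_pass = scan(entities, ["insurance"], last_scanned, now=COOLDOWN_SECONDS)
```

Each entry in the result would correspond to a data collection form to generate for that entity.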
Pipeline Observability
The connector system tracks operational health across all background loops and data source connections. Operations teams get real-time visibility into pipeline status without waiting for sync failures to surface.
Overall status - Whether the pipeline is healthy, degraded, or starting up.
Per-source poll health - Last poll time, duration, event count, errors, and connection status for each data source.
Loop states - Current state of each background loop (entity resolution, review, outbound, reconciliation).
Connection health - Consecutive error tracking per source. A source is marked unhealthy after repeated failures and recovers automatically on success.
Data Source Freshness
Each data source connection reports a freshness category based on how recently it last ingested data:
Fresh - Data arrived within the last five minutes. The source is actively producing events.
Stale - Last ingestion was between five and sixty minutes ago. The source may be experiencing delays.
Quiet - No data in over an hour. The source may be down, or there may simply be no new records.
Never - No data has ever been ingested from this source.
Freshness is computed from the timestamp of the most recent ingested event, not from the connector's poll schedule. A source that polls every 15 minutes but has no new data reports its freshness based on when the last actual event arrived, not when the last poll ran. This means freshness accurately reflects whether data is flowing, not just whether the connector is running.
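The categories map directly onto a small classification function. How the exact five-minute and sixty-minute boundaries are treated is an assumption here, and the `freshness` function name is illustrative.

```python
from typing import Optional


def freshness(last_event_at: Optional[float], now: float) -> str:
    """Classify a source by the age of its most recent ingested event.

    Timestamps are in seconds. Events exactly 5 or 60 minutes old are
    placed in the fresher category, which is an assumed convention.
    """
    if last_event_at is None:
        return "never"
    age_minutes = (now - last_event_at) / 60
    if age_minutes <= 5:
        return "fresh"
    if age_minutes <= 60:
        return "stale"
    return "quiet"


now = 10_000.0
statuses = {
    "ehr": freshness(now - 120, now),          # 2 minutes old
    "crm": freshness(now - 20 * 60, now),      # 20 minutes old
    "file_drop": freshness(now - 3 * 3600, now),  # 3 hours old
    "new_source": freshness(None, now),
}
```

The input is the last event timestamp, not the last poll time, so a connector that polls successfully but finds nothing new will still drift from fresh toward quiet.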
Per-source ingestion rates (events per minute, per hour, and over the last 24 hours) give operations teams a quantitative view of throughput alongside the categorical freshness indicator.
Sensing-to-Action Latency
The platform measures the end-to-end time from when a data change is detected in a source system to when the agent acts on it. This is the latency a patient experiences between, say, a lab result being posted in the EHR and the agent calling the patient about it.
The latency distribution is presented as an hourly sparkline with count and median values, so operations teams can spot degradation trends before they affect patient experience. A spiky latency profile often indicates external system slowdowns or queue backups, while a gradually rising profile suggests growing data volume outpacing processing capacity.
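The hourly count and median behind such a sparkline can be computed as in this sketch, which assumes each sample is a pair of (timestamp in seconds when the agent acted, latency in seconds). The sample shape and function name are hypothetical.

```python
from collections import defaultdict
from statistics import median


def hourly_latency(samples: list[tuple[int, float]]) -> dict[int, tuple[int, float]]:
    """Group latency samples by hour and return (count, median) per hour."""
    buckets: dict[int, list[float]] = defaultdict(list)
    for acted_at, latency in samples:
        buckets[acted_at // 3600].append(latency)
    return {hour: (len(values), median(values)) for hour, values in sorted(buckets.items())}


samples = [
    (0, 2.0),
    (1800, 4.0),
    (3599, 9.0),   # still inside the first hour
    (3600, 1.0),
    (7000, 3.0),
]
sparkline = hourly_latency(samples)
```

The median is less sensitive to a single slow outlier than the mean, which helps separate a genuinely rising profile from one spiky hour.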
The Platform API exposes pipeline observability through read-only endpoints that power the pipeline dashboard: pipeline status, source listing with live health, source event history, outbound sync summaries, entity resolution metrics, review queue depth, and throughput time series. The dashboard degrades gracefully - if the connector system is temporarily unavailable, the dashboard still shows database-backed metrics without live loop status.
Pipeline health data feeds the analytics dashboard. See Data Quality Analytics for the dashboard metrics. For API endpoints and integration details, see the Connector Runner and FHIR sections of the developer guide.

