Connector Runner

Seven background loops that poll EHR systems, resolve entities, review data quality, and sync verified changes bidirectionally.

The connector runner is the bidirectional data pipeline between external systems (EHR platforms, FHIR stores, REST APIs) and the workspace's world model. It continuously polls external systems for new data, resolves entities across systems, reviews data quality, and syncs verified changes back to the source.

The connector runner itself has no direct API. It operates as a background service. Data sources and sync configuration are managed through the Platform API's Data Sources endpoints.

The connector runner supports three connector types - REST API polling, S3 file drop, and inbound webhooks - plus a unification engine that transforms raw records from any connector into world model events. See Connector Types for details on each type.

Why Seven Loops

External healthcare systems are unreliable in ways that are difficult to predict. EHR APIs go down for maintenance without warning. FHIR stores enforce rate limits that change between environments. Some practice management systems only accept inbound writes during business hours. A write that succeeds on Monday may be silently rejected on Tuesday.

A single sync-and-forget approach does not survive this environment. The connector runner uses seven concurrent loops because each one backstops the others. If the poll loop misses an update because the external API timed out, the reconciliation loop catches it five minutes later. If an outbound write fails, it is retried on the next dispatch cycle. If a pub/sub message is lost due to a transient network issue, the reconciliation poll picks up the unsent event and re-queues it.

There is also a throughput problem. An AI voice agent can handle dozens of concurrent calls, each generating scheduling requests, insurance checks, and record updates. The EHR on the other end might accept a fraction of that volume. The connector runner and world model together absorb this mismatch: agents write events at the speed of conversation, and the connector runner drains the outbound queue at whatever rate the external system can handle. Patients get immediate confirmations. EHRs get the writes when they can process them.

Entity-First Outbound Writes

When the connector runner writes data back to an EHR, it does not send raw event data. It does not relay what the patient said on the phone. Instead, it reconstructs the complete current state of each entity by running the projection function across every event from every source, at every confidence level, with conflicts resolved through the world model's standard resolution rules.

The outbound payload is this authoritative projected state, translated into whatever format the target EHR expects. For how this translation works, see LLM-Assisted Translation below.

This distinction matters because raw inbound data is noisy. A patient might say "Blue Cross" during a call, but the EHR already has "BlueCross BlueShield of Illinois" at a higher confidence level from a prior eligibility check. The projection function resolves this: the higher-confidence value wins. When the connector runner writes back to the EHR, it writes the projected state - so the EHR keeps its authoritative version. The noisy voice-captured variant never propagates backward.

This is why the outbound path is reliable. It reads from the high-quality projected state, not from the raw event stream. Individual events may contain errors, low-confidence extractions, or partial information. The projection function filters all of that into a coherent, resolved entity. The EHR only ever sees the clean result.
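The "higher-confidence value wins" rule can be sketched as a small fold over events. This is a minimal illustration and not the world model's actual projection function: the `Event` shape, the `project` name, the 0.9 confidence for the eligibility check, and the "later event wins ties" rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape for illustration only.
    field: str
    value: str
    confidence: float
    source: str


def project(events):
    """Fold events into one state: per field, keep the highest-confidence value.

    Ties go to the later event in the list (an assumption for this sketch).
    """
    best = {}
    for ev in events:
        current = best.get(ev.field)
        if current is None or ev.confidence >= current.confidence:
            best[ev.field] = ev
    return {name: ev.value for name, ev in best.items()}


events = [
    Event("insurance_carrier", "BlueCross BlueShield of Illinois", 0.9, "eligibility_check"),
    Event("insurance_carrier", "Blue Cross", 0.5, "voice"),
]
state = project(events)
```

Here `state` holds the eligibility-check value, so an outbound write built from it would leave the EHR's existing carrier name untouched.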

Seven Background Loops

The connector runner runs seven concurrent processes that together form the complete data lifecycle.

1. Config Refresh (60-second interval)

Reloads data source and EHR configuration. When you update a connector's settings through the API, this loop picks up the changes within one minute.

2. Poll Loop (10-second interval)

Polls external systems that are due for a sync. Each poll runs under a distributed mutex to prevent duplicate processing when multiple connector runner instances are running. Incoming data is deduplicated using content hashing - if the same record is returned twice, it is not written as a duplicate event.

The poll loop supports EHR-specific adapters that handle the quirks of individual systems. For example, the Revolution EHR adapter:

  • Gates API calls to business hours (7am-8pm ET) so that API traffic mirrors the EHR's intended usage pattern

  • Alternates between full sync cycles (~130-230 API calls) and light sync cycles (~10-20 API calls), running a full sync every fourth cycle (roughly once per hour)

  • Caches location details and insurance carrier lookups with 24-hour TTL to reduce API call volume

  • Spaces API calls with random delays (0.5-1.5 seconds) to avoid rate limiting
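The adapter's pacing rules listed above could look roughly like the following. This is a sketch under stated assumptions, not the adapter's code: the function names are hypothetical, the caller is assumed to pass a wall-clock time already converted to Eastern Time, and treating 8pm as exclusive is a guess.

```python
import random
from datetime import time

OPEN, CLOSE = time(7, 0), time(20, 0)   # 7am-8pm ET, per the list above
FULL_SYNC_EVERY = 4                     # full sync on every fourth cycle


def within_business_hours(eastern_time):
    """True if the given Eastern Time wall-clock time falls inside the gate.

    The upper bound is treated as exclusive (an assumption).
    """
    return OPEN <= eastern_time < CLOSE


def sync_mode(cycle_number):
    """Return 'full' on every fourth cycle and 'light' otherwise (cycles start at 1)."""
    return "full" if cycle_number % FULL_SYNC_EVERY == 0 else "light"


def call_spacing_seconds(rng=random):
    """Random delay between consecutive API calls, in seconds."""
    return rng.uniform(0.5, 1.5)


modes = [sync_mode(n) for n in range(1, 9)]
```

Keeping each rule in its own small function makes it easy to test the gate and the cycle schedule without making any API calls.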

3. Outbound Subscriber (real-time)

Listens for new events in the world model that need to be synced back to external systems. When a verified event is ready for writeback, this loop picks it up and queues it for dispatch.

The subscriber listens on a pub/sub channel (Valkey), so it reacts in near-real-time rather than on a polling interval. Pub/sub is fire-and-forget, however: if a pod restarts or a network blip occurs, messages are lost. The reconciliation loop (below) is the safety net.

4. Reconciliation (300-second interval)

A safety net for the outbound subscriber. Every five minutes, this loop scans for events that should have been synced but were missed by the pub/sub mechanism (due to transient failures, network issues, or timing gaps). Any missed events are re-queued for dispatch.

This dual-path design (pub/sub + reconciliation) means no event is permanently lost, even in the face of infrastructure failures. The pub/sub path handles the common case with low latency. The reconciliation path handles the failure case with guaranteed delivery.
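The dual-path idea can be illustrated with an in-memory stand-in. Nothing here is the runner's real code: `OutboundQueue`, `on_pubsub_message`, `reconcile`, and the `verified`/`synced` flags are hypothetical names, and a plain dict stands in for the event store.

```python
from collections import deque


class OutboundQueue:
    """In-memory stand-in for the dispatch queue; ignores duplicate event IDs."""

    def __init__(self):
        self.pending = deque()
        self._seen = set()

    def enqueue(self, event_id):
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        self.pending.append(event_id)
        return True


def on_pubsub_message(queue, event_id):
    """Fast path: called when a pub/sub notification arrives."""
    queue.enqueue(event_id)


def reconcile(queue, store):
    """Slow path: re-queue every verified, unsynced event the fast path missed."""
    requeued = []
    for event_id, ev in store.items():
        if ev["verified"] and not ev["synced"] and queue.enqueue(event_id):
            requeued.append(event_id)
    return requeued


store = {
    "evt-1": {"verified": True, "synced": False},
    "evt-2": {"verified": True, "synced": False},   # its notification was lost
    "evt-3": {"verified": False, "synced": False},  # not eligible for writeback
}
queue = OutboundQueue()
on_pubsub_message(queue, "evt-1")   # only evt-1's message was delivered
missed = reconcile(queue, store)    # the periodic scan picks up evt-2
```

Because the queue ignores IDs it has already seen, the scan can run on every interval without double-queuing events that the pub/sub path already delivered.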

5. Outbound Dispatch (30-second interval)

Fires scheduled outbound calls. When the system needs to place a call (appointment reminders, follow-ups, outreach campaigns), this loop checks the schedule and dispatches calls to the voice agent.

The dispatch loop reads outbound_task entities from the world model - these are created by scheduling rules, follow-up workflows, or manual triggers. Each task carries the patient context, call purpose, and priority. The dispatch loop evaluates which tasks are due, checks that the voice agent has capacity, and initiates calls with the full patient context pre-loaded.
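One way to picture a single dispatch pass is shown below. The `OutboundTask` fields mirror the description above, but the field names, the "lower number is more urgent" priority convention, and the `select_tasks` helper are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class OutboundTask:
    # Hypothetical shape of an outbound_task entity.
    patient_id: str
    purpose: str
    priority: int       # lower number = more urgent (assumption)
    due_at: datetime


def select_tasks(tasks, now, free_slots):
    """Pick due tasks, most urgent first, limited by voice agent capacity."""
    due = [t for t in tasks if t.due_at <= now]
    due.sort(key=lambda t: (t.priority, t.due_at))
    return due[:max(free_slots, 0)]


now = datetime(2024, 1, 1, 9, 0)
tasks = [
    OutboundTask("p1", "appointment reminder", 2, datetime(2024, 1, 1, 8, 0)),
    OutboundTask("p2", "follow-up", 1, datetime(2024, 1, 1, 8, 30)),
    OutboundTask("p3", "outreach", 1, datetime(2024, 1, 1, 10, 0)),  # not yet due
]
picked = select_tasks(tasks, now, free_slots=1)
```

With one free slot, only the most urgent due task is selected; the rest stay in the world model until a later cycle.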

6. Entity Resolution (30-second interval)

Links unlinked events to entities. When data arrives from different systems, it often refers to the same real-world person, location, or appointment using different identifiers. Entity resolution matches these records together.

For example: a patient might appear as "Jane Smith, DOB 03/15/1982" in the EHR and as a phone number in the voice system. Entity resolution determines these refer to the same person and links their events to a single patient entity.
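A deliberately simple matching sketch follows. It assumes the EHR record also carries a phone number, which is what lets a voice event identified only by phone be linked; the matching rules, field names, and the `resolve` helper are hypothetical, and the real resolver is presumably more sophisticated.

```python
import re


def normalize_phone(raw):
    """Keep digits only and drop a leading US country code."""
    digits = re.sub(r"\D", "", raw)
    return digits[1:] if len(digits) == 11 and digits.startswith("1") else digits


def resolve(event, entities):
    """Return the ID of the entity that shares an identifier with the event, if any."""
    phone = normalize_phone(event.get("phone", ""))
    for entity_id, ent in entities.items():
        if phone and phone in ent["phones"]:
            return entity_id
        if event.get("name") and (event.get("name"), event.get("dob")) == (ent["name"], ent["dob"]):
            return entity_id
    return None


entities = {
    "patient-1": {"name": "Jane Smith", "dob": "1982-03-15", "phones": {"3125550142"}},
}
voice_event = {"phone": "+1 (312) 555-0142"}
ehr_event = {"name": "Jane Smith", "dob": "1982-03-15"}
linked = (resolve(voice_event, entities), resolve(ehr_event, entities))
```

Both events resolve to the same entity ID, so their data would feed a single patient projection.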

7. Review Loop (30-second interval)

Runs the data quality pipeline. Events that were written at lower confidence levels (voice extraction, agent inference) pass through an LLM-based review judge that evaluates accuracy and completeness. Events that pass review have their confidence upgraded. Events that fail are flagged for human review.
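The pass/fail branching can be sketched with the judge injected as a plain callable. The `review` function, the 0.8 post-review confidence, and the `needs_human_review` flag are assumptions for illustration; the actual judge is an LLM call, which is replaced here by a trivial comparison.

```python
def review(events, judge, passing_confidence=0.8):
    """Split events into (upgraded, flagged) according to the judge's verdict.

    passing_confidence is an assumed value, not a documented constant.
    """
    upgraded, flagged = [], []
    for ev in events:
        if judge(ev):
            upgraded.append({**ev, "confidence": max(ev["confidence"], passing_confidence)})
        else:
            flagged.append({**ev, "needs_human_review": True})
    return upgraded, flagged


def toy_judge(ev):
    # Stand-in for the LLM judge: does the extracted value match the transcript?
    return ev["value"] == ev["transcript_value"]


events = [
    {"id": "e1", "value": "lisinopril", "transcript_value": "lisinopril", "confidence": 0.5},
    {"id": "e2", "value": "lisinopril", "transcript_value": "losartan", "confidence": 0.5},
]
upgraded, flagged = review(events, toy_judge)
```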

Content-Hash Deduplication

Every piece of data polled from an external system is hashed. If a subsequent poll returns the same content, no new event is created. This prevents the event store from accumulating duplicate records when external systems return unchanged data across polling cycles.
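A minimal sketch of content-hash deduplication, assuming records are JSON-serializable dicts and that SHA-256 over a canonical JSON encoding is an acceptable hash (the source does not name the algorithm). `content_hash` and `new_records` are hypothetical helper names.

```python
import hashlib
import json


def content_hash(record):
    """Hash a record via canonical JSON so key order does not affect the digest."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def new_records(polled, seen_hashes):
    """Return only records whose content has not been seen; update seen_hashes."""
    fresh = []
    for record in polled:
        digest = content_hash(record)
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            fresh.append(record)
    return fresh


seen = set()
first = new_records([{"id": 1, "status": "booked"}], seen)
second = new_records(
    [{"status": "booked", "id": 1}, {"id": 1, "status": "cancelled"}],
    seen,
)
```

The second poll returns the unchanged record with its keys in a different order plus a genuinely changed one; only the changed record produces a new event.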

Three-Layer Confidence Gates

The gates exist because inbound data from any source - including patients on the phone - may be wrong, incomplete, or fabricated. A patient might misremember a medication dosage. A voice extraction might mishear a drug name. An agent might infer a diagnosis code that was never explicitly stated. The gates ensure that unverified information does not contaminate systems of record.

Before any data is written back to an EHR or external system, it must pass through three layers of verification:

| Gate | What It Checks |
| --- | --- |
| Source confidence | Is the event's confidence level high enough for outbound writes? Voice-extracted data (0.5) does not pass by default; it must be upgraded through review. |
| LLM review | An LLM judge evaluates the data against the original transcript or source material. Does the extracted data accurately reflect what was said or recorded? |
| Human review (when required) | For data that the LLM judge flags as uncertain or for categories that require human sign-off, an operator reviews and approves or rejects the write. |

Confidence comparisons use epsilon tolerance rather than exact floating-point equality to prevent events from being silently excluded due to float rounding in IEEE 754 arithmetic.
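To see why the tolerance matters, consider a confidence value that is the result of arithmetic. The sketch below uses `math.isclose` with an absolute tolerance; the `meets_threshold` name, the 1e-9 epsilon, and the 0.8 threshold are assumptions, not values taken from the connector runner.

```python
import math

EPSILON = 1e-9  # assumed tolerance for this sketch


def meets_threshold(confidence, threshold, eps=EPSILON):
    """True if confidence is above the threshold or within eps of it."""
    return confidence > threshold or math.isclose(
        confidence, threshold, rel_tol=0.0, abs_tol=eps
    )


computed = 0.7 + 0.1                         # lands just below 0.8 in IEEE 754 doubles
naive = computed >= 0.8                      # exact comparison rejects the event
tolerant = meets_threshold(computed, 0.8)    # tolerant comparison accepts it
```

An exact `>=` check would silently exclude this event even though its confidence is 0.8 for every practical purpose.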

This gating system means that data captured during a phone call does not flow directly into the EHR. It is verified at multiple stages before it reaches a system of record. The details of how this works for specific EHR integrations are covered in EHR and FHIR Integration.

LLM-Assisted Translation

When writing data back to an EHR, the projected entity state must be translated into the target system's format. For systems with complex mapping requirements, the connector runner uses an LLM (Gemini Flash) to translate field values intelligently - for example, mapping a patient's description of their insurance to the EHR's specific carrier codes.

If the LLM is unavailable, the system falls back to a deterministic mapper that handles common cases. This fallback ensures writes continue even during LLM outages, though with less nuanced translations.
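The fallback pattern might look like the following sketch. The LLM call is injected as a plain callable so the example runs without any network access; `translate_carrier`, the lookup table, and the carrier code `BCBSIL` are made up for illustration and are not the EHR's real codes or the runner's real API.

```python
def translate_carrier(description, llm_translate, carrier_codes):
    """Try the LLM first; on failure or an unknown code, use a deterministic lookup.

    Returns (code, path), where path records which translator produced the code.
    """
    try:
        code = llm_translate(description)
        if code in carrier_codes.values():
            return code, "llm"
    except Exception:
        pass  # LLM unavailable: fall through to the deterministic mapper
    return carrier_codes.get(description.strip().lower()), "deterministic"


# Hypothetical lookup table: normalized carrier name -> EHR carrier code.
codes = {"bluecross blueshield of illinois": "BCBSIL"}


def llm_down(description):
    raise ConnectionError("LLM unavailable")


def llm_up(description):
    return "BCBSIL"


during_outage = translate_carrier("BlueCross BlueShield of Illinois", llm_down, codes)
normal = translate_carrier("Blue Cross", llm_up, codes)
```

During the outage only the exact carrier name can be mapped, which reflects the "less nuanced" behavior of the deterministic path: a loose description such as "Blue Cross" would come back as `None`.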

Autonomous Patient Creation

When the voice agent identifies a new patient who doesn't exist in the EHR, the connector runner can autonomously create the patient record. This uses a purpose-built LLM agent (running on Vertex AI for HIPAA compliance) that assembles the required fields from world model events and submits the creation request to the EHR adapter. The agent handles field validation, format requirements, and error recovery.

Developer Guide - For API endpoints, SDK examples, and integration details, see the Connector Runner page in the developer guide.
