# Connectors and EHR

The connector system is the bidirectional data pipeline between external systems and the workspace's [world model](https://docs.amigo.ai/data/world-model). It polls EHR platforms, FHIR stores, CRMs, and other sources for new data, resolves entities across systems, reviews data quality, and syncs verified changes back to every configured destination.

The connector system has no direct API. It operates as a background service. Data sources and sync configuration are managed through the Platform API's Data Sources and Connector Settings endpoints. Connector definitions are stored in workspace settings with typed configuration models that enforce valid sync strategies and connection parameters. The data source response includes a computed staleness indicator based on the connector's last successful sync and its configured cadence. The system supports multiple connector types - each producing raw records that pass through a unification engine before entering the world model as events.

## How Connectors Work

External healthcare systems are unreliable in ways that are difficult to predict. EHR APIs go down for maintenance without warning. FHIR stores enforce rate limits that change between environments. Some systems only accept inbound writes during business hours. A write that succeeds today may be silently rejected tomorrow.

The connector system uses multiple concurrent background loops because each one backs up the others. If the poll loop misses an update because an external API timed out, a reconciliation loop catches it shortly after. If an outbound write fails, it is retried on the next dispatch cycle. If a real-time notification is lost due to a transient network issue, the reconciliation poll picks up the unsent event and re-queues it.

**Inbound** data arrives through two mechanisms depending on the source system's capabilities:

* **Polling** - The poll loop checks external systems on a configurable cadence. Incoming data is deduplicated by content hash - if the same record is returned twice, no duplicate event is created. Each data source connection can specify per-resource polling frequencies.
* **Real-time webhooks** - For systems that support FHIR Subscriptions or event notifications, the connector receives push notifications as changes happen. Incoming webhooks are verified, deduplicated, and the full resource is fetched and mapped before writing to the world model.
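
The content-hash deduplication described above can be sketched as follows - the class and method names are illustrative, not the platform's actual API:

```python
import hashlib
import json

def content_hash(record: dict) -> str:
    """Stable hash of a record: serialize with sorted keys so field order doesn't matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

class Deduplicator:
    """Skips records whose content hash has already been seen."""
    def __init__(self):
        self._seen: set[str] = set()

    def is_new(self, record: dict) -> bool:
        h = content_hash(record)
        if h in self._seen:
            return False
        self._seen.add(h)
        return True
```

Because the hash is computed over canonicalized content, a record returned twice by a poll - even with its fields in a different order - produces the same hash and creates no duplicate event.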

Both paths feed into the same entity resolution and enrichment pipeline. The poll loop supports system-specific adapters that handle the quirks of individual external systems, including business-hour gating, alternating between full and incremental sync cycles, reference data caching, and rate limit management.

**Outbound** write-back does not send raw event data. It reconstructs the complete current state of each entity by running the projection function across every event from every source, at every confidence level, with conflicts resolved through the world model's standard resolution rules. The outbound payload is this authoritative projected state, translated into whatever format the target system expects. A workspace can write back to multiple external systems simultaneously - the EHR receives patient and appointment data while the CRM receives contact and engagement data. Sync progress is tracked per-sink, so a failure writing to one system does not block writes to others.
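
A simplified sketch of projecting entity state from events - the world model's actual resolution rules are richer, and the `Event` shape and tie-breaking rule here are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Event:
    field: str
    value: object
    confidence: float
    timestamp: float  # epoch seconds

def project(events: list[Event]) -> dict:
    """Fold events into current entity state.
    Conflict rule (simplified): higher confidence wins; ties go to the newer event."""
    winners: dict[str, Event] = {}
    for ev in events:
        cur = winners.get(ev.field)
        if cur is None or (ev.confidence, ev.timestamp) > (cur.confidence, cur.timestamp):
            winners[ev.field] = ev
    return {field: ev.value for field, ev in winners.items()}
```

The key property is that outbound payloads are built from this folded result, so a low-confidence event never reaches an external system directly - it only contributes if nothing better contradicts it.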

There is also a throughput problem. The agent engine can handle dozens of concurrent interactions, each generating scheduling requests, insurance checks, and record updates. The EHR on the other end might accept a fraction of that volume. The world model absorbs this mismatch: agents write events at the speed of conversation, and the connector system drains the outbound queue at whatever rate the external system can handle. Patients get immediate confirmations. EHRs get the writes when they can process them.
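
One way to picture the queue-draining behavior - the function names and the per-cycle `budget` parameter are illustrative:

```python
import collections

def drain(queue: collections.deque, send, budget: int) -> int:
    """One dispatch cycle: send at most `budget` writes (the rate the
    external system accepts); leave the rest queued for the next cycle."""
    sent = 0
    while queue and sent < budget:
        item = queue.popleft()
        if send(item):
            sent += 1
        else:
            queue.appendleft(item)  # put it back; retry next cycle
            break
    return sent
```

Agents keep appending to the queue at conversation speed; the drain loop runs on its own cadence with whatever budget the external system tolerates.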

{% @mermaid/diagram content="graph TD
CR\[Config Refresh] --> Loops
subgraph Loops \[Concurrent Background Loops]
direction LR
IP\[Inbound Poll]
WH\[Webhook Listener]
OS\[Outbound Subscriber]
OD\[Outbound Dispatch]
RC\[Reconciliation]
ER\[Entity Resolution]
RL\[Review Loop]
GS\[Gap Scanner]
end
IP --> WM\[(World Model)]
WH --> WM
OS --> EXT\[External Systems]
WM --> OS
RC --> WM
ER --> WM
RL --> WM
OD --> AE\[Agent Engine]
GS --> SF\[Surfaces]" %}

## Connector Types

| Connector      | Inbound | Outbound | Real-time Push    | Polling  | Auth                                |
| -------------- | ------- | -------- | ----------------- | -------- | ----------------------------------- |
| **EHR**        | Yes     | Yes      | Where supported   | Fallback | Per adapter (SMART, OAuth, API key) |
| **FHIR Store** | Yes     | Yes      | Via Subscriptions | Primary  | SMART Backend Services              |
| **CRM**        | Yes     | Yes      | Via webhooks      | Primary  | OAuth 2.0                           |
| **REST API**   | Yes     | No       | No                | Primary  | Configurable                        |
| **File Drop**  | Yes     | No       | No                | Primary  | Cloud IAM                           |
| **Webhook**    | Yes     | No       | Yes               | No       | Signature verification              |

**EHR connectors** use dedicated, EHR-specific adapters. Each adapter handles the target system's authentication, FHIR resource mapping, rate limits, and data format translation. Where the EHR supports real-time event notifications, the adapter receives change events as they happen instead of polling.

**FHIR Store connectors** connect directly to FHIR R4 stores with typed configuration and per-resource poll cadences. Outbound write-back uses optimistic locking to prevent lost updates.

**CRM connectors** support bidirectional contact and engagement sync with incremental per-object polling cadences. CRM objects (contacts, companies, deals) map to world model entity types, enabling a unified view of patients across clinical and engagement systems.

**REST connectors** poll HTTP endpoints on a schedule with multiple pagination strategies, configurable authentication, and circuit breaker protection. Content-hash deduplication prevents duplicate events from unchanged data.

**File Drop connectors** watch a cloud storage location for new files. They parse CSV, NDJSON, FHIR Bundles, and raw JSON - useful for bulk data imports where a partner drops a file on a schedule.

**Webhook connectors** receive inbound HTTP webhooks from external systems. Events are deduplicated by content hash to handle retries from the sender.

All connector types share common resilience patterns: retries with exponential backoff, circuit breakers that temporarily stop requests to failing sources, dead letter logging for records that cannot be processed, and content-hash deduplication across retries.

### Unification Engine

The unification engine is not a connector itself. It is the transformation layer that all connectors feed into. Raw records from any connector type are mapped to world model events using configurable rules with dot-path field extraction to pull values from arbitrarily nested source data into the event schema. This architecture decouples the transport layer (how data arrives) from the transformation layer (how data is mapped). Adding a new data source requires connector configuration and mapping rules - no custom integration code.
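
A minimal sketch of dot-path extraction against a nested FHIR-style record - the rule format shown is illustrative, and the platform's mapping rules are richer:

```python
def get_path(data, path: str, default=None):
    """Resolve a dot path like 'name.0.given.0' against nested dicts and lists."""
    cur = data
    for part in path.split("."):
        if isinstance(cur, list):
            try:
                cur = cur[int(part)]
            except (ValueError, IndexError):
                return default
        elif isinstance(cur, dict):
            if part not in cur:
                return default
            cur = cur[part]
        else:
            return default
    return cur

def map_record(record: dict, rules: dict) -> dict:
    """Apply mapping rules of the form {event_field: dot_path_in_source}."""
    return {field: get_path(record, path) for field, path in rules.items()}
```

For example, a FHIR `Patient` resource maps to a flat event with rules like `{"first_name": "name.0.given.0", "phone": "telecom.0.value"}` - no per-source code, only configuration.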

### Data Freshness

How quickly data appears in the world model depends on the connector type and configuration:

* **Webhook and push-based EHR connectors** - Near real-time. Data arrives within seconds of the external system's event.
* **Polling connectors (REST, FHIR Store, CRM)** - Determined by poll interval. Default intervals vary by connector and resource type.
* **File drop** - Depends on when the file is deposited. The connector checks for new files on each poll cycle.

All connectors feed through the same confidence gates before data reaches the world model. Inbound data enters at source-appropriate confidence (1.0 for authoritative system integrations, 0.7 for scraped data) and passes through automated review before becoming available to agents.

<figure><img src="https://3635224444-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvcLyiHRcwv7g83p6vxAd%2Fuploads%2Fgit-blob-b39769e0933b6677ea7a7b8da2f848dc3f9943f7%2Fconnectors-flow-green.svg?alt=media" alt="Connector data flow: external systems through connector types to unification engine to world model"><figcaption></figcaption></figure>

## EHR Integration

Clinical data flows in both directions: from the EHR into the world model (so the agent has context during interactions), and from the world model back to the EHR (so information captured during conversations reaches the clinical record after verification). The world model sits between the agent and the EHR as a buffer that absorbs throughput mismatches, quality differences, and availability gaps. The agent reads from and writes to the world model's clean projected state. The connector system handles the messy reality of getting data in and out of external systems at whatever rate they can handle.

### FHIR R4 and SMART Authentication

Both the `fhir_store` and `smart_fhir` connector types share the same underlying FHIR R4 capabilities - the difference is the authentication method. For EHR systems that implement the SMART App Launch specification, the connector supports SMART Backend Services authentication: the standard machine-to-machine auth flow for healthcare APIs.

The authentication flow uses JWT client assertions signed with RS384 or ES384. Access tokens are cached and refreshed before expiry. Per-resource scoping (e.g., `system/*.read`, `system/Patient.write`) controls which resources the connector can access.
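
The client assertion's claim set follows the RFC 7523 profile used by SMART Backend Services. A sketch of building those claims - the function name is hypothetical, and signing with RS384/ES384 (e.g. via PyJWT's `jwt.encode`) and posting to the token endpoint are omitted:

```python
import time
import uuid

def smart_client_assertion_claims(client_id: str, token_url: str, lifetime: int = 300) -> dict:
    """Claims for a SMART Backend Services client assertion:
    iss and sub are both the client_id, aud is the token endpoint,
    exp is short-lived, and jti must be unique per assertion."""
    now = int(time.time())
    return {
        "iss": client_id,
        "sub": client_id,
        "aud": token_url,
        "exp": now + lifetime,
        "jti": uuid.uuid4().hex,
    }
```

The signed JWT is then sent as `client_assertion` in a `client_credentials` grant along with the requested `system/*` scopes; the resulting access token is cached until shortly before `exp`.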

SMART FHIR data sources can be created and managed entirely through the Developer Console or the Platform API. When you provide a private key during setup, the platform automatically provisions it to secure storage and stores only a reference - the key value is never persisted in the data source configuration or returned in API responses.

For all FHIR-capable systems, the connector provides patient search, resource CRUD with entity cross-referencing, resource history with field-level change tracking, bundle import for bulk data loading, and scoped sync failure investigation.

### Vendor-Specific Adapters

For EHR systems with non-standard FHIR implementations or proprietary authentication requirements, the connector includes purpose-built adapters that handle vendor-specific quirks transparently. These adapters manage authentication flows, non-standard pagination schemes, adaptive rate limiting, and resource type tiering by change frequency - while feeding into the same world model pipeline as all standard FHIR connectors.

For systems with no API at all, browser-tier tools can automate portal interactions. The content-hash deduplication, entity resolution, and confidence gates work the same way regardless of the underlying connection method.

### Handling External System Limitations

Real-world EHR integrations face problems that no API specification can solve:

* **Browser-only systems** with no API are handled through automated browser sessions that navigate the portal and extract confirmation data while the agent stays on the call.
* **Throughput mismatches** are absorbed by the world model - the agent writes intent as events, and the connector drains the outbound queue at whatever rate the external system accepts.
* **Downtime and crashes** are handled by the reconciliation loop, which catches unsynced events and retries them.
* **Stale or conflicting data** is resolved through the world model's confidence-based resolution.
* **Partial failures** in multi-step workflows are tracked per-step so completed steps are not re-executed.
* **Inconsistent data formats** across EHR systems are normalized by entity resolution and projection functions into a consistent representation.

## Outbound Write-Back

When the connector writes data back to an external system, it reads from the high-quality projected entity state, not from the raw event stream. Individual events may contain errors, low-confidence extractions, or partial information. The projection function filters all of that into a coherent, resolved entity. External systems only see the clean result.

For systems with complex mapping requirements, the connector uses an LLM to translate field values - for example, mapping a patient's description of their insurance to the target system's specific carrier codes. If the LLM is unavailable, a deterministic mapper handles common cases so writes continue during outages.

### Three-Layer Confidence Gates

Before any data is written back to an EHR or external system, it must pass through verification layers that prevent unverified information from contaminating systems of record.

| Gate                             | What It Checks                                                                                                                                            |
| -------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Source eligibility**           | Only events from eligible sources are considered. Events originating from external system syncs are excluded to prevent echo loops.                       |
| **Confidence threshold**         | Is the event's confidence level high enough for outbound writes? Voice-extracted data (0.5) does not pass by default; it must be upgraded through review. |
| **Automated review**             | An automated judge evaluates the data against the original transcript or source material for accuracy.                                                    |
| **Human review** (when required) | For data flagged as uncertain or categories requiring human sign-off, an operator reviews and approves or rejects the write.                              |

When outbound writes have dependencies (for example, creating an appointment requires the patient to exist in the EHR first), the dependency check is confidence-aware. If the dependency entity's confidence is below the outbound threshold, the dependency is treated as failed rather than pending.
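
The first two gates are synchronous checks that can be sketched as follows - the threshold value, field names, and source labels are illustrative:

```python
from dataclasses import dataclass

OUTBOUND_THRESHOLD = 0.7  # hypothetical; workspace-configurable in practice

@dataclass
class OutboundEvent:
    source: str       # e.g. "voice", "ehr_sync", "operator"
    confidence: float
    reviewed: bool    # passed automated review

def passes_gates(ev: OutboundEvent) -> bool:
    """First gates from the table above (human review is async and not shown)."""
    if ev.source == "ehr_sync":
        return False  # source eligibility: exclude data that came *from* the external system
    if ev.confidence < OUTBOUND_THRESHOLD:
        return False  # voice-extracted data (0.5) stays until review upgrades it
    return ev.reviewed
```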

{% hint style="danger" %}
Data captured from a phone call never writes directly to the EHR. It always passes through the confidence gates above.
{% endhint %}

### Reconciliation Safety Net

The outbound path uses a dual-delivery design. A real-time subscriber listens for new verified events and routes them through the handler registry immediately. A separate reconciliation loop periodically scans for events that should have been synced but were missed by the real-time mechanism (due to transient failures, network issues, or timing gaps). Any missed events are re-queued for dispatch. This means no event is permanently lost, even in the face of infrastructure failures. The real-time path handles the common case with low latency. The reconciliation path handles the failure case with guaranteed delivery.
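
A rough sketch of the reconciliation pass, using an in-memory store for illustration - the grace period and all names are hypothetical:

```python
GRACE_PERIOD = 30.0  # seconds before an undelivered event counts as missed; hypothetical

class InMemoryStore:
    def __init__(self):
        self.events = []        # (event_id, verified_at) pairs
        self.delivered = set()  # event_ids confirmed by the real-time path

    def missed(self, now: float) -> list[str]:
        """Verified events past the grace period with no delivery record."""
        cutoff = now - GRACE_PERIOD
        return [eid for eid, t in self.events if t <= cutoff and eid not in self.delivered]

def reconcile(store: InMemoryStore, queue: list, now: float):
    """Safety-net pass: re-queue anything the real-time subscriber missed."""
    for eid in store.missed(now):
        if eid not in queue:
            queue.append(eid)
```

The grace period keeps the reconciliation loop from racing the real-time path; anything older than it with no delivery record is, by definition, missed.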

### Autonomous Record Creation

When the agent identifies a new patient during an interaction who does not exist in the external system, the connector can autonomously create the record. It assembles the required fields from world model events and submits the creation request to the appropriate adapter, handling field validation, format requirements, and error recovery.

## Review Loop

Events that were written at lower confidence levels (voice extraction, agent inference) pass through an automated review pipeline that evaluates accuracy and completeness. The review loop is event-driven - when the world model writer commits a low-confidence event, it publishes a notification that the review loop picks up and processes within seconds. A periodic safety-net poll catches any events missed by the real-time path. The loop deduplicates at the entity level to prevent queue bloat when a single entity generates many events in rapid succession. Events that pass review have their confidence upgraded. Events that fail are flagged for human review through the [review queue](https://docs.amigo.ai/data/review-queue).

## Entity Resolution

When data arrives from different systems, it often refers to the same real-world person, location, or appointment using different identifiers. Entity resolution matches these records together using a two-tier approach.

**Tier 1 (deterministic)** matches on exact identifiers - canonical FHIR IDs, phone numbers, email addresses, name + date of birth for patients, NPI or name + specialty for practitioners.

**Tier 2 (fuzzy)** applies when exact matching finds nothing. It uses approximate strategies: token-level name similarity, partial phone matching (last seven digits), and name + zip code combinations. Fuzzy matches require high similarity thresholds to prevent false merges.

For example: a patient might appear as "Jane Smith, DOB 03/15/1982" in the EHR and as a phone number in the voice system. Entity resolution determines these refer to the same person and links their events to a single patient entity. If the EHR record has a slight name variation ("J. Smith") or a different phone format, the fuzzy tier catches the match that exact comparison would miss.
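
A simplified two-tier matcher over the example above - field names and strategies are illustrative, and the real tiers use more signals and stricter similarity scoring:

```python
def tier1_match(a: dict, b: dict) -> bool:
    """Deterministic: exact identifier match."""
    if a.get("fhir_id") and a.get("fhir_id") == b.get("fhir_id"):
        return True
    if a.get("phone") and a.get("phone") == b.get("phone"):
        return True
    return bool(a.get("name") and a.get("dob")
                and a["name"].lower() == b.get("name", "").lower()
                and a["dob"] == b.get("dob"))

def tier2_match(a: dict, b: dict) -> bool:
    """Fuzzy fallback: name-token overlap plus last-seven-digit phone match."""
    ta = set(a.get("name", "").lower().replace(".", "").split())
    tb = set(b.get("name", "").lower().replace(".", "").split())
    pa = "".join(c for c in a.get("phone", "") if c.isdigit())[-7:]
    pb = "".join(c for c in b.get("phone", "") if c.isdigit())[-7:]
    return bool(ta & tb) and bool(pa) and pa == pb

def resolve(a: dict, b: dict) -> str:
    if tier1_match(a, b):
        return "exact"
    if tier2_match(a, b):
        return "fuzzy"
    return "no-match"
```

In the "J. Smith" example, exact comparison fails on both name and phone formatting, but the shared token `smith` plus matching last seven phone digits lets the fuzzy tier link the records.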

When a workspace has multiple data sources, entity resolution also performs **cross-source merge detection**. If a patient entity from the EHR matches a patient entity from the CRM, the system creates a reversible link between them and unifies their projected state. This means a single patient view reflects data from every connected system, not just one.

## Outbound Dispatch

The connector system also handles scheduled outbound interactions. When the system needs to contact a patient (appointment reminders, follow-ups, outreach campaigns), a dispatch loop checks the schedule and initiates voice calls or text sessions through the agent engine. Each outbound task carries the patient context, interaction purpose, and priority. The dispatch loop evaluates which tasks are due, checks that the agent engine has capacity, and initiates interactions with the full patient context pre-loaded.

Outbound tasks are stored as entities in the world model. They are created by scheduling rules, follow-up workflows, or manual triggers, and their projections track status, priority, attempt count, retry timing, and call outcome.

## Gap Scanner

The gap scanner proactively identifies missing data across entities and creates [surfaces](https://docs.amigo.ai/channels/surfaces) to collect it. It checks entity state against configurable requirements (for example, "patients with upcoming appointments must have insurance information") and generates data collection forms for any gaps found.

Appointment detection uses a batch query that joins entity relationships with appointment events at scan time, running once per workspace per tick rather than per entity. This approach reads current event state directly, so appointment updates (cancellations, rescheduling) are reflected immediately without waiting for a stale projection to catch up.

Each entity is scanned at most once per cooldown period (default 7 days) to prevent notification fatigue. The scanner is disabled by default and configured per workspace through the Platform API. See [Surfaces - Automated Gap Detection](https://docs.amigo.ai/channels/surfaces#automated-gap-detection) for details.
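
A requirement check of this shape might look like the following - the rule format and field names are hypothetical:

```python
# Hypothetical rule set: entities matching a condition must carry certain fields.
REQUIREMENTS = [
    {"when": "has_upcoming_appointment",
     "must_have": ["insurance_carrier", "insurance_member_id"]},
]

def scan_entity(entity: dict) -> list[str]:
    """Return every field missing for the requirements this entity triggers."""
    gaps = []
    for rule in REQUIREMENTS:
        if entity.get(rule["when"]):
            gaps += [f for f in rule["must_have"] if not entity.get(f)]
    return gaps
```

A non-empty result would drive creation of a surface requesting exactly the missing fields, subject to the cooldown above.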

## Pipeline Observability

The connector system tracks operational health across all background loops and data source connections. This gives operations teams real-time visibility into pipeline status without waiting for sync failures to surface.

| Metric                     | What It Tracks                                                                                                                     |
| -------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- |
| **Overall status**         | Whether the pipeline is healthy, degraded, or starting up                                                                          |
| **Per-source poll health** | Last poll time, duration, event count, errors, and connection status for each data source                                          |
| **Loop states**            | Current state of each background loop (entity resolution, review, outbound, reconciliation)                                        |
| **Connection health**      | Consecutive error tracking per source - a source is marked unhealthy after repeated failures and recovers automatically on success |

The Platform API exposes pipeline observability through read-only endpoints that power the pipeline dashboard: pipeline status, source listing with live health, source event history, outbound sync summaries, entity resolution metrics, review queue depth, and throughput time series. The dashboard degrades gracefully - if the connector system is temporarily unavailable, the dashboard still shows database-backed metrics without live loop status.
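
The consecutive-error rule from the connection health row can be sketched as - the threshold is illustrative:

```python
class ConnectionHealth:
    """Per-source health: unhealthy after `threshold` consecutive errors,
    healthy again on the first successful poll."""
    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.consecutive_errors = 0

    @property
    def healthy(self) -> bool:
        return self.consecutive_errors < self.threshold

    def record_poll(self, ok: bool):
        self.consecutive_errors = 0 if ok else self.consecutive_errors + 1
```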

{% hint style="info" %}
Pipeline health data feeds the analytics dashboard. See [Data Quality Analytics](https://docs.amigo.ai/intelligence-and-analytics/intelligence) for the dashboard metrics. For API endpoints and integration details, see the [Connector Runner](https://docs.amigo.ai/developer-guide/platform-api/platform-api/connector-runner) and [FHIR](https://docs.amigo.ai/developer-guide/platform-api/platform-api/fhir) sections of the developer guide.
{% endhint %}


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.amigo.ai/data/connectors-and-ehr.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
