# Connectors and EHR

## Data Sources Dashboard

The Developer Console Data Sources page provides an operational overview of all external systems feeding data into a workspace. The page opens with a visual pipeline summary showing three stages: source systems (clinical and operational connectors), the normalize-and-map layer (FHIR resources, file uploads, and unification rules), and the world model output that agents and operators consume.

Summary metrics display the number of connected sources, healthy connector count, and recent sync activity. The connected sources table shows individual connector status and recent event throughput.

## Facility Location Mapping

When location, facility, or place entities are synced from clinical systems, the Data Insights dashboard renders their geographic distribution on a live map. Entities with direct coordinates (latitude and longitude) are plotted immediately. Entities that carry street addresses but no coordinates are geocoded on demand. This gives operations teams a visual picture of facility coverage without requiring manual coordinate entry.

The connector system is the bidirectional data pipeline between external systems and the workspace's [world model](/data/world-model.md). It brings in data from EHR platforms, FHIR stores, CRMs, and other sources, resolves entities across systems, reviews data quality, and syncs verified changes back to every configured destination.

The connector system has no direct public API. Data sources and sync configuration are managed through the Platform API's Data Sources and Connector Settings endpoints. Connector definitions use typed configuration models that enforce valid sync strategies and connection parameters. Data source responses include a freshness indicator based on recent successful ingestion. The system supports multiple connector types - each producing raw records that pass through a unification engine before entering the world model as events.

## How Connectors Work

External healthcare systems are unreliable in ways that are difficult to predict. EHR APIs go down for maintenance without warning. FHIR stores enforce rate limits that change between environments. Some systems only accept inbound writes during business hours. A write that succeeds today may be silently rejected tomorrow.

The connector system combines push-based ingestion, scheduled ingestion, retry handling, and reconciliation. If an external API times out, a later reconciliation pass can catch the missed update. If an outbound write fails, it is retried according to the connector's delivery policy. If a real-time notification is lost due to a transient network issue, reconciliation recovers the missed event.

**Inbound** data arrives through two mechanisms depending on the source system's capabilities:

* **Scheduled sync** - The connector checks external systems on the cadence configured for that source. Incoming data is deduplicated by content hash - if the same record is returned twice, no duplicate event is created.
* **Real-time webhooks** - For systems that support FHIR Subscriptions or event notifications, the connector receives push notifications as changes happen. Incoming webhooks are verified, deduplicated, and the full resource is fetched and mapped before writing to the world model.

Both paths feed into the same entity resolution and enrichment pipeline. Source-specific adapters handle quirks such as business-hour gating, incremental sync, reference data handling, and rate limit management.
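The content-hash deduplication described above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the `Deduplicator` class, the `content_hash` helper, the choice of SHA-256 over canonical JSON, and the in-memory set are all assumptions made for the example.

```python
import hashlib
import json


def content_hash(record: dict) -> str:
    """Hash a record's canonical JSON form so key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Deduplicator:
    """Remembers hashes already seen per source (in memory, for illustration only)."""

    def __init__(self) -> None:
        self._seen: set[tuple[str, str]] = set()

    def is_new(self, source_id: str, record: dict) -> bool:
        """Return True the first time a given record content is seen for a source."""
        key = (source_id, content_hash(record))
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


dedup = Deduplicator()
# "ehr-main" is a hypothetical source identifier.
first = dedup.is_new("ehr-main", {"id": "pt-1", "name": "Jane Smith"})
repeat = dedup.is_new("ehr-main", {"name": "Jane Smith", "id": "pt-1"})  # same content, new key order
changed = dedup.is_new("ehr-main", {"id": "pt-1", "name": "Jane A. Smith"})
```

Hashing a canonical serialization means a record returned twice produces one event even if the source reorders its fields, while any real change in content produces a new hash.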

**Outbound** write-back does not send raw event data. It reconstructs the complete current state of each entity by running the projection function across every event from every source, at every confidence level, with conflicts resolved through the world model's standard resolution rules. The outbound payload is this authoritative projected state, translated into whatever format the target system expects. A workspace can write back to multiple external systems simultaneously - the EHR receives patient and appointment data while the CRM receives contact and engagement data. Sync progress is tracked per-sink, so a failure writing to one system does not block writes to others.
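Per-sink progress tracking might look like the sketch below. The `write_back` function, the sink callables, and the shape of the `progress` dictionary are hypothetical names chosen for illustration, and the progress shown covers a single version of the projected state.

```python
from typing import Callable, Dict


def write_back(
    state: dict,
    sinks: Dict[str, Callable[[dict], None]],
    progress: Dict[str, dict],
) -> None:
    """Send one projected state to every sink, recording the outcome per sink."""
    for name, send in sinks.items():
        entry = progress.setdefault(name, {"status": "pending", "attempts": 0, "error": None})
        if entry["status"] == "synced":
            continue  # already delivered to this sink; do not resend
        entry["attempts"] += 1
        try:
            send(state)
            entry.update(status="synced", error=None)
        except Exception as exc:  # a failing sink must not block the others
            entry.update(status="failed", error=str(exc))


delivered = []


def ehr_sink(state: dict) -> None:
    raise TimeoutError("EHR maintenance window")  # simulated outage


def crm_sink(state: dict) -> None:
    delivered.append(("crm", state["email"]))


progress: Dict[str, dict] = {}
patient = {"name": "Jane Smith", "email": "jane@example.com"}
write_back(patient, {"ehr": ehr_sink, "crm": crm_sink}, progress)
```

After this call the CRM entry is marked `synced` and the EHR entry `failed`. A later call with a working EHR sink retries only the EHR write.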

There is also a throughput problem. Agents may generate scheduling requests, insurance checks, and record updates faster than an external system can accept them. The world model absorbs this mismatch: the patient interaction records intent immediately, and the connector delivers verified writes when the external system is ready.

## Connector Types

| Connector      | Inbound | Outbound | Push-Based        | Scheduled Sync | Auth                           |
| -------------- | ------- | -------- | ----------------- | -------------- | ------------------------------ |
| **EHR**        | Yes     | Yes      | Where supported   | Fallback       | Per adapter                    |
| **FHIR Store** | Yes     | Yes      | Via Subscriptions | Primary        | API key or service credentials |
| **SMART FHIR** | Yes     | Yes      | Via Subscriptions | Primary        | SMART Backend Services         |
| **CRM**        | Yes     | Yes      | Via webhooks      | Primary        | OAuth 2.0                      |
| **REST API**   | Yes     | No       | No                | Primary        | Configurable                   |
| **File Drop**  | Yes     | No       | No                | Primary        | Managed credentials            |
| **Webhook**    | Yes     | No       | Yes               | No             | Signature verification         |

**EHR connectors** use dedicated, EHR-specific adapters. Each adapter handles the target system's authentication, FHIR resource mapping, rate limits, and data format translation. Where the EHR supports event notifications, the adapter receives change events as they happen.

**FHIR Store connectors** connect directly to FHIR R4 stores with typed configuration. Outbound write-back uses optimistic locking to prevent lost updates.

**SMART FHIR connectors** use the same FHIR R4 capabilities as FHIR Store connectors, but authenticate via SMART Backend Services - the standard machine-to-machine auth flow for healthcare APIs. Use this type when the EHR system requires SMART Backend Services authentication.

**CRM connectors** support bidirectional contact and engagement sync. CRM objects such as contacts, companies, and deals map to world model entity types, enabling a unified view of patients across clinical and engagement systems.

**REST connectors** read HTTP endpoints on a schedule with multiple pagination strategies, configurable authentication, and circuit breaker protection. Content-hash deduplication prevents duplicate events from unchanged data.

**File Drop connectors** watch a cloud storage location for new files. They parse CSV, NDJSON, FHIR Bundles, and raw JSON - useful for bulk data imports where a partner drops a file on a schedule.

**Webhook connectors** receive inbound HTTP webhooks from external systems. Events are deduplicated by content hash to handle retries from the sender.

All connector types share common resilience patterns: retries with exponential backoff, circuit breakers that temporarily stop requests to failing sources, dead letter logging for records that cannot be processed, and content-hash deduplication across retries.
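Two of these patterns, exponential backoff and a circuit breaker, are sketched below. The delay base, cap, failure threshold, and cooldown are illustrative assumptions rather than the platform's actual settings, and `CircuitBreaker` is a hypothetical class written for this example.

```python
import random
import time


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Upper bound in seconds on the wait before retry number `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def jittered_delay(attempt: int) -> float:
    """Pick a random wait up to the exponential bound so retries do not synchronize."""
    return random.uniform(0.0, backoff_delay(attempt))


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; allows a trial call after `cooldown` seconds."""

    def __init__(self, threshold: int = 3, cooldown: float = 60.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # time the breaker opened, or None while closed

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True
        return self.clock() - self.opened_at >= self.cooldown

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()
```

A caller would check `allow_request()` before each attempt, sleep for `jittered_delay(attempt)` between retries, and report the outcome with `record_success()` or `record_failure()`.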

### Unification Engine

The unification engine is not a connector itself. It is the transformation layer that all connectors feed into. Raw records from any connector type are mapped to world model events using configurable rules with dot-path field extraction to pull values from arbitrarily nested source data into the event schema. This architecture decouples the transport layer (how data arrives) from the transformation layer (how data is mapped). Adding a new data source requires connector configuration and mapping rules - no custom integration code.
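Dot-path field extraction can be illustrated with a short sketch. The rule format (`{target_field: source_dot_path}`), the `extract` and `apply_mapping` helpers, and the target field names are assumptions for the example; the platform's actual rule syntax may differ.

```python
from typing import Any


def extract(record: Any, path: str, default: Any = None) -> Any:
    """Follow a dot path such as 'name.0.family' through nested dicts and lists."""
    current = record
    for part in path.split("."):
        if isinstance(current, dict) and part in current:
            current = current[part]
        elif isinstance(current, list) and part.isdigit() and int(part) < len(current):
            current = current[int(part)]
        else:
            return default  # path does not exist in this record
    return current


def apply_mapping(record: dict, rules: dict[str, str]) -> dict:
    """Build an event payload from {target_field: source_dot_path} rules."""
    return {target: extract(record, path) for target, path in rules.items()}


raw = {
    "resourceType": "Patient",
    "name": [{"family": "Smith", "given": ["Jane"]}],
    "telecom": [{"system": "phone", "value": "555-0100"}],
}
rules = {
    "last_name": "name.0.family",
    "first_name": "name.0.given.0",
    "phone": "telecom.0.value",
    "birth_date": "birthDate",  # absent in this record, so it maps to None
}
event = apply_mapping(raw, rules)
```

Because the rules are data rather than code, the same `apply_mapping` call works for a record delivered by any transport.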

### Data Freshness

How quickly data appears in the world model depends on the connector type and configuration:

* **Webhook and push-based EHR connectors** - Near real-time. Data arrives within seconds of the external system's event.
* **Scheduled connectors (REST, FHIR Store, CRM)** - Determined by the source configuration.
* **File drop** - Depends on when the file is deposited and processed.

All connectors feed through the same confidence gates before data reaches the world model. Inbound data enters at source-appropriate confidence (1.0 for authoritative system integrations, 0.7 for scraped data) and passes through automated review before becoming available to agents.

<figure><img src="/files/E9JWnP2X3jBjNEJF3ym8" alt="Connector data flow: external systems through connector types to unification engine to world model"><figcaption></figcaption></figure>

## EHR Integration

Clinical data flows in both directions: from the EHR into the world model (so the agent has context during interactions), and from the world model back to the EHR (so information captured during conversations reaches the clinical record after verification). The world model sits between the agent and the EHR as a buffer that absorbs throughput mismatches, quality differences, and availability gaps. The agent reads from and writes to the world model's clean projected state. The connector system handles the messy reality of getting data in and out of external systems at whatever rate they can handle.

### FHIR R4 and SMART Authentication

Both the `fhir_store` and `smart_fhir` connector types share the same underlying FHIR R4 capabilities - the difference is the authentication method. For EHR systems that implement the SMART App Launch specification, the connector supports SMART Backend Services authentication: the standard machine-to-machine auth flow for healthcare APIs.

The platform handles token acquisition, caching, and renewal automatically. Per-resource scoping (e.g., `system/*.read`, `system/Patient.write`) controls which resources the connector can access.

### Bearer Token Exchange

Some APIs use a non-standard token-exchange flow instead of OAuth or static keys. The API requires a workspace-level secret and one or more dynamic, per-request parameters (such as a user identifier) to mint a short-lived bearer token for each call. This pattern is common in healthcare platforms where each API call must be scoped to a specific end user or tenant, but the token issuance mechanism does not follow the OAuth specification.

The `bearer_token_exchange` auth type handles this automatically. At configuration time, the operator provides the exchange endpoint URL, a workspace secret, and a mapping of which request parameters feed which exchange-call headers. At runtime, the connector mints a fresh bearer token for each distinct combination of dynamic parameters, caches it until expiry, and uses per-key locking to prevent concurrent token storms when many requests share the same identity. Cached tokens are keyed by a one-way hash of the dynamic parameters so no identifiers are stored in plaintext.
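The caching behavior described above can be sketched as follows. `TokenCache` and its `mint` callback are hypothetical; the callback stands in for the call to the exchange endpoint and is assumed to return a token with its lifetime in seconds. SHA-256 is an assumed choice of one-way hash.

```python
import hashlib
import threading
import time
from typing import Callable


class TokenCache:
    """Caches one token per distinct set of dynamic parameters."""

    def __init__(self, mint: Callable[[dict], tuple[str, float]], clock=time.monotonic):
        self._mint = mint    # assumed to return (token, ttl_seconds)
        self._clock = clock
        self._tokens: dict[str, tuple[str, float]] = {}  # key -> (token, expires_at)
        self._locks: dict[str, threading.Lock] = {}
        self._guard = threading.Lock()  # protects the lock table itself

    @staticmethod
    def cache_key(params: dict) -> str:
        """One-way hash of the parameters so identifiers are not stored in plaintext."""
        joined = "\x1f".join(f"{k}={params[k]}" for k in sorted(params))
        return hashlib.sha256(joined.encode("utf-8")).hexdigest()

    def _lock_for(self, key: str) -> threading.Lock:
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, params: dict) -> str:
        key = self.cache_key(params)
        # Per-key lock: concurrent callers with the same identity wait for one mint.
        with self._lock_for(key):
            cached = self._tokens.get(key)
            if cached and cached[1] > self._clock():
                return cached[0]
            token, ttl = self._mint(params)
            self._tokens[key] = (token, self._clock() + ttl)
            return token


calls = []


def fake_mint(params: dict) -> tuple[str, float]:
    calls.append(params["user_id"])  # "user_id" is an illustrative dynamic parameter
    return f"token-{len(calls)}", 300.0


cache = TokenCache(fake_mint)
a1 = cache.get({"user_id": "u-1"})
a2 = cache.get({"user_id": "u-1"})  # served from cache, no second mint
b1 = cache.get({"user_id": "u-2"})  # different identity, separate token
```

Locking per key rather than globally means a slow exchange call for one user does not delay token lookups for others.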

The exchange URL is validated against private network ranges at both configuration time and runtime to prevent SSRF. Format templates for the secret header are restricted to a single placeholder to prevent injection.
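A sketch of both checks follows, using only the Python standard library. The function names, the exact set of blocked address classes, and the `{secret}` placeholder name are assumptions for illustration. The caller is expected to pass in the IP addresses the host resolves to, at configuration time and again immediately before each runtime call.

```python
import ipaddress
from string import Formatter
from urllib.parse import urlsplit


def is_blocked_address(host_ip: str) -> bool:
    """True for addresses that should never be reachable from an exchange call."""
    ip = ipaddress.ip_address(host_ip)
    return (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_reserved or ip.is_unspecified or ip.is_multicast
    )


def validate_exchange_url(url: str, resolved_ips: list[str]) -> None:
    """Reject URLs without a host or whose host resolves to any blocked address."""
    if not urlsplit(url).hostname:
        raise ValueError("exchange URL has no host")
    if not resolved_ips:
        raise ValueError("exchange host did not resolve")
    blocked = [ip for ip in resolved_ips if is_blocked_address(ip)]
    if blocked:
        raise ValueError(f"exchange host resolves to blocked address(es): {blocked}")


def validate_secret_template(template: str) -> None:
    """Allow exactly one plain '{secret}' placeholder: no attribute access, conversion, or format spec."""
    fields = [
        (name, spec, conv)
        for _, name, spec, conv in Formatter().parse(template)
        if name is not None
    ]
    if fields != [("secret", "", None)]:
        raise ValueError("template must contain exactly one {secret} placeholder")
```

Repeating the address check at runtime matters because a hostname that resolved to a public address at configuration time can later be pointed at an internal one.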

SMART FHIR data sources can be created and managed entirely through the Developer Console or the Platform API. When you provide a private key during setup, the platform automatically provisions it to secure storage and stores only a reference - the key value is never persisted in the data source configuration or returned in API responses.

For all FHIR-capable systems, the connector provides patient search, resource CRUD with entity cross-referencing, resource history with field-level change tracking, bundle import for bulk data loading, and scoped sync failure investigation. The FHIR API serves a broad set of clinical resource types - including Observation, MedicationStatement, FamilyMemberHistory, and QuestionnaireResponse - with patient-scoped filtering so queries return only resources associated with a specific patient.

### Vendor-Specific Adapters

For EHR systems with non-standard FHIR implementations or proprietary authentication requirements, the connector includes purpose-built adapters that handle vendor-specific quirks transparently. These adapters manage authentication flows, non-standard pagination schemes, adaptive rate limiting, and resource type tiering by change frequency - while feeding into the same world model pipeline as all standard FHIR connectors.

For systems with no API at all, browser-tier tools can automate portal interactions. The content-hash deduplication, entity resolution, and confidence gates work the same way regardless of the underlying connection method.

### Handling External System Limitations

Real-world EHR integrations face problems that no API specification can solve:

* **Browser-only systems** with no API are handled through automated browser sessions that navigate the portal and extract confirmation data while the agent stays on the call.
* **Throughput mismatches** are absorbed by the world model - the agent writes intent as events, and the connector drains the outbound queue at whatever rate the external system accepts.
* **Downtime and crashes** are handled by the reconciliation loop, which catches unsynced events and retries them.
* **Stale or conflicting data** is resolved through the world model's confidence-based resolution.
* **Partial failures** in multi-step workflows are tracked per-step so completed steps are not re-executed.
* **Inconsistent data formats** across EHR systems are normalized by entity resolution and projection functions into a consistent representation.

## Outbound Write-Back

When the connector writes data back to an external system, it reads from the high-quality projected entity state, not from the raw event stream. Individual events may contain errors, low-confidence extractions, or partial information. The projection function filters all of that into a coherent, resolved entity. External systems only see the clean result.
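To make the idea of projection concrete, here is a deliberately simplified sketch. The resolution rule used (highest confidence wins per field, with the most recent event breaking ties) is an assumption for illustration only; the world model's actual resolution rules are defined elsewhere and may differ. `FieldEvent` and `project` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class FieldEvent:
    field: str
    value: str
    confidence: float
    timestamp: int  # any monotonically increasing ordering key


def project(events: list[FieldEvent]) -> dict[str, str]:
    """Collapse an event stream into one value per field.

    Assumed rule: highest confidence wins; the latest timestamp breaks ties.
    """
    best: dict[str, FieldEvent] = {}
    for ev in events:
        current = best.get(ev.field)
        if current is None or (ev.confidence, ev.timestamp) > (current.confidence, current.timestamp):
            best[ev.field] = ev
    return {name: ev.value for name, ev in best.items()}


events = [
    FieldEvent("phone", "555-0100", confidence=1.0, timestamp=1),  # from the EHR
    FieldEvent("phone", "555-0199", confidence=0.5, timestamp=2),  # voice extraction, unreviewed
    FieldEvent("email", "jane@old.example", confidence=1.0, timestamp=1),
    FieldEvent("email", "jane@example.com", confidence=1.0, timestamp=3),
]
state = project(events)
```

In this example the newer but low-confidence phone number does not displace the authoritative one, while the email takes the most recent value among equally trusted events.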

For systems with complex mapping requirements, the connector uses an LLM to translate field values - for example, mapping a patient's description of their insurance to the target system's specific carrier codes. If the LLM is unavailable, a deterministic mapper handles common cases so writes continue during outages.

### Three-Layer Confidence Gates

Before any data is written back to an EHR or external system, it must pass three verification gates, plus human review when required, so that unverified information cannot contaminate systems of record.

| Gate                             | What It Checks                                                                                                                                            |
| -------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Source eligibility**           | Only events from eligible sources are considered. Events originating from external system syncs are excluded to prevent echo loops.                       |
| **Confidence threshold**         | Is the event's confidence level high enough for outbound writes? Voice-extracted data (0.5) does not pass by default; it must be upgraded through review. |
| **Automated review**             | An automated judge evaluates the data against the original transcript or source material for accuracy.                                                    |
| **Human review** (when required) | For data flagged as uncertain or categories requiring human sign-off, an operator reviews and approves or rejects the write.                              |

When outbound writes have dependencies (for example, creating an appointment requires the patient to exist in the EHR first), the dependency check is confidence-aware. If the dependency entity's confidence is below the outbound threshold, the dependency is treated as failed rather than pending.
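The gate sequence and the confidence-aware dependency check can be sketched as below. The threshold value of `0.9`, the source names, and the `Event` fields are assumptions for the example, not the platform's schema or defaults.

```python
from dataclasses import dataclass
from enum import Enum

OUTBOUND_THRESHOLD = 0.9                          # assumed value for illustration
EXTERNAL_SYNC_SOURCES = {"ehr_sync", "crm_sync"}  # hypothetical source names


class Decision(Enum):
    WRITE = "write"
    SKIP = "skip"
    NEEDS_REVIEW = "needs_review"


@dataclass
class Event:
    source: str
    confidence: float
    review_passed: bool = False
    needs_human: bool = False
    human_approved: bool = False


def evaluate(event: Event) -> Decision:
    """Apply the gates in order; the first gate that is not satisfied decides."""
    if event.source in EXTERNAL_SYNC_SOURCES:
        return Decision.SKIP          # source eligibility: avoid echo loops
    if event.confidence < OUTBOUND_THRESHOLD:
        return Decision.NEEDS_REVIEW  # confidence threshold
    if not event.review_passed:
        return Decision.NEEDS_REVIEW  # automated review
    if event.needs_human and not event.human_approved:
        return Decision.NEEDS_REVIEW  # human review, when required
    return Decision.WRITE


def dependency_status(synced: bool, confidence: float) -> str:
    """A dependency below the outbound threshold is failed, not pending."""
    if synced:
        return "satisfied"
    if confidence < OUTBOUND_THRESHOLD:
        return "failed"
    return "pending"
```

Treating a low-confidence dependency as failed avoids leaving a dependent write queued indefinitely behind an entity that cannot be written in its current state.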

{% hint style="danger" %}
Data captured from a phone call never writes directly to the EHR. It always passes through the confidence gates above.
{% endhint %}

### Reconciliation Safety Net

The outbound path includes reconciliation so verified events are not lost when an external system or network path has a transient failure. The common path handles low-latency delivery, and reconciliation handles recovery.

### Autonomous Record Creation

When the agent identifies a new patient during an interaction who does not exist in the external system, the connector can autonomously create the record. It assembles the required fields from world model events and submits the creation request to the appropriate adapter, handling field validation, format requirements, and error recovery.

## Review Loop

Events that were written at lower confidence levels, such as voice extraction or agent inference, pass through an automated review pipeline that evaluates accuracy and completeness. Review processing deduplicates at the entity level so a burst of related events does not create duplicate human work. Events that pass review have their confidence upgraded. Events that fail are flagged for human review through the [review queue](/data/review-queue.md).

## Entity Resolution

When data arrives from different systems, it often refers to the same real-world person, location, or appointment using different identifiers. Entity resolution matches these records together using a two-tier approach.

**Tier 1 (deterministic)** matches on exact identifiers - canonical FHIR IDs, phone numbers, email addresses, name + date of birth for patients, NPI or name + specialty for practitioners.

**Tier 2 (fuzzy)** applies when exact matching finds nothing. It uses approximate strategies: token-level name similarity, partial phone matching (last seven digits), and name + zip code combinations. Fuzzy matches require high similarity thresholds to prevent false merges.

For example: a patient might appear as "Jane Smith, DOB 03/15/1982" in the EHR and as a phone number in the voice system. Entity resolution determines these refer to the same person and links their events to a single patient entity. If the EHR record has a slight name variation ("J. Smith") or a different phone format, the fuzzy tier catches the match that exact comparison would miss.
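The two tiers can be sketched with a small matcher. The record fields, the use of Jaccard similarity over name tokens, and the `0.8` threshold are assumptions for illustration; a production matcher would likely combine several signals before merging.

```python
import re


def digits(phone: str) -> str:
    """Strip everything except digits so formatting differences disappear."""
    return re.sub(r"\D", "", phone or "")


def name_tokens(name: str) -> set[str]:
    return set(re.findall(r"[a-z]+", (name or "").lower()))


def name_similarity(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two names (0.0 to 1.0)."""
    ta, tb = name_tokens(a), name_tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def resolve(a: dict, b: dict, threshold: float = 0.8) -> str:
    """Return 'deterministic', 'fuzzy', or 'none' for a pair of records."""
    pa, pb = digits(a.get("phone")), digits(b.get("phone"))
    ea, eb = (a.get("email") or "").lower(), (b.get("email") or "").lower()
    na, nb = (a.get("name") or "").lower(), (b.get("name") or "").lower()

    # Tier 1: exact identifiers.
    if pa and pa == pb:
        return "deterministic"
    if ea and ea == eb:
        return "deterministic"
    if na and na == nb and a.get("dob") and a.get("dob") == b.get("dob"):
        return "deterministic"

    # Tier 2: approximate strategies, tried only when Tier 1 finds nothing.
    if len(pa) >= 7 and len(pb) >= 7 and pa[-7:] == pb[-7:]:
        return "fuzzy"
    if a.get("zip") and a.get("zip") == b.get("zip") and name_similarity(na, nb) >= threshold:
        return "fuzzy"
    return "none"


ehr = {"name": "Jane Smith", "dob": "1982-03-15", "phone": "(415) 555-0100", "zip": "94110"}
voice = {"phone": "+1 415-555-0100"}  # country code added, different punctuation
match = resolve(ehr, voice)
```

Here the full digit strings differ because of the country code, so the exact tier misses, and the last-seven-digits comparison in the fuzzy tier links the two records.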

When a workspace has multiple data sources, entity resolution also performs **cross-source merge detection**. If a patient entity from the EHR matches a patient entity from the CRM, the system creates a reversible link between them and unifies their projected state. This means a single patient view reflects data from every connected system, not just one.

## Outbound Dispatch

The connector system also handles scheduled outbound interactions. When the system needs to contact a patient, each outbound task carries the patient context, interaction purpose, and priority. The platform evaluates which tasks are due and initiates interactions with the relevant patient context pre-loaded.

Outbound tasks are stored as entities in the world model. They are created by scheduling rules, follow-up workflows, or manual triggers, and their projections track status, priority, attempt count, retry timing, and call outcome.

## Gap Scanner

The gap scanner proactively identifies missing data across entities and creates [surfaces](/channels/surfaces.md) to collect it. It checks entity state against configurable requirements (for example, "patients with upcoming appointments must have insurance information") and generates data collection forms for any gaps found.
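A requirement check of this kind might be sketched as follows. The `Requirement` structure, the entity field names, and the `find_gaps` helper are hypothetical; the real configuration is managed through the Platform API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Requirement:
    name: str
    applies_to: Callable[[dict], bool]  # which entities the rule covers
    required_fields: list[str]          # fields that must be non-empty


def find_gaps(entity: dict, requirements: list[Requirement]) -> dict[str, list[str]]:
    """Return {requirement_name: [missing fields]} for every unmet requirement."""
    gaps: dict[str, list[str]] = {}
    for req in requirements:
        if not req.applies_to(entity):
            continue
        missing = [f for f in req.required_fields if not entity.get(f)]
        if missing:
            gaps[req.name] = missing
    return gaps


insurance_rule = Requirement(
    name="insurance_before_appointment",
    applies_to=lambda e: e.get("type") == "patient" and bool(e.get("has_upcoming_appointment")),
    required_fields=["insurance_carrier", "insurance_member_id"],
)

patient = {
    "type": "patient",
    "has_upcoming_appointment": True,
    "insurance_carrier": "Example Health",
    "insurance_member_id": "",
}
gaps = find_gaps(patient, [insurance_rule])
```

The list of missing fields is what a data collection form would then ask for.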

Appointment detection reads current event state directly, so appointment updates such as cancellations and rescheduling are reflected without waiting for stale data to age out.

Gap scanning is configured per workspace through the Platform API. See [Surfaces - Automated Gap Detection](/channels/surfaces.md#automated-gap-detection) for details.

## Pipeline Observability

The connector system tracks operational health across data source connections. Operations teams get visibility into pipeline status without waiting for sync failures to surface.

| Metric                     | What It Tracks                                                                                                                     |
| -------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- |
| **Overall status**         | Whether the pipeline is healthy, degraded, or starting up                                                                          |
| **Per-source sync health** | Recent sync time, duration, event count, errors, and connection status for each data source                                        |
| **Pipeline states**        | Current state of ingestion, review, outbound delivery, and reconciliation                                                          |
| **Connection health**      | Consecutive error tracking per source - a source is marked unhealthy after repeated failures and recovers automatically on success |

### Data Source Freshness

Each data source connection reports a freshness category based on how recently it last ingested data:

| Category  | Meaning                                                                                       |
| --------- | --------------------------------------------------------------------------------------------- |
| **Fresh** | Data arrived within the last five minutes. The source is actively producing events.           |
| **Stale** | Last ingestion was between five and sixty minutes ago. The source may be experiencing delays. |
| **Quiet** | No data in over an hour. The source may be down, or there may simply be no new records.       |
| **Never** | No data has ever been ingested from this source.                                              |

Freshness is computed from the timestamp of the most recent ingested event, not from the connector's poll schedule. A source that polls every 15 minutes but has no new data reports its freshness based on when the last actual event arrived, not when the last poll ran. This means freshness accurately reflects whether data is flowing, not just whether the connector is running.
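The categories in the table map naturally onto a small classification function. The sketch below is an illustration: the function name and the handling of the exact five-minute and sixty-minute boundaries are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def freshness(last_event_at: Optional[datetime], now: datetime) -> str:
    """Classify a source by the age of its most recent ingested event."""
    if last_event_at is None:
        return "never"
    age = now - last_event_at
    if age <= timedelta(minutes=5):
        return "fresh"
    if age <= timedelta(minutes=60):
        return "stale"
    return "quiet"


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
# The connector may have polled one minute ago, but the last event arrived 40 minutes ago.
category = freshness(now - timedelta(minutes=40), now)
```

The input is the last event time rather than the last poll time, which is why a source that polls successfully but returns nothing new still ages from fresh to stale to quiet.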

Per-source ingestion rates (events per minute, per hour, and over the last 24 hours) give operations teams a quantitative view of throughput alongside the categorical freshness indicator.

### Sensing-to-Action Latency

The platform measures the end-to-end time from when a data change is detected in a source system to when the agent acts on it. This is the latency a patient experiences between, say, a lab result being posted in the EHR and the agent calling the patient about it.

The latency distribution is presented as an hourly sparkline with count and median values, so operations teams can spot degradation trends before they affect patient experience. A spiky latency profile often indicates external system slowdowns or queue backups, while a gradually rising profile suggests growing data volume outpacing processing capacity.
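The hourly count and median behind such a sparkline could be computed as in this sketch. The input shape (pairs of detection and action timestamps) and bucketing by the hour of detection are assumptions for the example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def hourly_latency(samples: list[tuple[datetime, datetime]]) -> list[dict]:
    """Group (detected_at, acted_at) pairs by detection hour; report count and median seconds."""
    buckets: dict[datetime, list[float]] = defaultdict(list)
    for detected_at, acted_at in samples:
        hour = detected_at.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append((acted_at - detected_at).total_seconds())
    return [
        {"hour": hour, "count": len(values), "median_seconds": median(values)}
        for hour, values in sorted(buckets.items())
    ]


samples = [
    (datetime(2024, 1, 1, 10, 5), datetime(2024, 1, 1, 10, 5, 30)),   # 30 s
    (datetime(2024, 1, 1, 10, 20), datetime(2024, 1, 1, 10, 21, 30)), # 90 s
    (datetime(2024, 1, 1, 10, 40), datetime(2024, 1, 1, 10, 41)),     # 60 s
    (datetime(2024, 1, 1, 11, 10), datetime(2024, 1, 1, 11, 12)),     # 120 s
]
series = hourly_latency(samples)
```

The median is less sensitive than the mean to a few very slow events, so it shows a sustained shift in typical latency more clearly.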

The Platform API exposes pipeline observability through read-only endpoints that power the pipeline dashboard: pipeline status, source listing with live health, source event history, outbound sync summaries, entity resolution metrics, review queue depth, and throughput time series. The dashboard degrades gracefully - if the connector system is temporarily unavailable, the dashboard still shows database-backed metrics without live loop status.

{% hint style="info" %}
Pipeline health data feeds the analytics dashboard. See [Data Quality Analytics](/intelligence-and-analytics/intelligence.md) for the dashboard metrics. For API endpoints and integration details, see the [Connector Runner](https://docs.amigo.ai/developer-guide/platform-api/platform-api/connector-runner) and [FHIR](https://docs.amigo.ai/developer-guide/platform-api/platform-api/fhir) sections of the developer guide.
{% endhint %}

