World Model

Event-sourced data core that absorbs messy healthcare data, scores it by confidence, and projects clean unified entity state.

Every capability on the platform - voice agents, connector runner, operator workflows, outbound campaigns, analytics - depends on one thing: a shared, trustworthy picture of reality. The world model is that picture. It is not a database the agent queries. It is closer to memory. The agent knows its patient the way you know your own name - automatically, without effort, without a lookup step.

This collapses a boundary that most healthcare systems treat as fundamental: the line between the "data layer" and the "intelligence layer." In traditional architectures, data sits in a store and intelligence sits in an application that queries the store. In the world model, context flows into the agent automatically, and the agent's reasoning flows back as structured events. Data and intelligence are the same loop.

The architectural consequence is that dirty data becomes useful. The world model does not clean data at ingestion. It accepts raw events - EHR feeds, voice transcripts, browser scrapes, manual imports - each tagged with provenance and confidence. Clean entity state is not a precondition; it is a computed output. A deterministic projection function reads all events for an entity and produces the current picture. The centralized, high-quality core is what allows the messy periphery to work. You do not need perfect inputs. You need a system that can compute the right answer from imperfect ones.

This is an event-sourced architecture. If you have worked with event sourcing in other systems, the principles are the same. If you have not, the core idea is straightforward: instead of updating records in place, you append facts to a log. The current state of any entity is derived by replaying the relevant facts.
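The core idea can be sketched in a few lines. This is a minimal illustration, not the platform's implementation: the `Event` and `EventLog` names and the field layout are assumptions made for the example, and the replay deliberately ignores the confidence ranking described later on this page, so a later fact simply overrides an earlier one.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)  # frozen: a recorded fact cannot be mutated afterwards
class Event:
    entity_id: str
    field: str
    value: Any


class EventLog:
    """Hypothetical in-memory log. Facts are appended, never updated in place."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def events_for(self, entity_id: str) -> tuple[Event, ...]:
        return tuple(e for e in self._events if e.entity_id == entity_id)

    def replay(self, entity_id: str) -> dict[str, Any]:
        # Current state is derived by replaying the entity's facts in order.
        state: dict[str, Any] = {}
        for event in self.events_for(entity_id):
            state[event.field] = event.value
        return state


log = EventLog()
log.append(Event("patient-1", "phone", "555-0100"))
log.append(Event("patient-1", "pharmacy", "Main Street"))
log.append(Event("patient-1", "phone", "555-0199"))

state = log.replay("patient-1")
```

After the third append the log still holds all three facts. Only the derived `state` reflects the newer phone number.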

Why Event Sourcing for Healthcare

Healthcare data is fundamentally unreliable. The world model's architecture absorbs this mess and turns it into something systems can rely on.

Most clinical data is low quality. Outside of billing, revenue cycle management, and some operational data, the information in healthcare systems is far from clean. Clinical notes are unstructured free text. Scribe outputs vary in accuracy depending on the model, the audio quality, and the complexity of the encounter. EHR inputs are frequently copy-pasted templates carried forward from visit to visit with minor edits, making it hard to distinguish current facts from stale ones. Billing and RCM data is structured because money depends on it. Clinical data does not have the same forcing function, and the quality reflects that.

Inbound data from patients is not trustworthy by default. Callers give wrong dates of birth, confuse medication names, misremember their doctor's name, or provide incomplete details. Some calls are pranks. Some are from people who are confused, stressed, or in pain. You cannot treat patient-provided information as verified fact. It is input that needs to be scored, compared against existing records, and promoted or discarded based on corroboration.

External systems have uneven reliability and throughput. EHRs, FHIR stores, practice management systems, and insurance verification services all behave differently. Response times vary. A write that succeeds on Monday might time out on Tuesday. Some systems return stale cached data. Others silently drop updates. Any architecture that assumes external systems are consistent and available will fail in production.

Traditional record-update approaches break down here. If you update records in place, you lose the trail of what the system believed and when. When a downstream write fails, you have no clean way to know what state you were trying to reach. When two sources disagree, the last write wins by accident, not by policy. Event sourcing with confidence scoring is the architectural answer: every fact is tagged with where it came from, how much to trust it, and what it supersedes. Nothing is overwritten. The full history is always available for replay, audit, or correction.

Four Invariants

The world model enforces four rules that never change regardless of how the system evolves.

1. Events Are the Only Source of Truth

There is no way to modify an entity's state directly. The only way to change what the system believes about a patient, appointment, or any other entity is to insert a new event. The entity's state is then recomputed from all of its events.

This eliminates a class of problems common in healthcare IT: two systems updating the same record concurrently, with the last write silently overwriting the first. In the world model, both writes are preserved as separate events. The projection function determines the current state by confidence ranking, with recency used only to break ties between events of equal confidence, never by the accident of write order.

2. Events Are Append-Only and Immutable

Once an event is written, it is never modified or deleted. If new information contradicts an earlier event, a new event is created that supersedes the old one. The old event remains in the log permanently.

This is what makes dirty data tractable. You do not have to get it right the first time. Record what you learned. Learn more later. Newer events supersede older ones. Both are preserved - the original for audit, the correction for current state. In a domain where information arrives incomplete, out of order, and frequently wrong, immutability turns a liability into an asset: every mistake is recoverable, every correction is traceable, and no update can silently destroy what came before.

This matters for healthcare operations because it provides:

  • Audit trails - You can always answer "why did the system believe X at time Y?"

  • Temporal queries - You can reconstruct the state of any entity at any point in the past

  • Undo capability - Reversing a decision means inserting a new event, not deleting the old one
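The three properties above can be illustrated with a small sketch. Everything here is an assumption made for the example: the `supersedes` field, the `state_as_of` helper, and the field names are hypothetical, and confidence is left out to keep the focus on immutability.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


@dataclass(frozen=True)
class Event:
    event_id: str
    field: str
    value: Any
    recorded_at: datetime
    supersedes: Optional[str] = None  # id of the event this one corrects


def state_as_of(events: list[Event], as_of: datetime) -> dict[str, Any]:
    """Temporal query: rebuild state using only events recorded up to `as_of`."""
    visible = [e for e in events if e.recorded_at <= as_of]
    superseded = {e.supersedes for e in visible if e.supersedes is not None}
    state: dict[str, Any] = {}
    for e in sorted(visible, key=lambda e: e.recorded_at):
        if e.event_id not in superseded:
            state[e.field] = e.value
    return state


t1, t2, t3 = datetime(2024, 1, 1), datetime(2024, 1, 2), datetime(2024, 1, 3)
log = [
    Event("e1", "date_of_birth", "1980-01-02", t1),
    # Correction: a new event supersedes e1, and e1 stays in the log.
    Event("e2", "date_of_birth", "1980-02-01", t2, supersedes="e1"),
    # Undo: reversing the correction is one more event, not a delete.
    Event("e3", "date_of_birth", "1980-01-02", t3, supersedes="e2"),
]

before = state_as_of(log, t1)
corrected = state_as_of(log, t2)
undone = state_as_of(log, t3)
```

Asking "what did the system believe on January 2?" is a call to `state_as_of(log, t2)`. The log itself is never rewritten to answer it.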

3. Entity State Is a Pure Function of Events

Given the same set of events, the system always produces the same entity state. The projection function is deterministic. Multiple processes can trigger recomputation concurrently and they will always arrive at the same result, because the function reads all current events and writes the output atomically.

This determinism is also what makes outbound reliability possible. When writing back to an EHR, the system does not send raw event data. It reconstructs the complete entity state from all events, then translates that projection into the EHR's format. Noisy incoming data does not propagate backward. A patient's phone number might arrive through a voice call at 0.5 confidence, get corroborated by an EHR lookup at 1.0 confidence, and get projected into a clean, authoritative record. That projected record - not the noisy inputs - is what flows to the downstream system.
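A sketch of what "pure function of events" means in practice follows. The `project`, `rank`, and `to_ehr_payload` names, the logical timestamps, and the outbound payload shape are all assumptions for illustration. The point is that the result depends only on the set of events, not on the order in which they are read.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str
    field: str
    value: str
    confidence: float
    recorded_at: int  # logical timestamp, assumed for illustration


def rank(e: Event) -> tuple[float, int, str]:
    # Total order: confidence first, then recency, then id as a final tie-break.
    return (e.confidence, e.recorded_at, e.event_id)


def project(events: list[Event]) -> dict[str, str]:
    """Deterministic projection: same events in, same state out."""
    winners: dict[str, Event] = {}
    for e in events:
        current = winners.get(e.field)
        if current is None or rank(e) > rank(current):
            winners[e.field] = e
    return {name: winners[name].value for name in sorted(winners)}


def to_ehr_payload(state: dict[str, str]) -> dict[str, str]:
    # Hypothetical outbound format. Only the projection is translated.
    return {f"patient.{name}": value for name, value in state.items()}


events = [
    Event("e1", "phone", "555-0100", 0.5, 1),  # heard on a voice call
    Event("e2", "phone", "555-0100", 1.0, 2),  # corroborated by an EHR lookup
    Event("e3", "phone", "555-0111", 0.5, 3),  # later, unverified voice input
]

baseline = project(events)
order_independent = all(
    project(random.Random(seed).sample(events, len(events))) == baseline
    for seed in range(10)
)
payload = to_ehr_payload(baseline)
```

Because `rank` is a total order over events with unique ids, every shuffle of the input projects to the same state, and the unverified later phone number never reaches the outbound payload.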

4. Confidence Resolves Conflicts

When two sources provide conflicting information about the same fact (for example, a voice transcript says a patient's pharmacy is on Main Street, but the EHR record says Oak Avenue), the system does not use timestamps to pick a winner. Instead, it uses a confidence ranking based on the source. This is not just an engineering preference - it reflects the reality that most information entering a healthcare system has unknown reliability until it has been verified against another source.

| Confidence | Source | Example |
| --- | --- | --- |
| 1.0 | Authoritative API | Direct EHR integration, verified system writes |
| 0.7 | Browser scrape | Portal data captured through UI automation |
| 0.5 | Voice | Data extracted from a phone conversation |
| 0.3 | Agent inference | Data the agent derived or inferred from context |

Within the same confidence class, the most recent event wins. Across confidence classes, higher confidence always wins regardless of recency. This means a verified EHR record will not be overwritten by something a caller mentioned on a phone call, but two consecutive EHR updates will resolve to the most recent one.
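The resolution rule can be written as a single comparison. The sketch below uses the confidence values from the table. The `Claim` type, the source keys, and the integer timestamps are assumptions made for the example.

```python
from dataclasses import dataclass

# Confidence per source class, taken from the table above.
SOURCE_CONFIDENCE = {
    "authoritative_api": 1.0,
    "browser_scrape": 0.7,
    "voice": 0.5,
    "agent_inference": 0.3,
}


@dataclass(frozen=True)
class Claim:
    source: str
    value: str
    recorded_at: int  # logical timestamp, assumed for illustration


def resolve(claims: list[Claim]) -> Claim:
    """Higher confidence wins; recency only breaks ties within a class."""
    return max(claims, key=lambda c: (SOURCE_CONFIDENCE[c.source], c.recorded_at))


# A later voice mention does not displace the EHR record.
pharmacy = resolve([
    Claim("authoritative_api", "Oak Avenue", recorded_at=1),
    Claim("voice", "Main Street", recorded_at=2),
])

# Two EHR updates resolve to the most recent one.
address = resolve([
    Claim("authoritative_api", "12 Elm Road", recorded_at=1),
    Claim("authoritative_api", "48 Pine Court", recorded_at=5),
])
```

Sorting on the tuple `(confidence, recorded_at)` encodes both halves of the rule: the timestamp is only consulted when the confidence values are equal.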

Three Data Channels

The boundary between the agent and its data is not a query interface. It is three distinct channels, each reflecting a different relationship between the agent and what it knows.

Ambient

Data that is pushed into the agent's context automatically at the start of each conversation turn. The agent does not request this data; it is always present. Examples: patient name, upcoming appointments, recent encounter history.

This is the channel that dissolves the line between infrastructure and intelligence. The agent does not look up the patient's next appointment - it already knows, the same way a receptionist who has worked at a clinic for ten years knows. Ambient context is what makes the agent sound informed from the first moment of a call. When a patient calls and the agent says "I see you have an appointment this Thursday," no query ran. That information was already part of the agent's state.

Queried

Data that the agent retrieves on demand through tool calls during a conversation. The agent decides it needs specific information and requests it. Examples: searching for available appointment slots, looking up insurance details, checking medication lists.

Queried data covers information that is too large or too specific to include in every conversation turn. The agent pulls it when the conversation requires it. This is the traditional "application queries database" pattern, but scoped narrowly: most of what the agent needs arrives through the ambient channel. Queries handle the long tail.
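The difference between the ambient and queried channels can be sketched as follows. This is an illustration under stated assumptions, not the platform's interface: `ambient_context`, `build_turn`, the `search_slots` tool, and its canned calendar data are all hypothetical.

```python
from typing import Any, Callable


def ambient_context(patient_state: dict[str, Any]) -> str:
    """Ambient channel: projected state is rendered into every turn, unrequested."""
    lines = [f"- {key}: {value}" for key, value in sorted(patient_state.items())]
    return "What you already know about this patient:\n" + "\n".join(lines)


def search_slots(day: str) -> list[str]:
    """Queried channel: a hypothetical tool the agent calls only when needed."""
    calendar = {"Thursday": ["09:00", "14:30"]}  # canned data for the example
    return calendar.get(day, [])


TOOLS: dict[str, Callable[..., Any]] = {"search_slots": search_slots}


def build_turn(patient_state: dict[str, Any], utterance: str) -> dict[str, Any]:
    # The agent receives the context without asking, while tools are merely offered.
    return {
        "context": ambient_context(patient_state),
        "user": utterance,
        "tools": sorted(TOOLS),
    }


turn = build_turn(
    {"name": "Jane Doe", "next_appointment": "Thursday 09:00"},
    "Can I move my appointment?",
)
slots = TOOLS["search_slots"]("Thursday")
```

The appointment is already in `turn["context"]` before the agent does anything. The slot search runs only because this particular conversation needs it.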

Extracted

Data that the agent captures from the conversation and writes back to the world model as a natural consequence of thinking. When a patient provides new information during a call - a new phone number, an insurance change, a medication update - the agent writes that as an event with voice-level confidence (0.5).

During a live voice call, the system periodically extracts structured patient data from the conversation. Every few turns, an LLM reviews the recent transcript and captures key fields - phone numbers, dates of birth, email addresses, insurance carrier and member IDs, addresses - writing them to the world model as events with voice-level confidence (0.5). These extractions follow FHIR field normalization. This means the world model accumulates structured data in real time during the call, not just after it ends.

This is intelligence producing data, not just consuming it. The agent's reasoning generates structured facts as a byproduct of doing its job. There is no separate "data capture" step. Understanding the patient and capturing information are the same act.

Extracted data does not go directly to the EHR. It enters the world model, flows through the connector runner's confidence gates and review pipeline, and only reaches the EHR after verification. This is covered in detail in the EHR Integration section.
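The extraction step described above might look roughly like the sketch below. It assumes the LLM returns a flat dictionary of fields. The `extraction_to_events` helper, the normalization rules, and the field-name mapping (loosely modeled on FHIR Patient element names such as `birthDate` and `telecom`) are illustrative assumptions; the actual normalization is not specified on this page.

```python
import re
from dataclasses import dataclass

VOICE_CONFIDENCE = 0.5  # voice-level confidence, as stated above

# Hypothetical mapping from extracted keys to normalized field names.
FIELD_NAMES = {
    "phone": "telecom.phone",
    "email": "telecom.email",
    "date_of_birth": "birthDate",
}


@dataclass(frozen=True)
class Event:
    entity_id: str
    field: str
    value: str
    source: str
    confidence: float


def normalize(field: str, value: str) -> str:
    if field == "phone":
        return re.sub(r"\D", "", value)  # keep digits only
    if field == "email":
        return value.strip().lower()
    return value.strip()


def extraction_to_events(entity_id: str, extracted: dict[str, str]) -> list[Event]:
    """Turn one extraction pass into voice-confidence events."""
    events = []
    for field, raw in extracted.items():
        if not raw.strip():
            continue  # nothing was heard for this field
        name = FIELD_NAMES.get(field, field)  # unknown fields keep their own name
        events.append(
            Event(entity_id, name, normalize(field, raw), "voice", VOICE_CONFIDENCE)
        )
    return events


events = extraction_to_events(
    "patient-1",
    {"phone": "(555) 010-0199", "email": " Jane@Example.com ", "date_of_birth": ""},
)
```

Each extracted field becomes its own event tagged with source and confidence, so later corroboration or contradiction can be resolved by the projection rather than at capture time.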

Open Schema

Traditional healthcare systems force data into fixed schemas - FHIR resources, HL7 segments, proprietary EHR tables. If information does not fit a predefined category, it gets shoehorned, truncated, or dropped. The schema is a constraint on what the system can know.

The world model inverts this. Entity types and event types are free-form text, not fixed enums. If the system discovers a new kind of entity or observation that does not fit existing categories, it creates a new type without requiring a database migration or schema change. Data arrives in any form, and the system structures it. The schema is not a constraint on intelligence - it is an output of intelligence.

This means agents discover structure rather than being limited by it. A conversation might reveal a relationship between a patient and a caregiver that no predefined schema anticipated. The world model accommodates it immediately. Over time, patterns in these emergent types can be formalized, but they are never blocked from entering the system in the first place.
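A sketch of what free-form types allow: the event shape, the type strings, and the `type_census` helper below are assumptions made for the example. Note that no enum or registry has to be extended before a new entity type appears in the log.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Event:
    entity_type: str  # free-form text, not a fixed enum
    entity_id: str
    event_type: str  # free-form text as well
    payload: dict[str, Any]


log: list[Event] = [
    Event("patient", "p-1", "phone_updated", {"phone": "555-0100"}),
    # A type nobody predefined: it enters the log without a schema change.
    Event(
        "caregiver_relationship",
        "rel-1",
        "relationship_observed",
        {"patient": "p-1", "caregiver": "daughter"},
    ),
    Event(
        "caregiver_relationship",
        "rel-2",
        "relationship_observed",
        {"patient": "p-2", "caregiver": "neighbor"},
    ),
]


def type_census(events: list[Event]) -> Counter:
    """Count entity types in use, a starting point for formalizing emergent ones."""
    return Counter(e.entity_type for e in events)


census = type_census(log)
```

A census like this is one way recurring emergent types could be spotted and later formalized, without ever blocking them from entering the system.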

For implementation details on how events are written, how entity state is projected, and how write scope isolation works, see World Model Internals.

Developer Guide - For API endpoints, SDK examples, and integration details, see the Data & World Model section of the developer guide.
