# Memory

Memory is workspace-scoped with configurable dimensions and extraction modes (static or LLM). Extracted facts are materialized into pre-computed tables for fast query access. The layered architecture (L0-L3) described below applies across all channels.

{% hint style="success" %}
**For Developers**: See the [REST API reference](https://docs.amigo.ai/developer-guide/reference/memory-architecture) for endpoint details, request/response schemas, and SDK code examples.
{% endhint %}

## The Memory Problem

Most memory systems either store everything (wasting resources) or use generic importance scoring (missing critical details). Neither approach works for high-stakes decisions.

Amigo's memory system keeps enough user context for decisions readily accessible during conversations. When a patient mentions chest tightness, the system surfaces their heart condition history, anxiety patterns, and medication context without waiting for ad-hoc retrieval. This minimizes latency while maintaining the information needed for real-time reasoning.

The result: healthcare decisions that properly account for how current symptoms connect to medical history, medication interactions, family patterns, and past treatment responses.

## Layered Memory Architecture

Memory is organized in four layers. Each layer compresses what came before while preserving what matters for outcomes.

{% @mermaid/diagram content="flowchart TD
L0\[L0: Raw Transcripts] -->|Extract new information| L1\[L1: Extracted Memories]
L1 -->|Synthesize episodes| L2\[L2: Episodic User Model]
L2 -->|Merge across episodes| L3\[L3: Global User Model]
L3 -.->|Guides extraction| L0
L3 -.->|Anchors synthesis| L1" %}

| Layer                         | What It Holds                                 | Role                                                                         |
| ----------------------------- | --------------------------------------------- | ---------------------------------------------------------------------------- |
| **L0 - Raw Transcripts**      | Complete conversation records                 | Ground truth for historical review and source material for extraction        |
| **L1 - Extracted Memories**   | Net-new facts from each conversation          | What changed since the last session, filtered by what L3 already knows       |
| **L2 - Episodic User Models** | Patterns synthesized across multiple sessions | Recurring structures that emerge over weeks or months                        |
| **L3 - Global User Model**    | Stable understanding merged across all time   | Always in memory during live sessions - the agent reasons from this directly |

L3 is the layer that matters most at runtime. It stays loaded in the agent's context during every session, so the agent never needs to retrieve patient history on the fly. When a patient mentions jaw pain, L3 already contains their cardiac history - jaw pain can be an atypical symptom of a heart attack, and the agent catches that connection immediately because the relevant context is already present.

<details>

<summary><strong>Example: How cross-domain connections surface</strong></summary>

A patient complains about jaw pain. A generic system searches for "jaw pain" and returns dental records. Amigo's memory works differently:

1. L3 already holds the patient's cardiac history, anxiety patterns, and medication list
2. The agent interprets jaw pain against this full context - recognizing it as a potential cardiac symptom
3. The agent's clinical identity shapes which connections matter most
4. Temporal patterns (is this new? recurring? correlated with stress?) are available without retrieval

The result: the agent asks about chest pressure and shortness of breath rather than suggesting a dentist visit.

</details>

<details>

<summary><strong>Example: Dimensional framework configuration</strong></summary>

Each dimension defines a category of information with precision requirements. This is what a dimension definition looks like:

{% code overflow="wrap" %}

```json
{
  "description": "Medical & Health History: Current and past health conditions, hormonal and metabolic profiles, treatment experiences, and medication adherence that provide context to the client's physical wellbeing.",
  "tags": ["health", "clinical", "medical history"],
  "precision_required": "perfect"
}
```

{% endcode %}

Dimensions requiring "perfect" precision are those where errors directly affect clinical outcomes - medication interactions can be life-threatening, so medical history must never be forgotten or miscontextualized. Other dimensions like exercise preferences might only need periodic updates.
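As a rough sketch of how a precision tier might translate into behavior - the tier names come from the config above, but the policy fields and function are illustrative assumptions, not Amigo's actual API:

```python
from dataclasses import dataclass

@dataclass
class UpdatePolicy:
    allow_overwrite: bool   # may newer facts replace older ones?
    require_audit: bool     # must every change carry provenance?

# Hypothetical mapping from precision tier to update policy.
POLICIES = {
    "perfect": UpdatePolicy(allow_overwrite=False, require_audit=True),
    "standard": UpdatePolicy(allow_overwrite=True, require_audit=True),
    "periodic": UpdatePolicy(allow_overwrite=True, require_audit=False),
}

def policy_for(precision_required: str) -> UpdatePolicy:
    # Unknown tiers fall back to the strictest policy.
    return POLICIES.get(precision_required, POLICIES["perfect"])
```

The key design point: "perfect" dimensions never overwrite history, so medical facts can be recontextualized but not lost.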

</details>

## Key Features of Amigo's Memory System

<details>

<summary><strong>1. Recent Information Guarantee</strong></summary>

Amigo guarantees that recent information (the last *n* sessions, where *n* is tuned to the information-decay profile of the use case) is always available for:

* Full reasoning over the complete context
* Perfect recall of all details
* Recontextualization based on new understanding

This addresses the information decay problem common in memory systems.

</details>

<details>

<summary><strong>2. On-Demand Retrieval (Rare)</strong></summary>

When the agent needs information not in L3 (rare, adds latency):

1. **L3 covers most cases**: L3 supplies the full patient context at session start, so on-demand retrieval is uncommon
2. **Targeted queries**: The agent retrieves only the specific missing data rather than running broad searches
3. **Recontextualized Understanding**: Past L0 conversations are recontextualized against the current L3 understanding, enabling reasoning beyond simple retrieval
4. **Professional Context Filtering**: The service provider's professional background determines which gaps are meaningful enough to warrant historical expansion
5. **Temporal Synthesis**: L3 bridges live session context with historical L0 context through a dual anchoring mechanism

</details>

<details>

<summary><strong>3. Dimensional Evolution</strong></summary>

Amigo adapts its memory dimensions as patient patterns emerge:

**Core Capabilities:**

1. Professional identity guiding interpretation at every level of the memory hierarchy
2. System evolution of attention patterns based on discovered patient patterns
3. **Adaptive Dimensional Optimization**: When the system detects drift between user dimension definitions and the optimal interpretation patterns for a patient group, it can modify the dimensional definitions and perform a complete temporal backfill

**Advanced features:**

4. **Replay-based reinterpretation**: When dimensions change, the system can reprocess historical data through the updated definitions - regenerating L0 through L3 with the new framework
5. **Cohort reinterpretation**: Dimension changes apply across entire patient groups, not just individual patients
6. **Feedback loops**: L3 patterns inform how L0/L1 extraction works, so the system refines what it captures based on what turns out to matter

**Concrete Example: Discovering Hidden Patterns**

Consider a patient whose blood sugar control seems randomly unstable:

* **Week 1-4 (L1 extraction)**: System captures seemingly unrelated mentions - work deadlines Tuesday, feeling stressed Thursday, missed medication Friday. Each seems like noise.
* **Month 2-3 (L2 accumulation)**: Patterns emerge from accumulated L1 data. A 2-3 week cycle appears: work stress -> medication timing disruption -> blood sugar instability.
* **Quarter 1-3 (L3 cross-episode analysis)**: Comparing multiple quarterly episodes reveals this isn't random - it's a stable functional dimension. The stress-medication-timing interaction becomes part of L3's dimensional blueprint.
* **Result**: What looked like random instability is actually a discoverable pattern. Now the system can proactively intervene when work stress patterns emerge, preventing blood sugar episodes.

This discovery was only possible because:

1. L1 captured seemingly irrelevant details (unfiltered extraction)
2. L2 aggregated over sufficient time for patterns to emerge (temporal aggregation)
3. L3 identified the pattern across multiple episodes (cross-episode analysis)

At population scale, only 10-50 such functional dimensions typically explain substantial outcome variance. The sparsity isn't imposed - it emerges as true causal patterns become visible while noise averages out.

</details>

<details>

<summary><strong>4. Enterprise Customizability</strong></summary>

Amigo's memory architecture is fully customizable for enterprise-specific needs. Six built-in dimensions ship as defaults - clinical, behavioral, operational, social, engagement, and risk - covering the most common healthcare memory needs. Workspaces can adjust these through the memory settings API:

* **Add custom dimensions** for domain-specific memory categories (e.g., "financial" for billing workflows, "legal" for compliance-sensitive interactions)
* **Adjust weights** to prioritize which dimensions receive more attention during memory extraction
* **Deactivate built-in dimensions** that are not relevant for a particular use case
* **Choose extraction mode** - Built-in dimensions use static extraction (fast, deterministic JSONPath matching). Custom dimensions default to LLM extraction, which uses the dimension's description as a semantic prompt to extract relevant facts from conversation events. LLM extraction handles nuanced, context-dependent information that static patterns would miss.

Memory dimension settings are managed through the Platform API and take effect on the next post-session processing run. No code changes or redeployment required. Memory analytics (coverage, fact counts, per-dimension breakdown) are available through the Developer Console's Functional Memory page.
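As a rough illustration of what a dimension-settings update could look like - the field names below mirror the concepts in this section (custom dimensions, weights, activation, extraction mode) but are assumptions, not the documented Platform API schema:

```python
# Hypothetical memory-settings payload; consult the Platform API reference
# for the real field names and endpoint.
settings_update = {
    "dimensions": [
        {
            "name": "financial",       # custom dimension for billing workflows
            "description": "Billing history, payment plans, and coverage details.",
            "extraction_mode": "llm",  # custom dimensions default to LLM extraction
            "weight": 0.8,
        },
        {
            "name": "social",          # built-in dimension
            "active": False,           # deactivate where not relevant
        },
    ]
}

def validate(update: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks sane."""
    errors = []
    for dim in update.get("dimensions", []):
        if "name" not in dim:
            errors.append("dimension missing name")
        if not 0 <= dim.get("weight", 0.5) <= 1:
            errors.append(f"{dim.get('name', '?')}: weight out of range")
    return errors
```

Because settings take effect on the next post-session processing run, a client-side sanity check like this is cheap insurance before submitting.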

For more complex customization, our Agent Engineers can work with you on a full implementation process:

1. **Critical Function Assessment**: Identify functions requiring near-perfect memory and map critical information types and hierarchy based on your use cases.
2. **Memory Design**: Configure memory topology and define user dimensions and parameters.
3. **Integration & Deployment**: Deploy the memory system, connect it to existing data sources, and initialize user models.
4. **Verification & Optimization**: Validate functional performance and tune dimensional parameters where needed.

</details>

## How Recall Works

L3 handles the vast majority of recall needs. Because it is always loaded during a session, the agent has immediate access to the patient's full dimensional profile - conditions, medications, behavioral patterns, risk factors - without any retrieval step.

In rare cases (roughly 5-10% of interactions), the agent encounters something that requires deeper historical context than L3 alone provides. When that happens, the system runs a targeted retrieval against L0 transcripts, guided by what L3 already knows. This is not a broad search - L3 shapes the query so it targets the specific gap. The retrieved history is then interpreted through L3's current understanding, not in isolation.
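The shape of that L3-guided narrowing can be sketched as follows - all names here are illustrative stand-ins, not Amigo's internals:

```python
def targeted_retrieval(gap_terms, l0_transcripts, l3_dimensions):
    """Return only L0 records that touch both the gap and an L3-flagged dimension.

    gap_terms:      words describing the missing information
    l0_transcripts: list of {"text": ..., "tags": [...]} records
    l3_dimensions:  dimension tags L3 marks as relevant to the gap
    """
    hits = []
    for t in l0_transcripts:
        mentions_gap = any(term in t["text"].lower() for term in gap_terms)
        in_scope = bool(set(t["tags"]) & set(l3_dimensions))
        if mentions_gap and in_scope:   # L3 narrows the search; no broad scan
            hits.append(t)
    return hits
```

The point of the sketch is the intersection: matching text alone would return the dental-style noise a generic search produces, while the L3 dimension filter keeps only clinically relevant history.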

## Analytics Feedback Loop

Memory data feeds back into the system itself. Patterns discovered across patient populations - which dimensions predict adherence, which symptom clusters correlate with outcomes - refine the dimensional frameworks used for individual patients. When frameworks improve, the system can backfill historical data through the new lens, reinterpreting past conversations with better understanding. This is how individual interactions accumulate into organizational intelligence over time.

## Memory as Safety Foundation

{% hint style="info" %}
For text channel conversation persistence - including frozen conversation plans and turn history - see [Text Sessions](/channels/text-sessions.md#conversation-persistence).
{% endhint %}

The memory system is a core part of Amigo's [safety framework](/safety-and-compliance/runtime-safety.md). Because L3 is always available, safety decisions always have access to complete context.

* **Crisis prevention** - Past crisis indicators and risk factors remain immediately accessible
* **Medication safety** - Complete medication history and adverse reactions guide pharmaceutical discussions
* **Risk awareness** - Safety-relevant dimensions are tracked with the highest precision requirements
* **Historical context** - Past events are understood through current clinical understanding, not in isolation

As detailed in [Operational Safety](/safety-and-compliance/runtime-safety.md), safety protections are built into the same reasoning process that drives all agent behavior rather than bolted on as separate filters.

## Post-Session Processing Pipeline

After a conversation ends, a multi-stage asynchronous pipeline processes the raw interaction data and materializes memory for fast query access:

### 1. Memory Extraction

An LLM reviews the full conversation transcript and extracts structured facts. Each fact receives an importance score (1-10), message-level references pointing to the specific messages that support it, and a context field explaining why this fact matters. The extracted memories are embedded as vectors and stored for semantic retrieval.
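An extracted fact might look like the record below - the field names are assumptions derived from this description (importance score, message references, context), not the actual schema:

```python
# Illustrative extracted-fact shape.
fact = {
    "text": "Patient reports intermittent chest tightness during exercise.",
    "importance": 8,                           # 1-10 scale
    "message_refs": ["msg_0042", "msg_0047"],  # messages supporting the fact
    "context": "Potential cardiac symptom; patient has hypertension history.",
}

def is_valid_fact(f: dict) -> bool:
    """Check the invariants this section describes: a bounded importance
    score, at least one supporting message, and an explanatory context."""
    return (
        1 <= f.get("importance", 0) <= 10
        and bool(f.get("message_refs"))
        and bool(f.get("context"))
    )
```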

### 2. Memory Materialization

Extracted facts are classified into the workspace's configured dimensions and materialized into pre-computed tables. Each fact gets a human-readable `extracted_text` summary (derived from clinical event structures like diagnoses, medications, and procedures) so downstream consumers don't need to parse raw event data. Dimension-level aggregates (fact counts, average confidence, source diversity) are also pre-computed. This materialization step is what makes the Platform API's memory query endpoints fast - reads serve pre-computed results rather than running extraction at query time.
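The aggregate computation can be sketched as a simple group-by - a minimal illustration of the three aggregates named above, assuming a flat fact record with `dimension`, `confidence`, and `source` fields:

```python
from collections import defaultdict

def dimension_aggregates(facts):
    """Pre-compute per-dimension aggregates from classified facts.

    facts: list of {"dimension": str, "confidence": float, "source": str}
    Returns {dimension: {"fact_count", "avg_confidence", "source_diversity"}}.
    """
    grouped = defaultdict(list)
    for f in facts:
        grouped[f["dimension"]].append(f)
    return {
        dim: {
            "fact_count": len(fs),
            "avg_confidence": sum(f["confidence"] for f in fs) / len(fs),
            "source_diversity": len({f["source"] for f in fs}),
        }
        for dim, fs in grouped.items()
    }
```

Computing these once at materialization time is what lets memory query endpoints serve reads without re-scanning facts.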

### 3. Metric Evaluation

If the service has configured quality metrics, the system evaluates them in parallel. Metrics are chunked (20 per prompt) and evaluated concurrently. Each evaluation includes a score, justification, and references to the specific messages that informed the assessment.
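The chunk-and-fan-out step can be sketched as follows; `evaluate_chunk` stands in for the LLM call, and everything else is illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 20  # metrics per evaluation prompt, per this section

def chunk(metrics, size=CHUNK_SIZE):
    """Split the metric list into prompt-sized chunks."""
    return [metrics[i:i + size] for i in range(0, len(metrics), size)]

def evaluate_all(metrics, evaluate_chunk):
    """Evaluate metric chunks concurrently and flatten the results.

    ThreadPoolExecutor.map preserves input order, so results line up
    with the original metric list.
    """
    with ThreadPoolExecutor() as pool:
        results = pool.map(evaluate_chunk, chunk(metrics))
    return [r for batch in results for r in batch]
```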

### 4. User Model Generation

Memories are processed in batches of 100. For each batch, an LLM generates dimensional user model observations tagged with the organization's configured user dimensions. When multiple batches exist, a merge step synthesizes them into a single cohesive user model. Intermediate results are checkpointed to prevent rework on failure.
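The batch-then-merge flow reduces to a few lines; `generate` and `merge` below are stand-ins for the two LLM calls, and the structure (not the names) is what this section describes:

```python
BATCH_SIZE = 100  # memories per user-model generation batch

def build_user_model(memories, generate, merge):
    """Generate per-batch observations, then merge when more than one batch exists.

    generate: stand-in for the per-batch LLM call -> partial model
    merge:    stand-in for the synthesis LLM call -> single cohesive model
    """
    batches = [memories[i:i + BATCH_SIZE] for i in range(0, len(memories), BATCH_SIZE)]
    partials = [generate(b) for b in batches]   # checkpoint each in production
    return partials[0] if len(partials) == 1 else merge(partials)
```

Checkpointing each partial before the merge is what prevents rework: a failed merge re-reads stored partials rather than regenerating every batch.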

Each stage has deduplication guards to prevent re-processing if the pipeline is triggered twice for the same conversation.
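A deduplication guard of this kind is essentially an idempotency key per (conversation, stage) pair - a minimal in-memory sketch, where production systems would keep the key in durable storage:

```python
_processed = set()

def run_once(conversation_id: str, stage: str, work) -> bool:
    """Run `work` only if this (conversation, stage) pair hasn't run before.

    Returns True if the stage ran, False if it was deduplicated.
    """
    key = (conversation_id, stage)
    if key in _processed:
        return False
    _processed.add(key)
    work()
    return True
```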

## Summary

In healthcare, memory that works "most of the time" is not good enough. Patient safety requires perfect recall of critical information, complete preservation of context across provider transitions, identification of information gaps before they affect care, and tracking of how patient understanding changes over time. Amigo achieves this by keeping L3 constantly available during sessions - the agent always has the patient's full context without retrieval delays.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.amigo.ai/agent/memory.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
