Glossary

This glossary provides definitions for key terms used throughout the Amigo documentation. It's designed to help enterprise readers better understand our platform's terminology and concepts, particularly those related to reasoning-focused AI development and macro-design optimization principles.

Note: Terms are organized by category for easier reference. For any term not found in this glossary, please contact your Amigo representative.

Need a primer first? Read Compositional Intelligence Dynamics for the full thesis, then return here for quick lookups while you explore the rest of the docs.

AI Development Phases & Core Principles - Foundational concepts including the Dimensional Sparsity Principle
Agent Architecture - Agent components, behaviors, and autonomy
Platform & Core Concepts - Core Amigo platform concepts including alignment and scaling
Context Graph Framework - Topological navigation and problem space structure
Memory Architecture - Layered memory system (L0-L3) and dimensional discovery
Information Theory & Mathematical Foundations - Formal foundations and theoretical framework
Integration Bridges - Memory-reasoning-knowledge integration
Processing Methods - Live and post-processing approaches
Metrics and Pattern Discovery - Evaluation, drift detection, and optimization
Actions & Execution - Action primitives and execution architecture
Multi-Agent & Game-Theoretic Concepts - Multi-agent coordination principles
Future Concepts & Architectures - Anticipated future developments

How to use this glossary: Start with AI Development Phases & Core Principles to understand foundational concepts like the Dimensional Sparsity Principle. For mathematical rigor, see Information Theory & Mathematical Foundations. Platform practitioners should focus on Platform & Core Concepts, Memory Architecture, and Metrics and Pattern Discovery. Terms are extensively cross-referenced—click any link to navigate to related concepts.

AI Development Phases & Core Principles

Dimensional Sparsity Principle: Outcome-relevant patterns almost always concentrate in a few dimensions, even when the underlying world is messy. If you identify and track those dimensions, you can predict or influence results nearly as well as if you observed everything. That is why simplified models—from orbital mechanics to risk scoring—work in practice. The dimensional blueprint specifies which parts of measurements matter for this object, how to bucket them, and how to interpret the values once extracted.

Formal definition

For any outcome $Y$ there exists a functional manifold $\mathcal{M}_Y \subset \mathcal{Z}$ such that $Y \perp\!\!\!\perp H \mid \pi_{\mathcal{M}_Y}(H)$ . Here $H$ is the joint human-plus-environment state accessed through a given interface (text, voice, devices) and $\pi_{\mathcal{M}_Y}$ projects it onto the manifold’s coordinates. In other words, once you know the manifold coordinates, additional state information does not change the prediction of $Y$ .

Outcome-Sufficient Representations: Compacted views of the world that preserve everything you need to hit a target outcome and nothing more. They are deliberately "wrong but useful"—built for a specific operational regime, monitored for drift, and efficient enough for real-time use. A high-frequency trading model that ignores macroeconomics yet consistently profits within its time horizon is a practical example.

Pre-training Phase: The initial phase of AI development focused on foundation data representation, where models learn basic patterns from large datasets. This phase has reached saturation having consumed most available human knowledge, with model quality scaling only logarithmically with additional data volume.

Post-training Phase: The phase focused on instruction following and personality development, where models learn to follow commands and exhibit consistent behavioral characteristics. This phase offers limited scaling potential through incremental improvements.

Reasoning Phase: The current frontier of AI development with no apparent scaling ceiling, where systems improve through better verification environments and feedback mechanisms rather than raw computational power or data accumulation. Characterized by "thin intelligence" properties where improvements transfer across domains.

Macro-Design Optimization: Approach focused on discovering the sparse latent variables that actually drive outcomes at scale, rather than optimizing within fixed dimensions. Macro-design discovers new latent dimensions through temporal aggregation, identifies causal variables that only emerge at scale, and refines understanding of discovered latent variables. Returns: compounding, potentially superlinear improvements from uncovering causal structure. Operates through the macro-design loop with population-level acceleration—multiple users enable faster dimensional discovery as shared patterns emerge across the population.

Micro-Design Optimization: Approach that operates within fixed dimensions through better architectures, training procedures, and datasets. Optimizes model weights given known features, tunes hyperparameters for existing variables, and improves data quality for predetermined dimensions. Returns: logarithmic improvements within known space. Most AI research focuses on micro-design, but real leverage comes from macro-design's dimensional discovery.

Macro-Design vs. Micro-Design: Fundamental distinction in optimization approaches. Micro-design tunes within a fixed coordinate system; macro-design changes the coordinates by discovering the few variables that actually move outcomes. Example: Micro-design tunes medication reminder timing within known schedule patterns (optimizing weights); macro-design discovers that stress-medication cycles exist as a new dimension through temporal aggregation (changing coordinates). The distinction parallels paradigm shifts versus incremental refinement in scientific progress. Treat drift as information about which missing dimension to discover next—each loop improves both the solution and the problem definition.

Observable Problem → Verification Cycle: The fundamental feedback architecture driving reasoning system improvement: Observable Problem → Interpretive/Modeling Fidelity → Verification in Model → Application in Observable Problem → Drift Detection → Enhanced Understanding. This cycle forms the foundation for continuous system improvement.

Distributed Exploration: Search regime where local workers branch through scenario variants while a global orchestrator allocates coverage, balancing unbiased domain sweeps with biased probes of likely failure modes. Workers act as generalist reasoners with access to current sufficient statistics, proposing next actions inside scenario variants. The orchestrator assigns sectors, reprioritizes coverage, and prunes redundant expeditions, ensuring workers only enter arcs whose contracts are validated for the synthesized statistics of their scenario. Two complementary regimes: unbiased exploration for representative coverage matching the domain's hazard profile, and biased exploration that densifies sampling around the solver's favourite heuristics to expose thin spots in defensive armour.

Macro-Design Loop: Extended feedback cycle that enables dimensional discovery and problem re-specification: Observable Problem → Modeling Fidelity → Verification → Application → Drift Detection → Re-specification. Distinguished from Observable Problem → Verification Cycle by explicit re-specification step where problem definitions $P$ themselves evolve as understanding deepens. When drift detection reveals that current dimensions are insufficient (dimensional drift), the loop doesn't just retrain—it fundamentally reframes what dimensions matter, expanding the acceptance region $A_U$ to include newly discovered functional dimensions. This enables monotonic improvement: each cycle potentially discovers better ways to define the problem itself. See also: Iterative Alignment / Continuous Alignment Loop.

Problem State Awareness: The system's ability to recognize when problems are fundamentally unsolvable versus when they can be transformed into solvable states, preventing overconfidence and inappropriate problem-solving attempts.

Quantized Reasoning: Breaking down complex reasoning into discrete steps where each quantum includes explicit confidence scoring, enabling systems to recognize problem boundaries and implement appropriate handoff mechanisms.

Thin Intelligence: The property where improvements in one domain transfer across other domains when representation learning occurs correctly—mathematical reasoning enhances chess performance, economics knowledge strengthens legal analysis.

Multi-Dimensional Success Criteria: Recognition that economic work unit verification extends beyond technical accuracy to encompass social factors, confidence building, emotional support, and organizational integration factors that determine real-world success.

Agent Architecture

Agent: Advanced conversational AI that navigates dynamically-structured contexts, using adaptive behavior to achieve a balance between situational flexibility and control.

Static Persona: The foundational identity layer of an agent defining its consistent attributes, including identity (name, role, language) and background (expertise, motivation, principles). Recommended to be less than 10k tokens as it serves as the foundation for axiomatic alignment rather than the "final portrait".

Global Directives: Explicit universal rules ensuring consistent agent behavior, including behavioral rules and communication standards that apply across all contexts.

Dynamic Behavior: System enabling real-time agent adaptation through context detection, behavior selection, and adaptive response generation. Dynamic behaviors can be triggered by conversational cues, agent actions, inner thoughts, or external events—this multi-source activation is what makes the system so powerful. Dynamic behavior scales to approximately 5 million characters (without side-effects) and can scale another order of magnitude larger with side-effects.

Trigger: Pattern, event, or signal that may activate a specific dynamic behavior. Triggers can originate from user messages (conversational cues), agent actions, agent inner thoughts, or external events. Functions as a relative ranking mechanism rather than requiring exact matches, enabling context-aware behavior activation from multiple sources.

Advanced Ranking Algorithm: Sophisticated multidimensional approach to behavior ranking that separately evaluates user context and conversation history, balancing immediate context with conversation continuity. Incorporates a mechanism for re-sampling previously selected behaviors with decaying recency weight to maintain relevance across longer interactions.

Behavior Chaining: An architectural capability that enables agents to influence their own trajectory through behavior spaces. By leveraging the embedding-based ranking system, agents can modify their conversational patterns to navigate between different clusters of potential behaviors. This creates a meta-control layer where the agent can direct its own path across behavior domains, allowing for structured conversational journeys that remain responsive to user inputs. When integrated with side-effects, behavior chaining functions as an orchestration layer for both conversation and external actions, enabling multi-turn, multi-modal experiences with transitions between dialogue and system interactions. Unlike traditional decision trees, behavior chaining maintains conversational coherence while providing predictable pathways across knowledge and interaction frameworks.

Behavior Selection Process: Four-step process (Candidate Evaluation including re-sampling of previous behavior, Selection Decision among new/previous/no behavior, Context Graph Integration, Adaptive Application) that determines how dynamic behaviors are identified and applied, allowing for persistence across turns.

Autonomy Spectrum: Framework describing how trigger and context design impact agent autonomy, from high autonomy (vague triggers with open context) to limited autonomy (strict triggers with precise instructions).

L4 Autonomy (in targeted domains): A strategic approach to AI development focusing on achieving high levels of autonomy (Level 4, analogous to full self-driving under specific conditions) in well-defined, strategically important areas or "neighborhoods." This prioritizes deep reliability and capability in critical functions over broader but potentially less reliable (e.g., L2) autonomy across all functions. Scaling L4 autonomy is viewed as a deliberate investment in money, strategy, and operational excellence.

Dynamic Behavior Side-Effect: Action triggered by a dynamic behavior that extends beyond the conversation itself and modifies the local context the agent is currently active in. These often represent low-entropy (deterministic) operations that provide reliable, predictable outcomes when precision is required. Every time a dynamic behavior is selected, the context graph is modified. Side-effects can include retrieving real-time data, modifying the context graph, generating structured reflections, integrating with enterprise systems, exposing new tools, triggering hand-offs to external systems, or adding new exit conditions.

Platform & Core Concepts

Alignment (AI): The ongoing challenge of ensuring AI system behaviors satisfy the multiple correlated objectives that define "success" for a specific organization, simultaneously rather than optimizing any single dimension. These objectives are unique to each organization's problem domain and values—discovered through observation and verification rather than predetermined. True alignment means operating within the multi-dimensional acceptance region $A_U$ that captures what the organization actually needs. This acceptance region evolves as the system discovers which functional dimensions actually drive desired outcomes through dimensional drift and the macro-design loop. Amigo's alignment-first design continuously maps the achievable Pareto frontier across these correlated objectives, enabling organizations to choose positions that match their values while understanding the real costs—computation, latency, development effort—of moving along or expanding the frontier over time. As capabilities increase and new dimensions emerge through temporal aggregation, alignment requires adapting to the evolving definition of success itself.

Entropy Control: The strategic management of degrees of freedom available to AI systems in different operational contexts, parameterized by policy entropy $e$ at each decision quantum. Given sufficient unified context $C$ , entropy control $e^{\star}(C)$ optimizes the trade-off between risk-sensitive performance and decision cost: lower entropy (more deterministic) in high-risk regions where mistakes are costly, higher entropy (more exploratory) where value-of-information justifies exploration. The key principle is entropy stratification: entropy control is conditional on $C$ being sufficient—the system collapses to low entropy when predictive uncertainty $H_p(C)$ or epistemic uncertainty $U(C)$ indicate risk, and allows higher entropy when sufficient context enables safe exploration. Implemented throughout Amigo's architecture: context density in context graphs (high-density = low entropy structured protocols), instruction flexibility spectrum in dynamic behaviors (rigid protocols = low entropy, open guidance = high entropy), and deterministic side-effects for precision-critical operations. Entropy stratification ensures reliability in safety-critical scenarios while maintaining adaptability where appropriate.

Instruction Flexibility Spectrum: The entropy control mechanism within dynamic behaviors, ranging from rigid protocols (low entropy) for safety-critical scenarios to open-ended guidance (high entropy) for creative problem-solving, with guided frameworks (medium entropy) for operational workflows.

Context Graph: Sophisticated topological field guiding AI agents through complex problem spaces. Functions as adaptable scaffolding, providing structure for reliability and alignment today while being designed to integrate with future AI paradigms like Neuralese. See also: "Context Graph" entry under Context Graph Framework.

Iterative Alignment / Continuous Alignment Loop

The systematic process where Amigo refines agent behavior by continuously discovering which functional dimensions actually drive desired outcomes, then optimizing across the achievable Pareto frontier of correlated objectives. Through the Partnership Model, domain experts define problem models and verification criteria that reveal the true acceptance region $A_U$ for their domain. Pattern Discovery then explores configuration space to map what trade-offs are achievable—where frontier movement (trading one objective for another) versus frontier expansion (improving multiple objectives simultaneously) is possible, and at what cost. As real-world deployment data accumulates through temporal aggregation, the system discovers new dimensions through dimensional drift, causing $A_U$ itself to evolve. This creates a feedback loop: better models reveal better problem definitions, better definitions enable better verification, better verification produces better models—with each cycle adapting to the changing definition of success as understanding deepens.

Layered Memory Architecture: Amigo's hierarchical memory structure (L0→L1→L2→L3) that enables dimensional discovery through temporal aggregation. Each layer maintains sufficient statistics while compressing: L0 (raw transcripts—ground truth), L1 (information gain—deviations from current understanding), L2 (episodic patterns over weeks/months), L3 (functional dimensions—stable patterns across episodes). This compression discovers which dimensions actually drive outcomes: patterns invisible at short timescales emerge through accumulation over longer horizons, ultimately maintaining sparse functional dimensions in L3 that explain outcome variance. These discovered dimensions shape the acceptance region and determine which positions on the Pareto frontier satisfy organizational needs. See also: Memory Architecture section below.

Evolutionary Chamber: The verification environment where candidate agent configurations compete under systematic evaluation to map the achievable Pareto frontier across correlated objectives. Configurations are tested against scenarios drawn from the deployment distribution, measuring outcomes across all dimensions that define the acceptance region $A_U$ . The chamber reveals fundamental trade-offs—which objectives correlate positively (improvable together through frontier expansion) versus negatively (requiring frontier movement with explicit sacrifices). Only configurations demonstrating comprehensive improvement advance: better performance on some objectives cannot come at the cost of violating constraints on others. Strategic pressures are defined through problem models and judges (co-developed via the Partnership Model), creating evolutionary pressure toward configurations that maintain admissibility margin across all objectives. As $A_U$ evolves through dimensional discovery, the chamber adapts verification criteria to test against the expanded dimensional space. This systematic exploration quantifies improvement costs—revealing whether gains require moderate compute reallocation (frontier movement) or expensive architectural innovations (frontier expansion). (See also: Pattern Discovery and Optimization)

Partnership Model (Amigo)

Amigo's collaborative approach to discovering and optimizing across the achievable Pareto frontier for each organization's unique objectives. Domain experts define the acceptance region $A_U$ —what outcomes count as successful—and build verification criteria that reveal which functional dimensions actually drive those outcomes. They track how competitive market realities and organizational priorities shift the frontier definition over time through dimensional drift. Agent Engineers leverage Agent Forge to systematically explore configuration space, mapping frontier positions and quantifying improvement costs. They determine whether gains require frontier movement (trading one objective for another at moderate cost) versus frontier expansion (architectural innovations improving multiple objectives simultaneously at high cost). Amigo provides the infrastructure—evolutionary chamber, layered memory, pattern discovery—that enables efficient recursive optimization under the strategic pressures defined by domain experts. This partnership enables organizations to understand their achievable trade-offs, choose frontier positions matching their values, and adapt as $A_U$ evolves with deepening understanding. Like Waymo's approach, we prioritize achieving reliable L4 autonomy in well-defined problem neighborhoods first, then systematically expanding to adjacent domains where the learned frontier structure transfers.

Scaling Policy (Λ): Resource allocation vector $\Lambda = (N, D, I, \mathcal{M}_{\text{mem}})$ comprising model parameters ( $N$ ), data distribution ( $D$ ), inference-time compute ( $I$ ), and memory capacity ( $\mathcal{M}_{\text{mem}}$ ). Distinguishes aligned scaling (allocating resources to outcome-relevant dimensions) from misaligned scaling (uniformly increasing all resources). Aligned scaling prioritizes: (1) data quality over quantity—curating examples that reveal functional dimensions; (2) inference compute on verification and search over training compute; (3) memory systems that discover and maintain sufficient statistics; (4) parameters allocated to outcome-relevant model capabilities. Misaligned scaling naively increases context windows, model size, and data volume without targeting what drives outcomes, leading to diminishing returns as predicted by the dimensional sparsity principle.

Regime-Bounded Validity: Approach where models are explicitly valid within operational regimes $\mathcal{R} \subset \mathcal{X}$ with drift detection $\mathcal{D}: \mathcal{X} \rightarrow \{0,1\}$ to trigger recalibration when $x \notin \mathcal{R}$ . Core principle: models should know their boundaries and escalate when encountering inputs outside their trained regime rather than confidently extrapolating. Implemented through Operational Patient Domains (OPDs) that specify inclusions/exclusions, confidence targets, and escalation policies. Enables "wrong but useful" models that maintain reliability within defined boundaries while detecting when those boundaries are exceeded.

Effective Theory Lens: Physics-inspired approach to building AI systems by coarse-graining to sufficiency rather than completeness. Core principles: (1) Build representations sufficient for outcomes, discarding irrelevant detail; (2) Define explicit regime boundaries—OPDs specify where models are valid; (3) Use information bottlenecks and rate-distortion to tune compression; (4) Trust through verification under real distributions rather than modeling more detail. Similar to how Newtonian mechanics is "wrong" at quantum scales yet sufficient for lunar trajectories, effective theories for AI are wrong about the full world but correct enough for targeted outcomes within operational regimes.

Platform (Amigo): The comprehensive set of foundational architecture (like Context Graphs and Layered Memory), tools, and methodologies provided by Amigo, enabling enterprises to build, deploy, manage, and iteratively align their own AI agents, typically through a Partnership Model.

Agent Forge: A synchronization and management infrastructure that enables programmatic control of Amigo platform entities through declarative JSON assets. Agent Forge provides the foundational tooling that allows coding agents to recursively optimize other agents by systematically modifying configurations for agents, context graphs, dynamic behaviors, and evaluation frameworks. It features bi-directional synchronization between local files and remote platform configurations, multi-environment support for safe staging and deployment, and comprehensive entity management across the entire Amigo ecosystem.

Recursive Meta-Optimization: The process where coding agents use Agent Forge's infrastructure to autonomously optimize other agents' configurations. This involves analyzing performance data, proposing improvements, modifying declarative JSON assets, and deploying changes through systematic testing cycles. Unlike manual optimization that operates at human timescales, recursive meta-optimization enables system evolution at machine speed while maintaining safety boundaries.

Declarative Entity Management: The approach used by Agent Forge to represent all agent system components as versioned JSON files that can be programmatically modified. This includes agents (identity, communication patterns), context graphs (problem topology, reasoning paths), dynamic behaviors (triggers, responses), and evaluation frameworks (metrics, personas, scenarios). The declarative approach enables coding agents to reason about and modify agent architectures systematically while maintaining version control and rollback capabilities.

Context Graph Framework

Context Graph: See definition under Platform & Core Concepts.

Topological Field: The fundamental structure of context graphs that creates gravitational fields guiding agent behavior toward optimal solutions rather than prescribing exact paths.

Context Density: The degree of constraint in different regions of a context graph, ranging from high-density (highly structured, low entropy) to low-density (minimal constraints, high entropy). High-density regions provide structured protocols for reliability, medium-density regions offer guided frameworks for operational workflows, and low-density regions enable creative exploration. This variable constraint approach implements entropy control at the context graph level.

State: The core building block of a context graph that guides agent behavior and decision-making, including action states, decision states, recall states, reflection states, and side-effect states.

Side-Effect State: A specialized context graph state that enables agents to interact with external systems, triggering actions like data retrieval, tool invocation, alert generation, or workflow initiation beyond the conversation itself.

Gradient Field Paradigm: Approach allowing agents to navigate context graphs like expert rock climbers finding paths through complex terrain, using stable footholds, intuition, and pattern recognition.

Problem Space Topology: The structured mapping of a problem domain showing its boundaries, constraints, and solution pathways, which guides how agents approach and solve problems.

Topological Learning: Process by which agents continuously enhance navigation efficiency across context graphs by learning from prior interactions and adjusting strategies accordingly.

Quantum Patterns: Fundamental units of state transitions in context graphs that represent complete interaction cycles. Each quantum always starts and ends on action states, with arbitrary internal processing between them. Examples include simple patterns like [A] → [A] (direct response) and complex patterns like [A] → [C] → [R] → [D] → [A] (memory-informed, reflection-guided decision).

Three-Level Navigation Framework: The cognitive architecture enabling agents to traverse context graphs with genuine understanding:

Description Level (Conceptual): The "why" - service philosophy and approach providing sparse global understanding
Abstract Topology Level (Structural): The "what" - zoomed-out map of all states and transitions
Local Guidelines Level (Operational): The "how" - dense, detailed instructions for current state execution

Action State Guarantee: The fundamental rule that agent traversals always start and end on action states. Agents can take an arbitrary number of internal steps (decision, reflection, recall states) before responding, but users only interact with the agent at action states. This ensures coherent responses while hiding internal complexity.

State Quantas: The smaller units of actions that can compose within individual states. For example, an action state might internally execute multiple tool calls, each representing a quantum of functionality within that state.

Multi-State Traversal: The capability for agents to navigate through multiple internal states between user interactions. This hidden journey enables sophisticated reasoning, memory operations, and decision-making while maintaining seamless conversation flow. Users see only the action state responses, not the complex internal processing.

Sparse-Dense Resolution: The multi-resolution approach in context graphs where agents have access to both sparse global views (conceptual description and abstract topology) and dense local resolution (detailed state guidelines). This enables strategic navigation with global awareness while maintaining precise local execution.

Context Detection: Process identifying conversational patterns, emotional states, user intent, and situational contexts during dynamic behavior selection, evaluating both explicit statements and implicit expressions of user needs across the full conversation history.

Memory Architecture

Functional Memory System: Amigo's approach to memory that maintains sufficient statistics—compressed representations preserving all information relevant to outcomes while discarding noise. Memory operates as part of unified context C, combining with professional identity (interpretation priors) and context graphs (problem structure) to enable decisions.

Layered Memory Architecture: See definition under Platform & Core Concepts.

L0 Raw Transcripts Layer: Complete unfiltered conversation history serving as ground truth. The only source for discovering unexpected patterns during recontextualization.

L1 Information Gain Layer: Extracts what's genuinely new—deviations from L3's current understanding. Captures all changes, including seemingly irrelevant details that may later reveal patterns through temporal aggregation.

L2 Episodic Patterns Layer: Accumulated L1 information synthesized over weeks/months. Temporal aggregation at this layer reveals recurring patterns invisible at shorter timescales (e.g., 2-3 week cycles in medication adherence correlating with work stress).

L3 Functional Dimensions Layer: Stable patterns discovered through cross-episode analysis. Contains sparse functional dimensions that explain substantial outcome variance. Remains constantly in memory during live sessions, providing immediate context without retrieval latency.

Professional Identity (N): The agent's foundational expertise and interpretive lens that shapes how information is understood and prioritized. A cardiologist identity emphasizes cardiac history and medication interactions, while a physical therapist identity emphasizes injury biomechanics and movement patterns. This identity provides interpretation priors that, combined with functional dimensions from L3 and problem structure from context graphs, form unified context C for decisions.

User Model: L3's representation providing functional dimensions ( $Z_t$ ) that, combined with professional identity and problem structure, form unified context C for decisions. Operational center defining dimensional priorities, orchestrating how information flows, is preserved, retrieved, and interpreted.

Dimensional Framework: The structure in the user model defining information categories with associated precision requirements and contextual preservation needs. Shaped by professional identity—a cardiologist's framework emphasizes cardiac history and medication interactions, while a physical therapist's emphasizes injury biomechanics and movement patterns. Serves as blueprint determining what information requires outcome-sufficient preservation (sufficient statistics), how context is maintained, and when information needs recontextualization.

Functional Dimensions: The sparse stable patterns maintained in L3 that drive outcomes. Discovered through temporal aggregation and cross-episode analysis rather than imposed by design. Also called outcome-relevant dimensions. These emerge because true causal structure is sparse—work stress patterns, circadian rhythms, medication adherence styles generalize across populations while noise averages out.

Latent Explanatory Variables: Variables that only become visible through temporal aggregation over longer horizons. Daily fluctuations appear random, but monthly accumulation reveals cycles, correlations, and causal patterns. Example: You cannot detect a monthly stress-medication cycle from daily snapshots—the pattern emerges only through weeks of data accumulation in L2. Critical for dimensional discovery: unfiltered L1 extraction accumulates all changes, L2 synthesis aggregates over episodes, cross-episode analysis reveals which patterns generalize as stable L3 dimensions. These variables explain outcome variance that appears unexplained at shorter timescales.

Sufficient Statistics: Compressed representations that preserve all information relevant to outcomes while discarding noise and redundancy. Mathematical foundation for hierarchical memory architecture—each layer maintains sufficiency (preserving predictive information) while increasing compression. See Information Theory & Mathematical Foundations for formal definition.

Latent Space: The multidimensional conceptual space within language models containing encoded knowledge, relationships, and problem-solving approaches. Effectiveness of AI is determined by activating the right regions of this space rather than simply adding information.

Knowledge Activation: The process of priming specific regions of an agent's latent space to optimize performance for particular tasks, ensuring the right knowledge and reasoning patterns are accessible for solving problems.

Implicit Recall: Memory retrieval triggered by information gap detection during real-time conversation analysis.

Explicit Recall: Memory retrieval triggered by predetermined recall points defined in the context graph structure.

Recent Information Guarantee: Feature ensuring recent information (last n sessions based on information decay) is always available for full reasoning.

Targeted Search Mechanism: Process identifying specific information gaps using the user model and conducting targeted searches near known critical information with L3 anchoring.

Information Evolution Handling: System for managing changing information through checkpoint + merge operations, accumulating observations by dimension over time. When dimensions evolve, backfill enables reinterpretation of entire history through improved dimensional framework.

Backfill: Process of replaying raw observational traces under an updated dimensional blueprint to regenerate statistics and confirm that causal contracts still hold. When population-level cohorts flag that our information buckets or interpretive lenses are drifting, we must regenerate the sufficient statistics from raw traces under the updated blueprint. Systems that skip this step end up reasoning with stale compressions that quietly encode yesterday's mistakes. The blueprint remains a living hypothesis; only after several rounds of measurement, rewrite, and backfill do we approach sufficiency. Each replay under an improved blueprint tests whether the chosen dimensions are rich enough to support the causal inferences we care about.

Temporal Aggregation: The process by which patterns invisible at short timescales emerge through accumulation over longer horizons. Daily fluctuations look random, but monthly accumulation reveals cycles and correlations. Critical mechanism for discovering latent explanatory variables—you cannot detect monthly cycles from daily snapshots.

Cross-Episode Analysis: Comparing multiple L2 episodic models with L3 anchoring to discover which patterns generalize versus which are coincidental. A stress-medication interaction appearing once might be chance; appearing in three quarterly episodes reveals a stable functional dimension.

Boundary Loss Prevention: L3 anchoring ensures that merging episodic models doesn't lose information at episode transitions. Balances finding shared patterns (cross-episode coherence) with preserving current understanding (preventing divergence from L3). Like maintaining a stable reference point while charting new territory.

Unified Context (C): The complete context for decisions, assembled from multiple sources: Context Graphs (T, problem structure), Professional Identity (N, interpretation priors), Functional Memory (M, sufficient statistics), Constraints (K, safety limits), Evaluations (E, success criteria). Formally defined through predictive sufficiency: $C$ is sufficient for outcome $Y$ if $P(Y|H) = P(Y|C)$ where $H$ is the joint human + environment state. L3 provides functional dimensions $Z_t$ that form memory's contribution to unified context. This unified representation enables the system to make decisions based on outcome-relevant information without requiring complete modeling of the joint human-environment state.

Information Theory & Mathematical Foundations

Information Bottleneck Principle: Mathematical framework for discovering outcome-relevant dimensions by maximizing $I(Z;Y) - \beta I(Z;X)$ where $Z$ are discovered dimensions, $Y$ is the outcome, and $X$ are observations. The principle balances predictive power about outcomes (maximize $I(Z;Y)$ ) against complexity of representation (minimize $I(Z;X)$ ), with $\beta$ controlling the trade-off. Applied to hierarchical memory: L1→L2→L3 compression discovers minimal sufficient statistics for outcomes. The bottleneck naturally identifies which dimensions matter—dimensions that don't improve outcome prediction get compressed away. Provides theoretical foundation for why sparse manifolds exist: outcome-relevant structure admits simpler representations than full observation space.

Rate-Distortion Theory: Information-theoretic framework formalizing the trade-off between compression rate $R$ (bits used) and distortion $D$ (prediction error). For Gaussian sources: $R(D) = \frac{1}{2}\log_2(\sigma^2/D)$ where $\sigma^2$ is signal variance. Applied to memory architecture: each layer $\mathcal{F}_i$ achieves different rate-distortion operating points—L0 has zero distortion (complete transcripts), L3 has high compression rate (sparse dimensions) with low distortion on outcome prediction. Connects to Minimum Description Length principle: best model minimizes description length plus prediction error. Guides memory compression decisions by quantifying achievable sufficiency at each compression level.

Sufficient Statistics: Compressed state representing exactly the information needed to complete the current problem quantum and set up the next quantum correctly. A statistic $T(X)$ is sufficient for parameter $\theta$ if $P(\theta|X) = P(\theta|T(X))$ —knowing $T(X)$ provides all information $X$ contains about $\theta$ . Extended to outcomes: $Z$ are sufficient statistics for $Y$ if $P(Y|H) = P(Y|Z)$ where $H$ is the joint human + environment state. These statistics instantiate the entry contracts—when the orchestration layer evaluates whether a cohort sits inside an arc's validated domain, it does so by inspecting this compressed state. Missing or stale statistics are contract violations that force the planner to reroute or collect more measurement before committing to the arc.

Causal Sufficiency: A representation $Z$ is causally sufficient for outcome $Y$ when interventions based on $Z$ alone achieve the same results as interventions based on the full joint human-environment state $H$ . Formally: $P(Y|do(Z)) = P(Y|do(H))$ where $do(\cdot)$ denotes causal intervention. Explains why sparse representations enable effective action, not just prediction—medication adherence interventions based on discovered stress patterns and environmental triggers achieve same results as interventions with complete models of psychological state and life circumstances. Distinguishes sufficient statistics (correlational) from causal sufficiency (interventional). Systems must verify causal sufficiency through real-world deployment, not just predictive accuracy.

Effective Rank: Spectral measure quantifying true dimensionality of a representation by accounting for eigenvalue distribution: $\exp(-\sum_i p_i \log p_i)$ where $p_i = \lambda_i / \sum_j \lambda_j$ are normalized eigenvalues. Unlike nominal dimensionality (counting parameters), effective rank reveals emergent sparsity—a 1000-dimension space with effective rank 20 means 20 directions capture most variance. Applied to L3 functional dimensions: validates that discovered dimensions genuinely exhibit sparse structure. Also used in analyzing learned model representations to identify which dimensions are information-rich versus redundant. Quantifies the "sparsity" in dimensional sparsity principle.

Value of Information (VOI): Decision-theoretic framework for determining when to query memory or gather more information: $U(q|C_t) = \mathbb{E}[\Delta R_u] - \lambda \cdot \text{Cost}(q) - \mu \cdot \text{Risk}_{\text{latency}}(q)$ . Gates memory expansion decisions by comparing expected outcome improvement against query cost and latency risk. Applied in implicit recall: only retrieve when information gain justifies cost. Enables efficient context management at scale—not every question requires deep memory search. Connects information theory (measuring information gain) with economic constraints (computation and latency budgets).

Integration Bridges

Memory‑Reasoning Bridge: The mechanism that delivers information at the appropriate granularity (L0, L1 or L2) exactly when the reasoning engine needs it, overcoming the token‑window constraint and enabling multi‑step, long‑horizon reasoning.

Knowledge‑Reasoning Integration: The coupling that ensures knowledge activation directly reshapes the problem space being reasoned about rather than serving as passive retrieval.

Memory‑Knowledge‑Reasoning Integration: The broader Agent V2 goal of maximizing bandwidth across all three systems so that the agent can freely zoom between abstraction levels while preserving context.

Processing Methods

Live-Session Processing: Top-down memory operation during live interactions, primarily accessing the user model (L3) for immediate dimensional context.

Post-Processing Memory Management: Efficient cycle ensuring optimal memory performance through session breakpoint management, L0→L1 transformation, checkpoint + merge pattern, and L1→L2 synthesis.

Causation Lineage Analysis: Analytics mapping developmental pathways in user behaviors and outcomes across time to identify formative experiences leading to specific outcomes.

Dimensional Analysis: Evaluation of patterns across user model dimensions to identify success factors and optimization opportunities.

Metrics and Pattern Discovery

Drift: System performance or behavior changes over time as reality diverges from training/verification conditions. In multi-objective framework, drift manifests as movement on or evolution of the Pareto frontier. Three types: Input drift (new scenarios arrive shifting scenario distribution, requiring different position on frontier for optimal multi-objective satisfaction), Prediction drift (model's position on frontier shifts as performance profile changes—accuracy improving while latency degrading indicates frontier movement), Dimensional drift (new functional dimensions discovered through temporal aggregation cause acceptance region $A_U$ to expand, fundamentally changing what "success" means). Detected through admissibility margin monitoring—shrinking margin signals drift before hard failures occur. Managed through Observable Problem → Verification Cycle with escalation protocol: immediate review if safety-critical, short-term uncertainty widening, medium-term targeted data collection, long-term dimensional refinement or retraining.

Metrics & Simulations Framework: System providing objective evaluation of agent performance through configurable criteria and simulated conversations.

Metric: A configurable evaluation criteria to assess the performance of an agent. Metrics can be generated via custom LLM-as-a-judge evals on both real sessions and simulated sessions + unit tests.

Simulations: Simulations describe the situations you want to test programmatically. A simulation contains a Persona and Scenario.

Persona: The user description you want the LLM to emulate when running simulating conversations

Scenario: The scenario description you want the LLM to create when simulating conversations

Unit Tests: Combination of simulations with specific metrics to evaluate critical agent behaviors in a controlled environment.

Feedback Collection: Process of gathering evaluation data through human evals (with scores and tags) and memory system driven analysis. These datasets are exportable with filters for data scientists to generate performance reports.

Pattern Discovery and Optimization: System enhancing agent behaviors through measurement-driven discovery of successful patterns, ensuring alignment with organizational objectives. In Amigo, this is a core part of the Continuous Alignment Loop, leveraging real-world data (via the Partnership Model) and verification to identify configurations that optimize across multiple correlated objectives. Rather than propagating rewards through trajectories, the system directly measures and extracts successful patterns for reuse.

Quantized Arc: A reusable reasoning primitive that expects a defined bundle of sufficient statistics at entry, transforms them through a scoped operation, and emits an exit state that subsequent arcs can accept. Each arc carries explicit contracts: entry predicates encode sufficient-statistic requirements that must be satisfied, while exit contracts specify guarantees and variance bounds. The same arc can accelerate progress for one cohort while destabilizing another—cohorts are compact regions in sufficient-statistic space that share causal response profiles. Arc libraries provide the building blocks for composing new behaviors without re-running full trajectories.

Arc-Cohort Ledger: The matrix of effect signatures, sample counts, and causal justifications for every arc across the cohorts we can measure. Each episode deposits blueprint-governed evidence into an episodic cluster; population-level cohort audits test for blind spots. The ledger tracks how often each arc contributes to verified successes, where it fails, and how broadly it transfers across contexts. When gaps appear, we rewrite the blueprint and replay the raw logs, keeping the arc contracts honest. This drives promotion, retirement, and prioritisation during pattern discovery—no scalar rewards need to propagate through entire rollouts.

Structural Equivalence Class: Family of quantized arcs that impose the same guard-rails and effect signatures on the optimisation object, even as starting states or coordinate frames drift. Two arcs belong to the same class when they absorb similar input defects, impose the same guard-rails, and deliver comparable deltas on the sufficient statistics, despite running on distinct concrete states. Once the blueprint specifies which signals to extract and how to bucket them, we can compare how different arcs reshape those measurements—even when absolute values drift between episodes. Measurement is what lets us detect those shared effect signatures instead of guessing, and the blueprint is subsequently amended to record the invariants that make the class reusable.

Reward-Driven Optimization: Training approach where agents receive explicit rewards or penalties, guiding incremental improvements toward optimal behaviors.

Adversarial Testing Architecture: An evaluation architecture where specialized judge and tester agents challenge the primary agent against defined scenarios, metrics, and thresholds to drive targeted optimization. These judge and tester agents may utilize more computational resources or specialized models to ensure rigorous evaluation.

Compute-Scaled Reasoning: Reasoning that scales with inference-time compute through beam search, tree search, or Monte Carlo Tree Search (MCTS) rather than purely through model parameter scaling. Enables systems to "think longer" on hard problems by exploring multiple solution paths and pruning unpromising branches. Key enabler of reasoning phase scaling—returns remain strong as compute increases because verification provides training signal. Contrasts with pre-training (saturating returns) and post-training (limited returns). Combined with verifiable rewards, allows systems to discover solutions beyond their immediate generative capabilities through systematic search.

Verification-Driven Optimization: An approach where agents improve through systematic verification of outcomes against predefined success criteria, using external environments, oracles (e.g., code executors), or measurement frameworks. This enables learning in complex domains where explicit supervision of every step is impractical. The verification bottleneck—our ability to verify solutions faster than we can generate them—enables scaling: search over solution space guided by verification, avoiding the need to enumerate all reasoning paths explicitly. In Amigo, this manifests as direct pattern extraction from verified successful outcomes rather than reward propagation.

Self-Play Reasoning: A learning process where an AI agent improves its reasoning capabilities by generating its own tasks or problems and learning to solve them, often in an iterative loop with itself or versions of itself. This allows the agent to explore and master a problem space more autonomously, potentially discovering novel strategies and achieving higher levels of performance without constant external guidance or pre-defined datasets.

Acceptance Region: The multi-dimensional zone where economic work unit outcomes count as successful. (Notation: $A_U$ ) Unlike single-metric thresholds, acceptance regions capture how success actually works—you need to satisfy multiple correlated objectives simultaneously, not just one. The acceptance region evolves as the system discovers which dimensions actually drive desired outcomes through dimensional discovery and the macro-design loop.

Pareto Frontier: The boundary of what's achievable when optimizing multiple objectives—the set of solutions where improving one objective requires degrading another. Configuration A might excel at accuracy but sacrifice empathy and speed. Configuration B might optimize for empathy with lower accuracy. Neither beats the other on all dimensions, so both sit on the frontier. Moving along the frontier means making explicit trade-offs between correlated objectives, with real costs in computation, latency, and development effort. Evaluations reveal the achievable frontier for your problem domain, helping you choose where to operate based on your priorities rather than chasing a non-existent single "best" solution.

Admissibility Margin: A risk-aware metric measuring how robustly an outcome satisfies the multi-objective acceptance region. (Notation: $M_\alpha$ ) A larger margin means outcomes stay safely inside the acceptance region even in worst-case scenarios across all objectives, not just on average. Two agents might both achieve high accuracy on average, but one consistently performs near the top of its range while the other has wide variance—the consistent one has larger admissibility margin. The system uses risk-aware scoring (like CVaR—Conditional Value at Risk) to measure "how far inside, and how reliably?" rather than just "are we inside?" This prevents fragile configurations that meet thresholds on average but frequently violate them under realistic conditions.

Multi-Objective Optimization: Optimization framework where success requires simultaneously satisfying multiple correlated objectives rather than maximizing a single metric. Each economic work unit gets evaluated across organization-specific dimensions discovered through verification. These objectives interact—improving one often degrades others. The system must navigate these trade-offs to land inside the acceptance region while maintaining admissibility margin. Related to Pareto optimization where no single solution dominates on all dimensions. Traditional approaches that treat objectives as independent or collapse them into a single score miss fundamental correlations and lead to suboptimal decisions.

Correlated Objectives: Multiple evaluation dimensions that interact and influence each other rather than varying independently. Increasing reasoning depth improves accuracy but degrades latency. Higher empathy scores may reduce clinical directiveness. More comprehensive safety checks increase operational cost. Stricter verification improves reliability but reduces system willingness to engage edge cases. Understanding these correlations matters for multi-objective optimization—treating objectives as independent leads to configurations that optimize individual metrics but fail on overall value delivery. Evaluations reveal objective correlations through systematic exploration of configuration space, showing actual achievable trade-offs rather than theoretical independence assumptions.

Verified Dimensional Impact: Sensitivity analysis quantifying which functional dimensions most affect admissibility margin within the acceptance region. Computed through variance decomposition showing which dimensions in the sparse scenario space drive outcomes. Connects memory's dimensional discovery (identifying candidate dimensions through temporal aggregation) with verification (measuring which dimensions matter for acceptance region satisfaction). Not all discovered dimensions have equal impact—verified dimensional impact quantifies which to prioritize for optimization. Informs resource allocation by revealing high-impact dimensions worth improving versus low-impact dimensions where effort yields minimal return.

Difficulty Index (D): Work-unit difficulty metric based on predictive uncertainty (model confidence), epistemic uncertainty (how well-explored the scenario space), verification cost (computational resources required), and branching factor (solution space complexity). Used for entropy-based pricing where harder problems (high $D$ ) consume more computational resources and justify higher costs. Enables transparent pricing models where cost correlates with problem complexity rather than flat per-query fees. Computed per work unit, aggregated across OPD to quantify operational difficulty profiles. Helps organizations understand where systems face challenges and where optimization efforts would have most impact.

Confidence Accounting: Framework for tracking and reporting decision confidence across capabilities and OPDs. Each decision receives quantized confidence score with explicit uncertainty. Aggregated per capability type (diagnosis, recommendation, assessment) and per OPD with distributional reporting (not just averages—full confidence distributions). Enables insurance-ready evidence by providing statistical basis for reliability claims. When confidence distributions shift (e.g., 95th percentile drops below threshold), triggers drift detection and escalation protocols. Supports systematic capability expansion: new capabilities start with conservative confidence requirements, expanding as evidence accumulates.

Frontier Expansion vs Movement: Two types of optimization improvements with fundamentally different resource costs. Movement along frontier trades one objective for another (sacrifice some accuracy for substantial empathy improvement) requiring moderate compute reallocation. Frontier expansion improves multiple objectives simultaneously (better accuracy AND empathy) requiring architectural innovations—better context engineering, improved reasoning strategies, or domain-specific fine-tuning—with high development cost. Evaluations reveal current frontier position; Agent Forge explores whether movement or expansion opportunities exist. Expansion shifts what's fundamentally achievable; movement optimizes within current constraints.

Dimensional Drift: Type of drift where functional dimensions themselves evolve—new dimensions discovered through temporal aggregation that drive outcomes, causing acceptance region $A_U$ to expand. Example: Nutrition coaching starts with (diet restrictions, budget, time) but over time discovers (emotional relationship with food, social eating context, stress patterns) through population analysis. Acceptance region expands to include newly discovered dimensions. Agents satisfying the original $A_U$ may no longer satisfy evolved $A_U$ with additional dimensions. Distinct from input drift (new scenarios arrive) or prediction drift (model degrades). Managed through macro-design loop where problem definition P evolves as understanding deepens.

Multi-Objective Optimization Target: In pattern discovery, the optimization target that accounts for correlated objectives simultaneously rather than a single scalar metric. The system optimizes admissibility margin $M_\alpha$ measuring robust satisfaction of acceptance region $A_U$ across all objectives. Traditional approaches maximize expected value; Amigo's approach maximizes $M_\alpha(a|C)$ which respects trade-offs between organization-specific dimensions. The system discovers through measurement which configurations improve margin across all objectives, how to navigate trade-offs when objectives correlate negatively, and when frontier expansion is possible versus movement required. This creates pressure toward balanced optimization rather than narrow maximization that sacrifices critical dimensions.

Iterated Distillation and Amplification (IDA)

A framework for systematically improving AI capabilities through iterative cycles. It involves two main phases:

Amplification Phase: Using significantly more computational resources (e.g., extended reasoning time, parallel processing, external tools, human feedback, large-scale simulation) to generate higher-quality outputs or problem solutions than the base model could achieve alone. This creates high-quality training data demonstrating superior performance.
Distillation Phase: Training a new, more efficient model to mimic the superior behavior demonstrated during the amplification phase, but using substantially fewer computational resources during operation. The goal is to internalize the improved capabilities. This cycle (Base Model -> Amplification -> Distillation -> New Base Model) can be repeated to achieve progressive performance gains.

Actions & Execution

Actions: The execution layer of Amigo's unified cognitive architecture representing quantum-level units through which agents affect and interact with external systems. Actions operate as intelligent primitives that can be dynamically composed and orchestrated based on context, spanning from high-entropy creative exploration to low-entropy deterministic execution while maintaining entropy stratification (see Entropy Control for formal definition).

Compositional Intelligence: The ability to combine simple action primitives into complex behaviors that exhibit emergent capabilities. Through Agent Forge's declarative framework, coding agents can programmatically create new action patterns by analyzing performance data and building sophisticated problem-solving architectures from fundamental building blocks.

Operational Patient Domain (OPD): Bounded operating specification defining where an AI system is authorized and capable of operating autonomously. Components: (1) Inclusions/exclusions—explicit scenarios within/outside system competence; (2) Capability confidence targets—required confidence levels per capability type; (3) Escalation policies—handoff protocols when confidence insufficient or scenario excluded; (4) Versioned artifacts—OPDs tracked as versioned specifications enabling controlled expansion. Implements regime-bounded validity by formalizing operational boundaries. Similar to how autonomous vehicles define operational design domains (highway vs city vs dirt road), OPDs define where AI systems maintain sufficient reliability. As dimensional discovery expands functional understanding, OPDs can be systematically expanded to adjacent domains where learned structure transfers.

Economic Work Units: Human-oriented coherent units of value-delivered economic work that solve real problems for organizations. These represent verifiable business outcomes that can be measured across multiple dimensions—both verifying sub-components are correct and assessing whether the overall deliverable meets the intended business value. Each work unit gets evaluated across organization-specific correlated objectives discovered through verification. Success is defined by an acceptance region—the work unit must satisfy all objectives simultaneously, not just one. Work units carry SLOs that formalize these multi-dimensional requirements, with violations triggering escalation. The acceptance region evolves as the system discovers which dimensions actually drive value delivery through the macro-design loop.

Action Primitives: Discrete capabilities that serve as building blocks for complex behaviors. Each primitive is optimized for its specific entropy level—whether handling tasks within the model's sweet spot or delegating to specialized computational methods for tasks outside its optimal range—and can be combined with others to create workflows that would be impossible with traditional rigid tooling.

Serverless Action Architecture: The execution model where actions deploy through serverless infrastructure with custom runtime environments, enabling elastic scaling, isolation boundaries, version management, and cost optimization while maintaining enterprise-grade security and reliability. Each action can specify its own computational environment, including specialized libraries, programming languages, and performance configurations optimal for its specific task.

Multi-Agent & Game-Theoretic Concepts

Strategic Manifold Sufficiency: Extension of dimensional sparsity principle to multi-agent environments. Agent $i$ 's representation $Z_i$ is strategically sufficient if outcome predictions conditioned on $Z_i$ and other agents' actions match predictions using full state: $P(Y|H_i, a_{-i}) \approx P(Y|Z_i, a_{-i})$ where $a_{-i}$ represents other agents' actions. Explains why effective coordination doesn't require modeling complete psychology of all participants—sufficient to model strategically relevant dimensions. Healthcare teams coordinate through shared functional understanding (patient state, treatment goals, constraints) without complete mutual models. Organizational AI systems achieve alignment through sparse shared representations rather than exhaustive world models.

PreviousEvolution Management NextIndustry Implementation Guides

Last updated 1 month ago

Was this helpful?

Good afternoon