Memory System

Yunque Agent features a 5-layer memory architecture inspired by human cognitive science. Memory is the foundation for the Agent to learn from interactions and build long-term understanding.

Architecture

┌─────────────────────────────────────────────────────┐
│             Memory Orchestrator                      │
│  Ingest → Extract → Classify → Store → Promote → Decay │
├────────┬────────┬─────────┬──────────┬──────────────┤
│ Short  │  Mid   │  Long   │ Editable │ Knowledge    │
│ (conv) │ (fact) │ (rule)  │ (pinned) │   Graph      │
│ TTL    │ Decay  │ Perm.   │ Manual   │ Entity+Rel   │
│ 1 hr   │ Days   │ Forever │          │              │
├────────┴────────┴─────────┴──────────┴──────────────┤
│            Unified Recall Pipeline (5-stage)         │
├─────────────────────────────────────────────────────┤
│      Persistence: LedgerPersister + LedgerOrchPersister │
│      Storage: SQLite memories table + Ledger KV      │
└─────────────────────────────────────────────────────┘

Five Layers

1. Short-Term Memory

Holds the current conversation context. Entries expire automatically after the session ends (default TTL: 1 hour).
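
As a rough illustration of TTL-based expiry, the sketch below keeps an expiry timestamp next to each entry and evicts lazily on read. The class name `ShortTermStore` and its methods are hypothetical and are not taken from Yunque Agent; only the 1-hour default comes from the text above.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ShortTermStore:
    """Hypothetical in-memory store: entries expire ttl_seconds after being written."""

    ttl_seconds: float = 3600.0  # 1 hour, the default TTL stated above
    _items: dict = field(default_factory=dict)

    def put(self, key, value, now=None):
        # Store the value together with its absolute expiry time.
        now = time.time() if now is None else now
        self._items[key] = (value, now + self.ttl_seconds)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            # Lazily evict expired entries when they are read.
            del self._items[key]
            return None
        return value
```

The optional `now` parameter exists only to make the expiry behaviour easy to exercise without waiting for real time to pass.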

2. Mid-Term Memory

Facts and observations extracted asynchronously from conversations. Confidence decays over time, so infrequently accessed memories gradually fade. Importance is evaluated with TF-IDF.
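
One common way to model this kind of fading is exponential decay with a half-life, measured from the last access. The text above does not specify the decay curve, so the function name, the exponential form, and the 7-day half-life below are all assumptions made for illustration.

```python
SECONDS_PER_DAY = 86400.0


def decayed_confidence(confidence, seconds_since_access, half_life_days=7.0):
    """Return confidence after exponential decay (assumed model, not Yunque's actual one).

    The confidence halves every half_life_days since the memory was last accessed.
    """
    half_life_seconds = half_life_days * SECONDS_PER_DAY
    return confidence * 0.5 ** (seconds_since_access / half_life_seconds)
```

Under this model, a memory that is accessed often has its clock reset frequently and keeps its confidence, while one that is never touched approaches zero.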

3. Long-Term Memory

Permanent knowledge the Agent has learned about users, domains, or general patterns. Never expires.

4. Editable Memory

Memories manually added, modified, or deleted by users via the Web dashboard. These have the highest priority and are always ranked first in recall results.
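
The "always ranked first" rule can be pictured as a two-part sort key: layer first, score second. This is a minimal sketch; the dictionary keys `layer` and `score` and the layer label `"editable"` are hypothetical names chosen for the example.

```python
def rank_results(results):
    """Sort recall results so editable memories come first, then by descending score.

    `results` is a list of dicts with hypothetical keys 'layer' and 'score'.
    """
    # False sorts before True, so editable entries lead regardless of score.
    return sorted(results, key=lambda r: (r["layer"] != "editable", -r["score"]))
```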

5. Knowledge Graph

Entity-relation graph capturing structured relationships between concepts, people, topics, and tasks. Supports multi-hop reasoning queries.
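
A multi-hop query can be thought of as a bounded breadth-first traversal over entity-relation edges. The sketch below is an assumption about how such a query might work, not Yunque's implementation; `KnowledgeGraph`, `add_relation`, and `neighbors_within` are hypothetical names.

```python
from collections import defaultdict, deque


class KnowledgeGraph:
    """Hypothetical directed entity-relation graph stored as adjacency lists."""

    def __init__(self):
        self._edges = defaultdict(list)  # subject -> [(relation, object)]

    def add_relation(self, subject, relation, obj):
        self._edges[subject].append((relation, obj))

    def neighbors_within(self, start, max_hops):
        """Return {entity: hop_count} for entities reachable within max_hops."""
        seen = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if seen[node] == max_hops:
                continue  # do not expand beyond the hop limit
            for _relation, target in self._edges.get(node, []):
                if target not in seen:
                    seen[target] = seen[node] + 1
                    queue.append(target)
        del seen[start]  # the start entity is not its own neighbor
        return seen
```

For example, with the relations `alice -works_on-> project_x` and `project_x -uses-> sqlite`, a 2-hop query from `alice` reaches both `project_x` and `sqlite`.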

Recall Pipeline

When the Agent needs to recall information, it runs a 5-stage pipeline:

1. Query Embedding: Convert the query to a vector (multiple embedding models supported)
2. Vector Search: ANN search across all memory layers (auto-switches between IVF, HNSW, and brute force)
3. BM25 Keyword Retrieval: Sparse keyword matching for hybrid retrieval, with character-level segmentation for CJK text
4. Score Fusion: RRF (Reciprocal Rank Fusion) merges the vector and BM25 results
5. Dedup & Rank: Remove duplicates, sort by final score, and optionally rerank (Jina/Cohere/LLM)
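
The score fusion stage can be sketched with the standard RRF formula, where each document scores the sum of `1 / (k + rank)` over every ranking it appears in. The function name `rrf_fuse` is hypothetical, and `k = 60` is the constant commonly used with RRF; whether Yunque Agent uses the same value is an assumption.

```python
def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of IDs with Reciprocal Rank Fusion.

    `rankings` is a list of lists, each ordered best-first.
    Returns the fused IDs ordered by descending RRF score.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by score, breaking ties by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Because scores are accumulated per ID, a memory returned by both vector search and BM25 appears once in the output, which also covers the simplest form of deduplication. RRF uses only ranks, so the raw vector and BM25 scores never need to be put on a common scale.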

Memory Promotion

Memories can be "promoted" from lower to higher layers:

Frequently accessed mid-term memories → auto-promote to long-term
LLM fact extraction results → classified by importance score
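
Access-based promotion can be pictured as a counter with a threshold. The sketch below is illustrative only: the `Memory` fields, the layer labels, and the threshold of 5 accesses are hypothetical, since the text above does not state how "frequently accessed" is measured.

```python
from dataclasses import dataclass

PROMOTE_AFTER_ACCESSES = 5  # hypothetical threshold, not a documented default


@dataclass
class Memory:
    text: str
    layer: str = "mid"
    access_count: int = 0


def record_access(memory):
    """Count one access and promote a mid-term memory once it crosses the threshold."""
    memory.access_count += 1
    if memory.layer == "mid" and memory.access_count >= PROMOTE_AFTER_ACCESSES:
        memory.layer = "long"
    return memory
```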

API

Method	Path	Description
GET	`/v1/memory/stats`	Per-layer statistics
GET	`/v1/memory/search?q=...`	Hybrid search
POST	`/v1/memory/add`	Add memory
POST	`/v1/memory/compact`	Compact and optimize
GET	`/v1/graph/entities`	Graph entities
GET	`/v1/graph/relations`	Graph relations
DELETE	`/v1/memory/:id`	Delete memory
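
As a minimal sketch of calling two of these endpoints from Python's standard library, the helpers below build the requests without sending them. The base URL `http://localhost:8080` and the helper names are assumptions; the paths and the `q` parameter come from the table above. Authentication and response formats are not covered here.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # hypothetical host and port


def search_request(query):
    """Build a GET request for the hybrid search endpoint."""
    return Request(f"{BASE_URL}/v1/memory/search?{urlencode({'q': query})}", method="GET")


def delete_request(memory_id):
    """Build a DELETE request for a single memory, escaping the ID for use in a path."""
    return Request(f"{BASE_URL}/v1/memory/{quote(str(memory_id), safe='')}", method="DELETE")
```

Passing either request to `urllib.request.urlopen` would send it to a running instance.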
