Architecture Overview

What is keep?

keep is a reflective memory system. It gives agents a comprehensive tool for persistent indexing, tagging, entity relationship management, summarization, semantic and timeline analysis, and contextual recall. It is designed as an agent skill for Claude Code, OpenClaw, LangChain/LangGraph, and other agentic environments, so that agents can remember information across sessions.

Published by Hugh Pyle, "inguz ᛜ outcomes", under the MIT license. Contributions are welcome; code is conversation, and "right speech" is encouraged.


Core Concept

Every stored note has:

The original document content is not stored — only the summary and embedding.


Architecture

keep is layered. Surface clients (CLI, MCP, LangChain, Claude Desktop bundle) are thin wrappers that talk to a long-running daemon over HTTP. The daemon hosts a Keeper, which composes provider, store, action, and flow modules. Background work runs out-of-band on the daemon's queues.

┌────────────────────────────────────────────────────────────────────────┐
│  Surface clients                                                       │
│  ┌──────────┐  ┌─────────┐  ┌──────────────┐  ┌────────────────────┐   │
│  │ cli_app  │  │  mcp.py │  │ langchain/   │  │ mcpb.py (Claude    │   │
│  │ (typer)  │  │ (stdio) │  │ adapters     │  │ Desktop bundle)    │   │
│  └────┬─────┘  └────┬────┘  └──────┬───────┘  └──────┬─────────────┘   │
└───────┼─────────────┼──────────────┼─────────────────┼─────────────────┘
        │             │              │                 │
        │  HTTP (loopback, token-auth, host-header guarded)
        ▼             ▼              ▼                 ▼
┌────────────────────────────────────────────────────────────────────────┐
│  Daemon (daemon.py / daemon_server.py / daemon_client.py)              │
│  Routes:  /v1/notes, /v1/notes/{id}, /v1/notes/{id}/tags,              │
│           /v1/notes/{id}/context, /v1/search, /v1/flow,                │
│           /v1/analyze, /v1/ready, /v1/health, /v1/admin/*              │
└──────────────────────────────┬─────────────────────────────────────────┘
                               │
                               ▼
┌────────────────────────────────────────────────────────────────────────┐
│  Keeper (api.py)                                                       │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │  api.py:Keeper = ProviderLifecycleMixin                          │  │
│  │                + BackgroundProcessingMixin                       │  │
│  │                + SearchAugmentationMixin                         │  │
│  │                + ContextResolutionMixin                          │  │
│  └──────────────────────────────────────────────────────────────────┘  │
│  Implements high-level put/find/get/tag/move/delete/revert/analyze.    │
│  Many user-visible operations are dispatched through actions/.         │
│  Stable execution boundary: run_flow() over named state docs.          │
└────────────┬───────────────────┬────────────────────┬──────────────────┘
             │                   │                    │
             ▼                   ▼                    ▼
   ┌──────────────────┐ ┌─────────────────┐ ┌───────────────────────────┐
   │  Providers       │ │ Storage         │ │ Background work           │
   │  (providers/)    │ │ backends        │ │                           │
   │  embedding /     │ │ DocumentStore   │ │ pending_summaries.py      │
   │  summarization / │ │ (SQLite)        │ │ work_queue.py /           │
   │  document /      │ │ ChromaStore     │ │ work_processor.py         │
   │  media / OCR /   │ │ (vectors)       │ │ task_client.py            │
   │  analyzer        │ │ PendingQueue    │ │ (hosted delegation)       │
   │                  │ │ → backend.py    │ │ planner_stats.py          │
   └──────────────────┘ └─────────────────┘ └───────────────────────────┘

Layers

1. Surface clients

cli_app.py — Typer command app

The CLI is intentionally thin for ordinary note operations (put, get, find, tag, flow execution): it resolves shell concerns, sends daemon HTTP requests, and renders responses. Commands that need direct process control or local filesystem traversal remain CLI-owned for now: setup/config discovery, daemon lifecycle, MCP stdio startup, bulk directory ingestion, and data import/export. Those commands may construct a local Keeper or use local graph helpers, but that is an explicit exception to the daemon-backed command path.

mcp.py — MCP stdio server

langchain/ — Framework adapters

mcpb.py — Claude Desktop bundle

2. Daemon layer

daemon.py — Daemon entry point

daemon_server.py — HTTP query server

daemon_client.py — Daemon discovery and HTTP

The CLI and MCP layers each have their own retry-on-disconnect logic so they can gracefully follow a daemon that has restarted on a new port.
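The retry-on-disconnect idea can be sketched as follows. This is a minimal illustration, assuming a client rediscovers the daemon by re-reading the .processor.port and .processor.token files listed under Storage Layout. The function names and the injected send callable are hypothetical and are not keep's actual client API.

```python
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def read_endpoint(store_path: Path) -> tuple[int, str]:
    """Read the daemon's current port and auth token from the store directory."""
    port = int((store_path / ".processor.port").read_text().strip())
    token = (store_path / ".processor.token").read_text().strip()
    return port, token


def call_with_rediscovery(
    store_path: Path,
    send: Callable[[int, str], T],
    retries: int = 2,
) -> T:
    """Call send(port, token), re-reading the endpoint after each disconnect.

    `send` is a hypothetical callable that performs one HTTP request and
    raises ConnectionError when the daemon is unreachable.
    """
    attempts = max(1, retries + 1)
    for attempt in range(attempts):
        # Re-read on every attempt so a daemon restarted on a new port is found.
        port, token = read_endpoint(store_path)
        try:
            return send(port, token)
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last disconnect
```

A real client would probably wait briefly between attempts; the sketch omits the delay to keep the control flow visible.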

3. Keeper (core API)

api.py — Main facade

actions/ — Action implementations

A package of focused modules implementing user-visible operations behind Keeper/flows:

analyze        find_supernodes  ocr                 resolve_meta
auto_tag       generate         put                 resolve_stubs
delete         get              resolve_duplicates  stats
describe       list_parts       resolve_edges       summarize
extract_links  list_versions    traverse            tag
find           move                                 ...

Most are dispatched from state-doc flows; some are still called directly from Keeper methods during the migration to flows.

protocol.py — Abstract interfaces

flow_client.py — Shared flow-backed wrappers

remote.py — Remote client

4. Flow runtime (state docs)

The Keeper exposes run_flow(state, params, ...) as its stable execution boundary. A "state" is a named YAML state-doc that declares rules, predicates, and actions. The runtime evaluates them and dispatches actions from actions/ against the Keeper.
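To make the rule/predicate/action idea concrete, here is a toy dispatcher. It does not reproduce keep's state-doc schema or the real run_flow() signature, neither of which is shown in this document. Rules are plain Python tuples, and run_flow_sketch and every other name in it are hypothetical.

```python
from typing import Any, Callable

Context = dict[str, Any]
Predicate = Callable[[Context], bool]
Action = Callable[[Context], Context]


def run_flow_sketch(
    rules: list[tuple[Predicate, str]],
    actions: dict[str, Action],
    params: Context,
) -> tuple[Context, list[str]]:
    """Evaluate rules in order; dispatch the named action when a predicate holds."""
    ctx = dict(params)        # copy so the caller's params are not mutated
    trace: list[str] = []     # names of the actions that ran, in order
    for predicate, action_name in rules:
        if predicate(ctx):
            # Each action returns new keys that are merged into the context.
            ctx.update(actions[action_name](ctx))
            trace.append(action_name)
    return ctx, trace
```

The point of the pattern is that the rules are data: a flow can change which actions run without changing the dispatcher.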

state_doc.py — Loader, compiler, evaluator

state_doc_runtime.py — Synchronous runtime

system_docs.py / builtin_state_docs.py — System doc inventory

flow_env.py — Local flow execution environment

Flows that must complete before returning to the caller (find, get-context, deep-find) run synchronously in this runtime. Write-side flows can suspend and continue on the background work queue.

5. Background work

pending_summaries.py — Pending task queue

work_queue.py / work_processor.py — Direct work queue

processors.py — Content processing helpers

task_client.py / task_workflows.py — Hosted task delegation

planner_stats.py — Flow discriminator priors

recovery.py — DB recovery

6. Storage backends

document_store.py — Document persistence (local)

store.py — Vector persistence (local)

backend.py — Pluggable storage factory

paths.py / config.py — Paths and config

7. Providers

All providers register through providers/base.py:ProviderRegistry. The registry is populated lazily on first use so optional dependencies don't break startup.
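The benefit of lazy population can be shown with a small stand-in registry. This is not keep's ProviderRegistry; LazyRegistry and its "module:attribute" target strings are assumptions made for illustration. Registration only records where a provider lives, and the import happens on first lookup, so a missing optional dependency fails at use time and not at startup.

```python
import importlib
from typing import Any


class LazyRegistry:
    """Hypothetical registry that defers imports until a provider is requested."""

    def __init__(self) -> None:
        self._targets: dict[str, str] = {}   # name -> "module:attribute"
        self._loaded: dict[str, Any] = {}    # name -> resolved object

    def register(self, name: str, target: str) -> None:
        # Recording a target never imports anything, so startup cannot fail here.
        self._targets[name] = target

    def get(self, name: str) -> Any:
        if name not in self._loaded:
            module_name, _, attr = self._targets[name].partition(":")
            module = importlib.import_module(module_name)  # may raise ImportError
            self._loaded[name] = getattr(module, attr)
        return self._loaded[name]
```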

Embedding Providers

Generate vector representations for semantic search.

Dimension is determined by the model and must be consistent across indexing and queries. Embeddings are cached through providers/embedding_cache.py (embedding_cache.db).
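A sketch of how an embedding cache of this kind can work, assuming entries are keyed by a hash of the model name and the text. The table layout, the key scheme, and the class name are assumptions; the actual schema of embedding_cache.db is not described here. Including the model in the key keeps vectors from different models, which may have different dimensions, from being mixed.

```python
import hashlib
import json
import sqlite3
from typing import Callable


class EmbeddingCacheSketch:
    """Hypothetical SQLite-backed cache: (model, text) -> vector."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, vector TEXT NOT NULL)"
        )

    @staticmethod
    def _key(model: str, text: str) -> str:
        # The NUL separator keeps ("ab", "c") distinct from ("a", "bc").
        return hashlib.sha256(f"{model}\x00{text}".encode("utf-8")).hexdigest()

    def get_or_compute(
        self, model: str, text: str, embed: Callable[[str], list[float]]
    ) -> list[float]:
        key = self._key(model, text)
        row = self.db.execute(
            "SELECT vector FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return json.loads(row[0])        # cache hit: no provider call
        vector = embed(text)                 # cache miss: call the provider once
        self.db.execute(
            "INSERT INTO cache (key, vector) VALUES (?, ?)",
            (key, json.dumps(vector)),
        )
        self.db.commit()
        return vector
```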

openai also supports base_url for local or self-hosted OpenAI-compatible servers such as llama.cpp llama-server, vLLM, LM Studio, or LocalAI. That is distinct from the openrouter provider, which has its own model naming and headers even though both use the OpenAI SDK underneath.

Summarization Providers

Generate human-readable summaries from content.

Contextual Summarization. When documents have user tags (domain, topic, project, etc.), the summarizer receives context from related items. This produces summaries that highlight relevance to the tagged context rather than generic descriptions.

How it works:

  1. When processing pending summaries, the system checks for user tags
  2. Finds similar items that share any of those tags (OR-union)
  3. Boosts scores for items sharing multiple tags (+20% per additional match)
  4. Top 5 related summaries are passed as context to the LLM
  5. The summary reflects what's relevant to that context

Example: indexing a medieval text with domain=practice produces a summary highlighting its relevance to contemplative practice, not just "a 13th-century guide for anchoresses."
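The selection steps above can be sketched as a small ranking function. Two points are assumptions: a tag counts as shared only when both key and value match, and the 20% boost is applied multiplicatively to the base similarity score. Candidate and select_context are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    id: str
    summary: str
    score: float            # base similarity to the document being summarized
    tags: dict[str, str]


def select_context(
    candidates: list[Candidate],
    user_tags: dict[str, str],
    top_k: int = 5,
    boost: float = 0.20,
) -> list[Candidate]:
    """Keep candidates sharing at least one user tag, boost multi-tag matches."""
    ranked: list[tuple[float, Candidate]] = []
    for cand in candidates:
        shared = sum(1 for k, v in user_tags.items() if cand.tags.get(k) == v)
        if shared == 0:
            continue  # OR-union: at least one shared tag is required
        # +20% for each shared tag beyond the first (assumed multiplicative).
        ranked.append((cand.score * (1 + boost * (shared - 1)), cand))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [cand for _, cand in ranked[:top_k]]
```

The summaries of the returned candidates are what would be passed to the LLM as context.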

Tag changes trigger re-summarization. When user tags are added, removed, or changed on an existing document, it's re-queued for contextual summarization even if content is unchanged. The existing summary is preserved until the new one is ready.

Non-LLM providers (truncate, first_paragraph, passthrough) ignore context.

Document Providers

Fetch content from URIs with content regularization.

Content regularization:

Regularization ensures that both embedding and summarization receive clean text. Provider-extracted tags are merged with user tags; when both define the same key, the user's value wins.
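The merge rule for provider-extracted and user tags amounts to a dictionary update in which user tags are applied last. A minimal sketch, with merge_tags as a hypothetical helper name:

```python
def merge_tags(provider_tags: dict[str, str], user_tags: dict[str, str]) -> dict[str, str]:
    """Combine both tag sets; on a key collision the user's value wins."""
    merged = dict(provider_tags)
    merged.update(user_tags)   # later update overrides provider values
    return merged
```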

Content Extractor / OCR Providers

Extract text from scanned PDFs and images via optical character recognition.

OCR runs in the background via the pending queue (keep daemon), not during put(). The flow:

  1. During put(), content regularization detects scanned PDF pages (no extractable text) or image files
  2. A placeholder is stored immediately so the item is indexed right away
  3. The pages/image are enqueued for background OCR processing
  4. keep daemon picks up the OCR task, renders pages to images, runs OCR, cleans and scores the text
  5. The full OCR text replaces the placeholder and the item is re-embedded
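The placeholder-then-replace pattern in this flow can be sketched with an in-memory store and queue. Everything here is illustrative: OcrPipelineSketch, the placeholder wording, and the injected ocr callable are hypothetical, and the rendering, scoring, and re-embedding steps are left out.

```python
from collections import deque
from typing import Callable


class OcrPipelineSketch:
    """Toy model of deferred OCR: index a placeholder now, fill in text later."""

    def __init__(self) -> None:
        self.store: dict[str, str] = {}                        # doc id -> indexed text
        self.pending: deque[tuple[str, list[bytes]]] = deque()  # queued OCR tasks

    def put(self, doc_id: str, pages: list[bytes]) -> None:
        # Store a placeholder immediately so the item is visible right away.
        self.store[doc_id] = f"[OCR pending: {len(pages)} page(s)]"
        self.pending.append((doc_id, pages))

    def process_pending(self, ocr: Callable[[bytes], str]) -> int:
        """Drain the queue, replacing each placeholder with the OCR text."""
        done = 0
        while self.pending:
            doc_id, pages = self.pending.popleft()
            text = "\n".join(ocr(page).strip() for page in pages)
            self.store[doc_id] = text   # a real system would re-embed here
            done += 1
        return done
```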

Design points:

Media Description Providers (optional)

Generate text descriptions from media files, enriching metadata-only content.

Media description runs in Keeper.put() between fetch and upsert. Descriptions are appended to the metadata content before embedding/summarization, making media files semantically searchable by their visual or audio content.

Design points:

Analyzer Providers

Decompose content into structural parts with their own summaries, tags, and embeddings (analyzers.py + providers/base.py:AnalyzerProvider).

Parts are produced by analyze() and stored as their own rows in document_parts, with vectors at {id}@p{N} in the vector store.
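The {id}@p{N} convention for part vectors, and the matching id@vN form that the revert flow uses for archived versions, can be handled with a pair of small helpers. These functions are hypothetical and only illustrate the naming scheme.

```python
import re
from typing import Optional

# Matches a trailing "@p<N>" (part) or "@v<N>" (version) suffix.
_SUFFIX = re.compile(r"^(?P<base>.+)@(?P<kind>[pv])(?P<num>\d+)$")


def part_vector_id(doc_id: str, part_num: int) -> str:
    """Vector-store key for part N of a document."""
    return f"{doc_id}@p{part_num}"


def version_vector_id(doc_id: str, version: int) -> str:
    """Vector-store key for archived version N of a document."""
    return f"{doc_id}@v{version}"


def parse_vector_id(vector_id: str) -> tuple[str, Optional[str], Optional[int]]:
    """Split a key into (base id, 'p' or 'v' or None, number or None)."""
    match = _SUFFIX.match(vector_id)
    if match is None:
        return vector_id, None, None   # a plain current-document key
    return match["base"], match["kind"], int(match["num"])
```

Because the pattern is anchored at the end and the base is matched greedily, a document id that itself contains "@" still parses correctly.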

Other provider modules


Storage Layout

store_path/                   # default: ~/.keep
├── keep.toml                 # Provider and store configuration
├── documents.db              # SQLite: summaries, tags, versions, parts, FTS
├── chroma/                   # ChromaDB persistence (vectors + metadata)
├── pending_summaries.db      # Pending queue (summarize/embed/ocr/reindex/analyze)
├── continuation.db           # Direct work queue + flow continuations
├── embedding_cache.db        # SQLite cache for embeddings
├── planner_stats.db          # Flow planner priors
├── .processor.pid            # Daemon PID file
├── .processor.token          # Daemon HTTP auth token
├── .processor.port           # Daemon HTTP port
├── .processor.version        # Code version the daemon was started under
└── keep-ops.log[.N]          # Persistent operations log (rotating)

documents.db contains the documents, document_versions, and document_parts tables (plus FTS shadow tables). The Chroma directory uses ChromaDB's own on-disk format (sqlite + parquet segment files); keep does not impose its own structure on it.


Data Flow

Indexing: put(uri=…) or put(content=…)

URI or content
    │
    ▼
┌─────────────────┐
│ Fetch / use     │ ← DocumentProvider (for URIs only)
│ input           │
└────────┬────────┘
         │ raw bytes
         ▼
┌─────────────────┐
│ Content         │ ← Extract text from HTML/PDF/DOCX/PPTX
│ regularization  │   (scripts/styles removed; scanned pages flagged)
└────────┬────────┘
         │ clean text (+ OCR page list if scanned)
         ▼
┌─────────────────┐
│ Media           │ ← Optional: vision description (images)
│ enrichment      │   or transcription (audio) appended
└────────┬────────┘
         │ enriched text
         ▼
┌──────────────────────────────────────────────┐
│ DocumentStore.upsert + placeholder summary   │
│ - tags, timestamps, content hash             │
│ - previous version archived if updated       │
└─────────────┬────────────────────────────────┘
              │
              ├─► PendingQueue.enqueue("summarize")
              ├─► PendingQueue.enqueue("embed")
              └─► PendingQueue.enqueue("ocr")  (if scanned PDF or image)
                                  │
                                  ▼
                        ┌──────────────────────┐
                        │ Background processor │
                        │ (pending_summaries / │
                        │  work_processor)     │
                        └──────────┬───────────┘
                                   │
                ┌──────────────────┼──────────────────┐
                │                  │                  │
                ▼                  ▼                  ▼
           summarize()         embed()             OCR
                │                  │                  │
                ▼                  ▼                  ▼
        DocumentStore.        VectorStore.       DocumentStore
        update_summary        upsert /           re-summarize +
                              upsert_version     re-embed

Versioning on update

Embedding dedup

Retrieval: find(query)

query text
    │
    ▼
  embed()  ← EmbeddingProvider
    │
    │ query vector
    ▼
┌───────────────────┐
│ VectorStore       │
│ query_embedding() │ ← cosine similarity search
└─────────┬─────────┘
          │
          ▼ results with distance scores
    ┌──────────────┐
    │ Apply decay  │ ← Recency weighting (ACT-R style):
    └──────┬───────┘   score × 0.5^(days/half_life)
           │
           ▼
    ┌──────────────┐
    │ Date filter  │ ← Optional --since / --until
    └──────┬───────┘
           │
           ▼
    ┌────────────────────────────┐
    │ Augmentation               │ ← deep follow, RRF fusion,
    │ (SearchAugmentationMixin)  │   tag boosts (when applicable)
    └──────┬─────────────────────┘
           │
           ▼
    list[Item] (sorted by effective score)
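The decay step in the diagram is an exponential half-life weighting: a result loses half its score for every half_life days of age. A minimal sketch of that formula follows; the function name and the handling of a non-positive half-life are assumptions.

```python
def decayed_score(score: float, age_days: float, half_life_days: float) -> float:
    """Apply recency weighting: score * 0.5 ** (age_days / half_life_days)."""
    if half_life_days <= 0:
        return score   # assumption: a non-positive half-life disables decay
    return score * 0.5 ** (age_days / half_life_days)
```

With a 30-day half-life, an item scored 0.8 that is 30 days old ends up at 0.4, and a 60-day-old item at 0.2.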

find is also reachable via the flow runtime (find / find-deep state docs), which is the path used by MCP and the LangChain retriever.

Analyze: analyze(id)

content
    │
    ▼
┌──────────────────────┐
│ AnalyzerProvider     │ ← SlidingWindowAnalyzer (default) or
│ analyze(chunks, …)   │   SinglePassAnalyzer
└──────────┬───────────┘
           │ list[{summary, tags}]
           ▼
┌──────────────────────┐
│ Keeper.analyze       │ ← Wraps into PartInfo, persists, embeds
└──────────┬───────────┘
           │
           ├─► DocumentStore.upsert_part  (rows in document_parts)
           └─► VectorStore.upsert_part    ({id}@p{N})

Delete / Revert

delete(id) is a flat removal:

delete(id)
    │
    ▼
DocumentStore.delete + VectorStore.delete
(versions removed by default; pass delete_versions=False to keep history)

revert(id) is a separate operation that restores the previous version, or falls back to delete(id) when there is no history:

revert(id)
    │
    ▼
  max_version(id)
    │
    ├── 0 versions → delete(id)
    │
    └── N versions → restore previous
            │
            ├─ get archived embedding from VectorStore (id@vN)
            ├─ DocumentStore.restore_latest_version()
            │    (promote latest version row to current, delete version row)
            ├─ VectorStore.upsert restored embedding as current
            ├─ VectorStore.delete versioned entry (id@vN)
            └─ delete stale parts (parts of the discarded version)
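The revert steps can be traced with plain dictionaries standing in for the two stores. This is a sketch under stated assumptions: archived versions are numbered 1..N with N the most recent, vector keys follow the id@vN and id@pN forms shown above, and revert_sketch is a hypothetical name, not Keeper.revert().

```python
def revert_sketch(
    doc_id: str,
    docs: dict[str, str],                # current document rows
    versions: dict[str, list[str]],      # archived rows, oldest first
    vectors: dict[str, list[float]],     # vector store keyed by id, id@vN, id@pN
) -> str:
    """Restore the previous version, or delete when there is no history."""
    history = versions.get(doc_id, [])
    if not history:
        docs.pop(doc_id, None)
        vectors.pop(doc_id, None)
        return "deleted"

    versioned_key = f"{doc_id}@v{len(history)}"
    restored_vector = vectors[versioned_key]   # fetch the archived embedding
    docs[doc_id] = history.pop()               # promote latest archived row
    vectors[doc_id] = restored_vector          # restored embedding becomes current
    del vectors[versioned_key]                 # drop the versioned entry
    # Parts belonged to the discarded version, so they are stale now.
    for key in [k for k in vectors if k.startswith(f"{doc_id}@p")]:
        del vectors[key]
    return "restored"
```

Reusing the archived embedding means the restored version does not have to be re-embedded.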

delete_version(id, offset) removes a specific archived version by public selector (1=previous, -1=oldest archived, etc.).
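One plausible reading of the selector is that positive values count back from the most recent archived version and negative values count forward from the oldest. The sketch below implements that reading; the interpretation of values beyond 1 and -1, and the function name, are assumptions.

```python
def resolve_version_selector(archived: list[int], selector: int) -> int:
    """Map a public selector to a version number.

    `archived` lists archived version numbers ordered oldest to newest.
    1 = previous (newest archived), 2 = the one before it, and so on;
    -1 = oldest archived, -2 = second oldest, and so on.
    """
    if selector == 0:
        raise ValueError("selector 0 would name the current version")
    if selector > 0:
        index = len(archived) - selector
    else:
        index = -selector - 1
    if not 0 <= index < len(archived):
        raise IndexError(f"no archived version for selector {selector}")
    return archived[index]
```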


Key Design Decisions

1. Schema as Data

2. Daemon-mediated state

3. Lazy Provider Loading

4. Separation of Concerns

5. No Original Content Storage

6. Immutable Items

7. System Tag Protection

8. Document Versioning

9. Version-Based Addressing

10. Flow as the stable boundary


LangChain / LangGraph Integration

The keep.langchain module provides framework adapters on top of the API layer:

┌─────────────────────────────────────────────────────────────┐
│  LangChain Layer (keep/langchain/)                          │
│  - KeepStore           LangGraph BaseStore adapter          │
│  - KeepNotesToolkit    LangChain tools                      │
│  - KeepNotesRetriever  BaseRetriever with now-context       │
│  - KeepNotesMiddleware LCEL runnable for auto-injection     │
└──────────────────┬──────────────────────────────────────────┘
                   │ uses Keeper API
                   ▼
┌─────────────────────────────────────────────────────────────┐
│  Keeper (api.py) → daemon → store                           │
└─────────────────────────────────────────────────────────────┘

KeepStore maps LangGraph's namespace/key model to keep's tag system via configurable namespace_keys. Namespace components become regular keep tags, visible to CLI and all query methods. Tag filtering is a pre-filter on the vector search, making tags suitable for data isolation (per-user, per-project). See LANGCHAIN-INTEGRATION.md.
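The namespace mapping and the pre-filter idea can be sketched in a few lines, assuming namespace components are matched to namespace_keys by position. Both helper names are hypothetical, and the item dictionaries stand in for stored notes.

```python
from typing import Any


def namespace_to_tags(namespace: tuple[str, ...], namespace_keys: list[str]) -> dict[str, str]:
    """Turn a LangGraph-style namespace tuple into tag key/value pairs."""
    if len(namespace) > len(namespace_keys):
        raise ValueError("namespace has more components than namespace_keys")
    return dict(zip(namespace_keys, namespace))


def prefilter(items: list[dict[str, Any]], required: dict[str, str]) -> list[dict[str, Any]]:
    """Keep only items carrying every required tag, before any similarity ranking."""
    return [
        item for item in items
        if all(item["tags"].get(k) == v for k, v in required.items())
    ]
```

Because the filter runs before ranking, an item outside the namespace can never appear in the results, however similar it is to the query.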


Extension Points

New Embedding or Summarization Provider

  1. Implement the provider protocol (EmbeddingProvider or SummarizationProvider) from providers/base.py
  2. Register it in the provider registry (typically by importing your module so its register_* calls run)
  3. Reference the provider by name in keep.toml

New Analyzer

New Store Backend

New Flow / State Doc

Framework Integration