Architecture Overview

What is keep?

keep is a reflective memory system providing persistent storage with vector similarity search. It's designed as an agent skill for Claude Code, OpenClaw, LangChain/LangGraph, and other agentic environments, enabling agents to remember information across sessions.

Think of it as: vector search + embeddings + summarization + tagging wrapped in a simple API.

Published by Hugh Pyle, "inguz ᛜ outcomes", under the MIT license. Contributions are welcome; code is conversation, "right speech" is encouraged.


Core Concept

Every stored item has a stable id, a summary, an embedding vector, tags, timestamps, and a content hash. The original document content is not stored — only the summary, embedding, and this metadata.


Architecture

┌─────────────────────────────────────────────────────────────┐
│  API Layer (api.py)                                         │
│  - Keeper class                                             │
│  - High-level operations: put(), find(), get()              │
│  - Version management: get_version(), list_versions()       │
│  - Structural analysis: analyze()                           │
└──────────────────┬──────────────────────────────────────────┘
                   │
        ┌──────────┼──────────┬──────────┬──────────┬───────────┐
        │          │          │          │          │           │
        ▼          ▼          ▼          ▼          ▼           ▼
   ┌────────┐ ┌─────────┐ ┌────────┐ ┌────────┐ ┌─────────┐ ┌─────────┐
   │Document│ │Embedding│ │Summary │ │Media   │ │Vector   │ │Document │
   │Provider│ │Provider │ │Provider│ │Descr.  │ │Store    │ │Store    │
   └────────┘ └─────────┘ └────────┘ └────────┘ └─────────┘ └─────────┘
       │          │           │          │             │           │
   fetch()    embed()    summarize()  describe()  vectors/    summaries/
   from URI   text→vec  text→summary  media→text  search      versions

Components

api.py — Main facade

protocol.py — Abstract interfaces

store.py — Vector persistence (local)

document_store.py — Document persistence (local)

backend.py — Pluggable storage factory

remote.py — Remote client

config.py — Configuration

pending_summaries.py — Background work queue

types.py — Data model
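The data model in types.py is not reproduced here; a plausible shape, inferred from the fields the two stores persist in the diagrams below (summary, tags, timestamps, content hash, embedding), might look like this. Field names are illustrative, not copied from types.py.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)  # items are immutable (design decision 5)
class Item:
    id: str
    summary: str
    embedding: list[float]
    tags: dict[str, str] = field(default_factory=dict)
    content_hash: str = ""
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    updated_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    version: int = 0  # archived copies are addressable as id@vN

item = Item(id="note-1", summary="A short note", embedding=[0.1, 0.2])
print(item.id)  # → note-1
```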


Data Flow

Indexing: put(uri=...) or put(content)

URI or content
    │
    ▼
┌─────────────────┐
│ Fetch/Use input │ ← DocumentProvider (for URIs only)
└────────┬────────┘
         │ raw bytes
         ▼
┌─────────────────┐
│ Content Regular-│ ← Extract text from HTML/PDF/DOCX/PPTX
│ ization         │   (scripts/styles removed)
└────────┬────────┘
         │ clean text (+ OCR page list if scanned)
         ▼
┌─────────────────┐
│ Media Enrichment│ ← Optional: vision description (images)
│ (if configured) │   or transcription (audio) appended
└────────┬────────┘
         │ enriched text
    ┌────┴────┬─────────────┐
    │         │             │
    ▼         ▼             ▼
  embed()  summarize()   tags (from args)
    │         │             │
    └────┬────┴─────────────┘
         │
    ┌────┴────────────────┐
    │                     │
    ▼                     ▼
┌─────────────────┐  ┌─────────────────┐
│ DocumentStore   │  │ VectorStore     │
│ upsert()        │  │ upsert()        │
│ - summary       │  │ - embedding     │
│ - tags          │  │ - summary       │
│ - timestamps    │  │ - tags          │
│ - content hash  │  │ - version embed │
│ - archive prev  │  │                 │
└─────────────────┘  └─────────────────┘
         │
         ▼ (if scanned PDF or image)
┌─────────────────────────────────┐
│ Background OCR (keep pending)   │
│ Placeholder stored immediately; │
│ OCR text replaces it + re-embeds│
└─────────────────────────────────┘

Versioning on update: when put() targets an existing id with changed content, the current summary and embedding are archived first (a document_versions row plus an id@vN entry in the VectorStore) before the new version becomes current.

Embedding dedup: embeddings are cached by content hash in embedding_cache.db, so re-indexing unchanged content does not re-compute the embedding.

Retrieval: find(query)

query text
    │
    ▼
  embed()  ← EmbeddingProvider
    │
    │ query vector
    ▼
┌───────────────────┐
│ VectorStore       │
│ query_embedding() │ ← cosine similarity search
└─────────┬─────────┘
          │
          ▼ results with distance scores
    ┌──────────────┐
    │ Apply decay  │ ← Recency weighting (ACT-R style)
    │ score × 0.5^(days/half_life)
    └──────┬───────┘
           │
           ▼
    ┌──────────────┐
    │ Date filter  │ ← Optional --since / --until
    └──────┬───────┘
           │
           ▼
    list[Item] (sorted by effective score)
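The decay step can be written directly from the formula in the diagram. The 30-day default half-life here is an assumption, not keep's actual default.

```python
def effective_score(similarity: float, age_days: float,
                    half_life_days: float = 30.0) -> float:
    """Recency weighting (ACT-R style): the score halves every half-life."""
    return similarity * 0.5 ** (age_days / half_life_days)

# A fresh but weaker match can outrank an old strong one:
print(effective_score(0.9, age_days=90))  # 0.9 × 0.5³ → 0.1125
print(effective_score(0.6, age_days=0))   # no decay   → 0.6
```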

Delete / Revert: delete(id) or revert(id)

delete(id)
    │
    ▼
  version_count(id)
    │
    ├── 0 versions → full delete from both stores
    │
    └── N versions → revert to previous
            │
            ├─ get archived embedding from VectorStore (id@vN)
            ├─ restore_latest_version() in DocumentStore
            │    (promote latest version row to current, delete version row)
            ├─ upsert restored embedding as current in VectorStore
            └─ delete versioned entry (id@vN) from VectorStore
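The branch above can be modeled with two stub stores; the store method names follow the diagram but are otherwise assumptions.

```python
class StubDocStore:
    """Just enough DocumentStore surface for the sketch (names assumed)."""
    def __init__(self):
        self.current = {"note": "v2 summary"}
        self.versions = {"note": ["v1 summary"]}
    def version_count(self, doc_id):
        return len(self.versions.get(doc_id, []))
    def delete(self, doc_id):
        self.current.pop(doc_id, None)
    def restore_latest_version(self, doc_id):
        # Promote the latest archived row to current, dropping the version row.
        self.current[doc_id] = self.versions[doc_id].pop()

class StubVecStore(dict):
    def upsert(self, key, value): self[key] = value
    def delete(self, key): self.pop(key, None)

def delete_or_revert(doc_id, docs, vecs):
    n = docs.version_count(doc_id)
    if n == 0:                          # no history: full delete, both stores
        docs.delete(doc_id)
        vecs.delete(doc_id)
        return "deleted"
    archived = vecs[f"{doc_id}@v{n}"]   # archived embedding
    docs.restore_latest_version(doc_id)
    vecs.upsert(doc_id, archived)       # restored embedding becomes current
    vecs.delete(f"{doc_id}@v{n}")       # drop the versioned entry
    return f"reverted to v{n}"

docs, vecs = StubDocStore(), StubVecStore()
vecs.update({"note": [0.2], "note@v1": [0.1]})
print(delete_or_revert("note", docs, vecs))  # → reverted to v1
print(docs.current["note"])                  # → v1 summary
```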

Key Design Decisions

1. Schema as Data

2. Lazy Provider Loading

3. Separation of Concerns

4. No Original Content Storage

5. Immutable Items

6. System Tag Protection

7. Document Versioning

8. Version-Based Addressing


Storage Layout

store_path/
├── keep.toml               # Provider configuration
├── chroma/                 # ChromaDB persistence (vectors + metadata)
│   └── [collection]/       # One collection = one namespace
│       ├── embeddings
│       ├── metadata
│       └── documents
├── document_store.db       # SQLite store (summaries, tags, versions, parts)
│   ├── documents           # Current version of each document
│   ├── document_versions   # Archived previous versions
│   └── parts               # Structural decomposition (from analyze)
└── embedding_cache.db      # SQLite cache for embeddings

Provider Types

Embedding Providers

Generate vector representations for semantic search.

Dimension determined by model. Must be consistent across indexing and queries.
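For example, a store indexed with a 384-dimension model cannot be queried with a 768-dimension one. A guard might look like this (illustrative, not keep's code):

```python
def check_dimension(store_dim: int, provider_dim: int) -> None:
    # Mixing models with different dimensions silently breaks similarity
    # scores, so fail fast and demand a re-index.
    if store_dim != provider_dim:
        raise ValueError(
            f"store was built with {store_dim}-d vectors but the provider "
            f"produces {provider_dim}-d; re-index with a single model")

check_dimension(384, 384)  # consistent: no error
```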

Summarization Providers

Generate human-readable summaries from content.

Contextual Summarization:

When documents have user tags (domain, topic, project, etc.), the summarizer receives context from related items. This produces summaries that highlight relevance to the tagged context rather than generic descriptions.

How it works:

  1. When processing pending summaries, the system checks for user tags
  2. Finds similar items that share any of those tags (OR-union)
  3. Boosts scores for items sharing multiple tags (+20% per additional match)
  4. Top 5 related summaries are passed as context to the LLM
  5. The summary reflects what's relevant to that context
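Steps 2–4 can be sketched as a scoring function; the candidate shape and field names are assumptions.

```python
def related_context(item_tags: dict[str, str],
                    candidates: list[dict], top_n: int = 5) -> list[str]:
    """OR-union on tags, +20% per additional shared tag, top N summaries."""
    scored = []
    for cand in candidates:
        shared = sum(1 for k, v in item_tags.items()
                     if cand["tags"].get(k) == v)
        if shared == 0:
            continue                      # OR-union: at least one tag matches
        boost = 1.0 + 0.2 * (shared - 1)  # +20% per additional shared tag
        scored.append((cand["score"] * boost, cand["summary"]))
    scored.sort(reverse=True)
    return [summary for _, summary in scored[:top_n]]

candidates = [
    {"tags": {"domain": "practice"}, "score": 0.8, "summary": "A"},
    {"tags": {"domain": "practice", "topic": "prayer"}, "score": 0.7, "summary": "B"},
    {"tags": {"topic": "cooking"}, "score": 0.9, "summary": "C"},
]
print(related_context({"domain": "practice", "topic": "prayer"}, candidates))
# → ['B', 'A']  (B's two shared tags boost 0.7 past A's unboosted 0.8)
```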

Example: Indexing a medieval text with domain=practice produces a summary highlighting its relevance to contemplative practice, not just "a 13th-century guide for anchoresses."

Tag changes trigger re-summarization: When user tags are added, removed, or changed on an existing document, it's re-queued for contextual summarization even if content is unchanged. The existing summary is preserved until the new one is ready.

Non-LLM providers (truncate, first_paragraph, passthrough) ignore context.

Document Providers

Fetch content from URIs with content regularization.

Content Regularization:

Raw fetched content is normalized to clean text before indexing (HTML/PDF/DOCX/PPTX extraction, scripts and styles removed, as in the Data Flow diagram), so both embedding and summarization receive clean text. Provider-extracted tags merge with user tags; user tags win on collision.

Content Extractor / OCR Providers

Extract text from scanned PDFs and images via optical character recognition.

OCR runs in the background via the pending queue (keep pending), not during put(). The flow:

  1. During put(), content regularization detects scanned PDF pages (no extractable text) or image files
  2. A placeholder is stored immediately so the item is indexed right away
  3. The pages/image are enqueued for background OCR processing
  4. keep pending picks up the OCR task, renders pages to images, runs OCR, cleans and scores the text
  5. The full OCR text replaces the placeholder and the item is re-embedded
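A minimal model of steps 2–5 (placeholder, queue, background replacement); all names here are illustrative.

```python
import queue

PLACEHOLDER = "[scanned document: OCR pending]"

pending: queue.Queue = queue.Queue()
items: dict[str, str] = {}

def put_scanned(doc_id: str, page_images: list[bytes]) -> None:
    # Step 2: index a placeholder immediately so the item exists right away.
    items[doc_id] = PLACEHOLDER
    # Step 3: enqueue the pages for background OCR.
    pending.put((doc_id, page_images))

def run_pending(ocr) -> None:
    # Steps 4–5: what `keep pending` would do — run OCR per page, then
    # replace the placeholder (the real system also re-embeds here).
    while not pending.empty():
        doc_id, pages = pending.get()
        items[doc_id] = "\n".join(ocr(page) for page in pages)

put_scanned("scan-1", [b"page1", b"page2"])
assert items["scan-1"] == PLACEHOLDER
run_pending(ocr=lambda page: f"text of {page!r}")
print(items["scan-1"].splitlines()[0])  # → text of b'page1'
```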

Design points:

Media Description Providers (optional)

Generate text descriptions from media files, enriching metadata-only content.

Media description runs in Keeper.put() between fetch and upsert. Descriptions are appended to the metadata content before embedding/summarization, making media files semantically searchable by their visual or audio content.

Design points:


LangChain / LangGraph Integration

The keep.langchain module provides framework adapters on top of the API layer:

┌─────────────────────────────────────────────────────────────┐
│  LangChain Layer (keep/langchain/)                          │
│  - KeepStore         LangGraph BaseStore adapter            │
│  - KeepNotesToolkit  4 LangChain tools                      │
│  - KeepNotesRetriever  BaseRetriever with now-context       │
│  - KeepNotesMiddleware  LCEL runnable for auto-injection    │
└──────────────────┬──────────────────────────────────────────┘
                   │ uses Keeper API
                   ▼
┌─────────────────────────────────────────────────────────────┐
│  API Layer (api.py)                                         │

KeepStore maps LangGraph's namespace/key model to Keep's tag system via configurable namespace_keys. Namespace components become regular Keep tags, visible to CLI and all query methods. Tag filtering is a pre-filter on the vector search, making tags suitable for data isolation (per-user, per-project). See LANGCHAIN-INTEGRATION.md.
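A sketch of that mapping, assuming a simple positional pairing of namespace_keys to namespace components (the actual rule is documented in LANGCHAIN-INTEGRATION.md):

```python
def namespace_to_tags(namespace: tuple[str, ...],
                      namespace_keys: tuple[str, ...]) -> dict[str, str]:
    # Pair each configured key with the namespace component at that position.
    if len(namespace) != len(namespace_keys):
        raise ValueError("namespace depth must match configured namespace_keys")
    return dict(zip(namespace_keys, namespace))

# A LangGraph store.put(("alice", "keep"), key, value) with
# namespace_keys=("user", "project") would yield ordinary Keep tags:
print(namespace_to_tags(("alice", "keep"), ("user", "project")))
# → {'user': 'alice', 'project': 'keep'}
```

Because these become regular tags, the same pre-filter that isolates per-user data in vector search is visible to the CLI and all other query paths.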


Extension Points

New Embedding or Summarization Provider

  1. Implement the provider protocol (EmbeddingProvider or SummarizationProvider)
  2. Register in the config registry
  3. Reference by name in keep.toml
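A minimal sketch of step 1, assuming the protocol is a typing.Protocol with an embed() method; the actual interface lives in protocol.py and may differ, and the registration call in the comment is hypothetical.

```python
from typing import Protocol

class EmbeddingProvider(Protocol):
    """Assumed shape of the protocol in protocol.py."""
    def embed(self, text: str) -> list[float]: ...

class CharCountEmbedding:
    """A toy provider: counts the letters a–h (step 1: satisfy the protocol)."""
    dimension = 8
    def embed(self, text: str) -> list[float]:
        return [float(text.lower().count(c)) for c in "abcdefgh"]

# Steps 2–3 depend on the real registry API and keep.toml schema; a
# hypothetical registration might look like:
# register_embedding_provider("charcount", CharCountEmbedding)

provider: EmbeddingProvider = CharCountEmbedding()
print(provider.embed("abc"))  # → [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```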

New Store Backend

Framework Integration