keep put

Add or update a document in the store.

Usage

Four input modes, auto-detected:

keep put "my note"                    # Text mode (inline content)
keep put file:///path/to/doc.pdf      # URI mode (fetch and index)
keep put https://example.com/page     # URI mode (web content)
keep put /path/to/folder/             # Directory mode (index all files)
keep put -                            # Stdin mode (explicit)
echo "piped content" | keep put       # Stdin mode (detected)

Directory mode indexes all regular files in the folder (non-recursive). Skips hidden files, symlinks, and subdirectories.

Options

Option	Description
`-t`, `--tag KEY=VALUE`	Tag as key=value (repeatable)
`-i`, `--id ID`	Custom document ID (auto-generated for text/stdin)
`--summary TEXT`	User-provided summary (skips auto-summarization)
`--suggest-tags`	Show tag suggestions from similar items
`--analyze`	Queue background analysis after put (skips if parts are already current)
`-s`, `--store PATH`	Override store directory

Text mode and content-addressed IDs

Text mode uses content-addressed IDs for automatic versioning:

keep put "my note"              # Creates %a1b2c3d4e5f6
keep put "my note" -t done      # Same ID, new version (tag change)
keep put "different note"       # Different ID (new document)

Same content = same ID = enables versioning through tag changes.

Smart summary behavior

Short content (under max_summary_length, default 1000 chars): stored verbatim as its own summary
Long content: truncated placeholder stored immediately, real summary generated in background by keep pending
--summary provided: used as-is, skips auto-summarization

The LLM prompt used for summarization is configurable. Create a .prompt/summarize/* document whose match rules target specific tags, and its ## Prompt section replaces the default summarization prompt for matching documents. See PROMPTS.md for details.

Tag suggestions

The --suggest-tags flag shows tags from similar existing items, helping maintain consistent tagging:

keep put "OAuth2 token handling" --suggest-tags
# Suggests: project=myapp (3 similar), topic=auth (5 similar)

Update behavior

When updating an existing document (same ID):

Summary: replaced with new summary
Tags: merged — existing tags preserved, new tags override on key collision
Version: previous version archived automatically

Contextual summarization

When you provide tags during indexing, the summarizer uses context from related items to produce more relevant summaries.

System finds similar items sharing your tags
Items with more matching tags rank higher (+20% score boost per tag)
Top related summaries are passed as context to the LLM
Summary highlights relevance to that context

Tag changes trigger re-summarization:

keep put doc.pdf                       # Generic summary
keep put doc.pdf -t topic=auth         # Re-queued for contextual summary

Supported formats

Format	Extensions	Content extracted	Auto-tags
Text	.md, .txt, .py, .js, .json, .yaml, ...	Full text	—
PDF	.pdf	Text from all pages; scanned pages OCR'd in background†	—
HTML	.html, .htm	Text (scripts/styles removed)	—
DOCX	.docx	Paragraphs + tables	`author`, `title`
PPTX	.pptx	Slides + notes	`author`, `title`
Audio	.mp3, .flac, .ogg, .wav, .aiff, .m4a, .wma	Structured metadata (+ transcription\*)	`artist`, `album`, `genre`, `year`, `title`
Images	.jpg, .png, .tiff, .webp	EXIF metadata + OCR text† (+ description\*)	`dimensions`, `camera`, `date`

\* When a media description provider is configured ([media] in keep.toml), images get vision-model descriptions and audio files get speech-to-text transcription, appended to the extracted metadata. See QUICKSTART.md for setup.

† OCR (optical character recognition): Scanned PDF pages (pages with no extractable text) and all image files are automatically queued for background OCR when an OCR provider is available. Keep auto-detects Ollama (using glm-ocr, pulled automatically on first use) or MLX (mlx-vlm on Apple Silicon). A placeholder is stored immediately so the item is indexed right away; the full OCR text replaces it once background processing completes via keep pending. No configuration needed — if Ollama is running, OCR just works.

Auto-extracted tags merge with user-provided tags. User tags win on collision:

keep put file:///path/to/song.mp3                    # Auto-tags: artist, album, genre, year
keep put file:///path/to/song.mp3 -t genre="Nu Jazz" # Overrides auto-extracted genre
keep put file:///path/to/photo.jpg -t topic=vacation  # Adds topic alongside auto camera/date

Indexing documents

Index important documents encountered during work:

keep put "https://docs.example.com/auth" -t topic=auth -t project=myapp
keep put "file:///path/to/design.pdf" -t type=reference -t topic=architecture