chatgpt-on-wechat/docs/en/memory/index.mdx
zhayujie 33cf1bc4c3 feat(memory): async LLM context summary injection on trim
- Unified flush + context injection into a single async LLM call
  (flush_from_messages accepts context_summary_callback)
- Fixed response parsing bug: handle generator returns and Claude-format
  dicts from bot.call_with_tools, which previously caused all LLM
  summaries to silently fail (falling back to rule-based extraction)
- Removed standalone context summary prompts and methods; reuse the
  existing [DAILY]/[MEMORY] summarization pipeline
- Updated docs (zh/en/ja) to reflect the new injection behavior
2026-04-13 20:13:05 +08:00

---
title: Long-term Memory
description: CowAgent long-term memory system — file persistence, automatic writing, and hybrid retrieval
---
Long-term memory is stored in workspace files, persisting across sessions. The Agent loads historical memory on demand via retrieval tools during conversation, and automatically writes conversation summaries to long-term memory when context is trimmed.
## Memory Types
### Core Memory (MEMORY.md)
Core memory is stored in `~/cow/MEMORY.md` and holds long-term user preferences, important decisions, key facts, and other information that does not lose relevance over time. The Agent reads and writes this file via tools to maintain long-term knowledge.
### Daily Memory (memory/YYYY-MM-DD.md)
Daily memory is stored in the `~/cow/memory/` directory, one file per day named by date (e.g., `2026-03-08.md`). Each file records that day's conversation summaries and key events. A file is created only on its first write, so no empty files are generated.
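As a rough illustration of the create-on-first-write behavior, the sketch below appends a summary to a date-named file and only creates the file at that moment. The helper names `daily_memory_path` and `append_daily_memory` are hypothetical and are not taken from CowAgent's code.

```python
from datetime import date
from pathlib import Path


def daily_memory_path(workspace: Path, day: date) -> Path:
    """Return the daily memory path, e.g. <workspace>/memory/2026-03-08.md."""
    return workspace / "memory" / f"{day.isoformat()}.md"


def append_daily_memory(workspace: Path, day: date, summary: str) -> Path:
    """Append a summary; the file and its directory are created on first write only."""
    path = daily_memory_path(workspace, day)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Opening in append mode creates the file if it does not exist yet.
    with path.open("a", encoding="utf-8") as f:
        f.write(summary.rstrip() + "\n")
    return path
```

Because nothing touches the file system until `append_daily_memory` is called, a day without any summaries leaves no file behind.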
## Automatic Writing
The Agent automatically persists conversation content to long-term memory through the following mechanisms:
- **On context trimming**: When the number of conversation turns or tokens exceeds the configured limit, the oldest half of the context is trimmed. The discarded content is summarized by the LLM into key information and written to the daily memory file. The same summary is also injected asynchronously into the retained context to preserve conversational continuity.
- **Daily scheduled summary**: A full summary is triggered automatically at 23:55 every day, so memory is preserved even on low-activity days (the run is skipped if the content has not changed).
- **On API context overflow**: When the model API returns a context overflow error, a summary of the current conversation is saved as an emergency measure.

All memory writes (LLM summarization and file writing) run asynchronously in a background thread, so they never block normal conversation replies.
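The trim-then-summarize flow described above can be sketched roughly as follows. This is a minimal illustration, not CowAgent's implementation: the function names, the message shape, and the three callbacks (`summarize`, `write_daily`, `inject`) are all assumptions made for the example.

```python
import threading
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]  # hypothetical shape: {"role": ..., "content": ...}


def trim_oldest_half(messages: List[Message]) -> Tuple[List[Message], List[Message]]:
    """Split the context into (discarded oldest half, retained newer half)."""
    cut = len(messages) // 2
    return messages[:cut], messages[cut:]


def flush_async(
    discarded: List[Message],
    summarize: Callable[[List[Message]], str],  # stands in for the LLM call
    write_daily: Callable[[str], None],         # stands in for the file write
    inject: Callable[[str], None],              # stands in for context injection
) -> threading.Thread:
    """Summarize, persist, and inject in a background thread; return the thread."""

    def worker() -> None:
        summary = summarize(discarded)
        write_daily(summary)
        inject(summary)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The caller can continue replying with the retained half immediately; the summary arrives in the context whenever the background thread finishes.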
## Memory Retrieval
The memory system supports hybrid retrieval, which combines two modes:
- **Keyword retrieval**: FTS5 full-text index matching with BM25 ranking.
- **Vector retrieval**: Embedding-based semantic similarity search, which finds relevant memory even when the wording differs.

The Agent triggers memory retrieval automatically during conversation as needed and incorporates relevant historical information into the context. Results are ranked by a combined score (default: 0.7 vector weight + 0.3 keyword weight). Daily memory scores decay over time with a 30-day half-life, while core memory scores do not decay.
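To make the ranking concrete, here is a minimal sketch of a weighted score with half-life decay. It assumes both input scores are already normalized to the range 0 to 1 and that the decay is exponential (the usual meaning of a half-life); the function name and the exact formula are illustrative assumptions, not confirmed details of CowAgent's scorer.

```python
def combined_score(
    vector_score: float,
    keyword_score: float,
    age_days: float = 0.0,
    is_daily: bool = True,
    vector_weight: float = 0.7,
    keyword_weight: float = 0.3,
    half_life_days: float = 30.0,
) -> float:
    """Blend vector and keyword scores, then decay daily memory by age."""
    score = vector_weight * vector_score + keyword_weight * keyword_score
    if is_daily:
        # Assumed exponential decay: the score halves every `half_life_days`.
        score *= 0.5 ** (age_days / half_life_days)
    return score
```

Under these assumptions, a daily memory entry that is 30 days old keeps half of its blended score and one that is 60 days old keeps a quarter, while a core memory entry keeps its full score regardless of age.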
## First Launch
On first launch, the Agent will proactively ask the user for key information and save it to the workspace (default `~/cow`):
| File | Description |
| --- | --- |
| `system.md` | Agent system prompt and behavior settings |
| `user.md` | User identity information and preferences |
| `MEMORY.md` | Core memory (long-term) |
| `memory/YYYY-MM-DD.md` | Daily memory (created on demand) |
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
</Frame>
## Configuration
| Parameter | Description | Default |
| --- | --- | --- |
| `agent_workspace` | Workspace path; memory files are stored under this directory | `~/cow` |
| `agent_max_context_tokens` | Max context tokens; when exceeded, content is trimmed and summarized into memory | `50000` |
| `agent_max_context_turns` | Max context turns; when exceeded, content is trimmed and summarized into memory | `20` |
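The two context limits can be read as a simple either-or trigger. The sketch below layers user overrides on the defaults from the table and checks whether trimming would apply. The `should_trim` helper and the strict greater-than comparison are assumptions for illustration; only the parameter names and default values come from the table.

```python
from typing import Any, Dict, Optional

# Defaults as listed in the configuration table.
DEFAULTS: Dict[str, Any] = {
    "agent_workspace": "~/cow",
    "agent_max_context_tokens": 50000,
    "agent_max_context_turns": 20,
}


def should_trim(turns: int, tokens: int, config: Optional[Dict[str, Any]] = None) -> bool:
    """Return True when either limit is exceeded (hypothetical helper)."""
    cfg = {**DEFAULTS, **(config or {})}
    return (
        turns > cfg["agent_max_context_turns"]
        or tokens > cfg["agent_max_context_tokens"]
    )
```

For example, raising `agent_max_context_tokens` in the override dictionary delays trimming for long messages, while the turn limit still applies independently.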