mirror of
https://github.com/zhayujie/chatgpt-on-wechat.git
synced 2026-06-02 00:57:41 +08:00
- Unified flush + context injection into a single async LLM call (flush_from_messages accepts context_summary_callback) - Fixed response parsing bug: handle generator returns and Claude-format dicts from bot.call_with_tools, which previously caused all LLM summaries to silently fail (falling back to rule-based extraction) - Removed standalone context summary prompts and methods; reuse the existing [DAILY]/[MEMORY] summarization pipeline - Updated docs (zh/en/ja) to reflect the new injection behavior
82 lines
3.6 KiB
Plaintext
82 lines
3.6 KiB
Plaintext
---
|
|
title: Short-term Memory
|
|
description: Conversation context — message management, compression strategies, and context operations
|
|
---
|
|
|
|
Conversation context is the Agent's short-term memory, containing all messages in the current session (user input, Agent replies, tool calls and results). Proper context management is critical for the Agent's reasoning quality and cost control.
|
|
|
|
## Context Structure
|
|
|
|
Each conversation turn consists of:
|
|
|
|
```
|
|
User message → Agent thinking → Tool call → Tool result → ... → Agent final reply
|
|
```
|
|
|
|
A single turn may include multiple tool calls (controlled by `agent_max_steps`). All tool calls and results are retained in context until compressed or trimmed.
|
|
|
|
## Key Configuration
|
|
|
|
| Parameter | Description | Default |
|
|
| --- | --- | --- |
|
|
| `agent_max_context_tokens` | Maximum context token budget | `50000` |
|
|
| `agent_max_context_turns` | Maximum conversation turns in context | `20` |
|
|
| `agent_max_steps` | Maximum decision steps per turn (tool call count) | `15` |
|
|
|
|
Configurable via `config.json` or the `/config` chat command.
|
|
|
|
## Compression Strategy
|
|
|
|
When context exceeds limits, the system automatically compresses to free space. The process has multiple stages:
|
|
|
|
### 1. Tool Result Truncation
|
|
|
|
Before each decision loop, the system checks tool call results in historical turns. Results exceeding **20,000 characters** are truncated, keeping only the beginning and end with a truncation notice. Current turn results are not affected.
|
|
|
|
### 2. Turn Trimming
|
|
|
|
When conversation turns exceed `agent_max_context_turns`:
|
|
|
|
- The **oldest half** of complete turns is trimmed (preserving tool call chain integrity)
|
|
- Trimmed messages are summarized by LLM and **written to the daily memory file**
|
|
- Once the LLM summary is ready, it is also **injected into the first user message** of the retained context, helping the model maintain conversational continuity
|
|
- Summary injection runs asynchronously in the background and takes effect from the next turn onward
|
|
|
|
### 3. Token Budget Trimming
|
|
|
|
After turn trimming, if tokens still exceed the budget:
|
|
|
|
- **Fewer than 5 turns**: All turns undergo **text compression** — each turn keeps only the first user text and last Agent reply, removing intermediate tool call chains
|
|
- **5 or more turns**: The **first half** of turns is trimmed again, with discarded content written to memory and a context summary injected
|
|
|
|
### 4. Overflow Emergency Handling
|
|
|
|
When the model API returns a context overflow error:
|
|
|
|
1. All current messages are summarized and written to memory
|
|
2. Aggressive trimming is applied (tool results limited to 10K chars, user text to 10K, max 5 turns)
|
|
3. If still overflowing, the entire conversation context is cleared
|
|
|
|
## Session Persistence
|
|
|
|
Conversation messages are persisted to a local database, automatically restored after service restart. Restore strategy:
|
|
|
|
- Restores the most recent **`max(3, max_context_turns / 6)`** turns
|
|
- Only retains each turn's **user text and Agent final reply**, not intermediate tool call chains
|
|
- Sessions older than **30 days** are automatically cleaned up
|
|
|
|
## Commands
|
|
|
|
Use these commands in chat to manage context:
|
|
|
|
| Command | Description |
|
|
| --- | --- |
|
|
| `/context` | View current context statistics (message count, role distribution, total characters) |
|
|
| `/context clear` | Clear current session context |
|
|
| `/config agent_max_context_tokens 80000` | Adjust context token budget |
|
|
| `/config agent_max_context_turns 30` | Adjust context turn limit |
|
|
|
|
<Tip>
|
|
After clearing context, the Agent "forgets" previous conversation content. Content that was already written to long-term memory can still be retrieved via memory search.
|
|
</Tip>
|