Commit Graph

10 Commits

Author SHA1 Message Date
yangluxin613
fd571ac539 fix(memory): address PR review — numpy/UPSERT soft deps + BM25 floor + BLOB dim
- numpy soft dependency: try/except import + _HAS_NUMPY flag; _encode_embedding
  and _decode_embedding fall back to struct.pack/unpack; search_vector falls back
  to pure-Python cosine loop — startup never fails without numpy reinstalled
- SQLite UPSERT guard: _HAS_UPSERT = sqlite_version_info >= (3,24,0); save_chunk
  and save_chunks_batch fall back to INSERT OR REPLACE on SQLite < 3.24 with a
  one-time startup warning about potential FTS rowid drift
- _bm25_rank_to_score floor: 0.3 + 0.69*(|rank|/(1+|rank|)) → always in [0.3, 0.99),
  prevents small-corpus matches scoring 0.0 and being filtered by min_score
- detect_index_dim BLOB-aware: check isinstance(raw, bytes) first and return
  len(raw)//4 before json.loads, so /memory status works after embedding format switch
- Comment: "CJK single-char" → "CJK tokens shorter than 3 characters"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 14:15:16 +08:00
yangluxin613
9b31f45481 fix(memory): _search_like ASCII query always returns empty
matched_count only counted cjk_words hits; pure ASCII queries had
cjk_words=[] so matched_count=0 and all SQL-matched rows were filtered
out. Change to count across all tokens (cjk_words + ascii_words) so
the LIKE fallback works correctly when FTS5 is unavailable.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 09:02:07 +08:00
yangluxin613
bc9c1691f5 fix(memory): CJK keyword search + vector search optimization
- Add trigram FTS5 table for CJK/mixed-language search with BM25 ranking
- Fix three-step search routing: unicode61 (ASCII) → trigram (CJK/mixed) → LIKE fallback
- Fix _bm25_rank_to_score: abs(rank)/(1+abs(rank)) instead of max(0,rank)
- Fix INSERT OR REPLACE → UPSERT to preserve FTS5 content table rowid stability
- Fix FTS5 JOIN to use rowid instead of id column
- Fix _search_like: single-char CJK match, dynamic scoring, merged CJK+ASCII path
- Add numpy vectorized cosine similarity + BLOB embedding storage (6x smaller)
- Add _decode_embedding backward compat for legacy JSON embeddings
- Add threading.RLock for concurrent write safety
- Add _meta table to avoid trigram backfill re-running on every startup
- Activate EmbeddingCache in MemoryManager for session-level query deduplication
- Add numpy>=1.24 to requirements.txt
- Merge upstream master (embedding package refactor, FTS5 self-healing methods)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 08:56:08 +08:00
zhayujie
3ffb563a44 feat(memory): support multi-vendor embedding fallback
Add embedding_provider config knob with native support for
openai / dashscope / doubao / zhipu / linkai, plus an in-chat
/memory status and /memory rebuild-index workflow for switching
vendors safely.
2026-05-20 11:00:53 +08:00
haosenwang1018
adca89b973 fix: replace bare except clauses with except Exception
Bare `except:` catches BaseException including KeyboardInterrupt and
SystemExit. Replaced 29 instances with `except Exception:`.
2026-02-25 11:49:19 +00:00
saboteur7
73b069a76c docs: update 2.0 README.md 2026-02-03 12:19:36 +08:00
saboteur7
60abcd92a3 feat: update README.md and solving Python compatibility issues 2026-02-03 01:17:25 +08:00
zhayujie
6f70a8efda fix: fts5 not available bug 2026-02-01 17:08:02 +08:00
saboteur7
5a466d0ff6 fix: long-term memory bug 2026-01-30 11:31:13 +08:00
saboteur7
bb850bb6c5 feat: personal ai agent framework 2026-01-30 09:53:46 +08:00