docs: add self-evolution documentation and fix SOUL.md persistence

- Create 01-identity-evolution.md: Identity system architecture (SOUL.md, USER.md, change proposals, version management)
- Create 04-heartbeat-engine.md: Proactive behavior system (heartbeat config, alerts, proactivity levels)
- Create 06-context-compaction.md: Context compression system (token management, summarization, information retention)
- Update ZCLAW_AGENT_INTELLIGENCE_EVOLUTION.md: Add Phase 5 self-evolution UX roadmap
- Fix AgentOnboardingWizard: Persist SOUL.md and USER.md after agent creation
- Fix llm-service: Add Tauri kernel mode detection for memory system LLM calls
- Fix kernel: Kernel config takes priority over agent's persisted model

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 00:38:31 +08:00
parent a389082dd4
commit 6c64d704d7
5 changed files with 982 additions and 3 deletions
--- a/docs/features/02-intelligence-layer/06-context-compaction.md
+++ b/docs/features/02-intelligence-layer/06-context-compaction.md
@@ -0,0 +1,383 @@
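+# Context Compaction System
+
+> **Maturity**: L4 - Production
+> **Last updated**: 2026-03-24
+> **Owner**: Intelligence Layer Team
+
+## Overview
+
+The context compaction system addresses the core challenge of supporting conversations of unbounded length:
+1. **Token limit management** - Monitors conversation length so it never exceeds the model's limit
+2. **Smart summarization** - Compresses earlier conversation history into a concise summary
+3. **Information retention** - Ensures key decisions, preferences, and context are not lost
+4. **Transparent compaction** - Users never have to manage conversation history manually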
+
+---
+
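+## Core Concepts
+
+### Compaction Config (CompactionConfig)
+
+```typescript
+interface CompactionConfig {
+  soft_threshold_tokens: number;      // Soft threshold (triggers a compaction suggestion)
+  hard_threshold_tokens: number;      // Hard threshold (forces compaction)
+  reserve_tokens: number;             // Tokens reserved for the response
+  memory_flush_enabled: boolean;      // Whether to flush memories before compacting
+  keep_recent_messages: number;       // Number of recent messages to keep
+  summary_max_tokens: number;         // Maximum tokens for the summary
+  use_llm: boolean;                   // Whether to generate the summary with an LLM
+  llm_fallback_to_rules: boolean;     // Fall back to rules when the LLM fails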
+}
+```
+
+### Compaction Check (CompactionCheck)
+
+```typescript
+interface CompactionCheck {
+  should_compact: boolean;            // Whether compaction is needed
+  current_tokens: number;             // Current token count
+  threshold: number;                  // Threshold that was triggered
+  urgency: 'none' | 'soft' | 'hard';  // Urgency level
+}
+```
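+
+As an illustration of how the two thresholds could map onto `urgency`, here is a minimal sketch. It is not the `compactor.rs` logic: the helper name `classifyUrgency`, the `>=` comparisons, setting `should_compact` to `true` at the soft level, and reporting the soft threshold when urgency is `'none'` are all assumptions made for this example.
+
+```typescript
+type Urgency = 'none' | 'soft' | 'hard';
+
+interface Thresholds {
+  soft_threshold_tokens: number;
+  hard_threshold_tokens: number;
+}
+
+interface CompactionCheck {
+  should_compact: boolean;
+  current_tokens: number;
+  threshold: number;
+  urgency: Urgency;
+}
+
+// Hypothetical helper: classify a token count against the two thresholds.
+function classifyUrgency(currentTokens: number, cfg: Thresholds): CompactionCheck {
+  if (currentTokens >= cfg.hard_threshold_tokens) {
+    // At or above the hard threshold: compaction is mandatory.
+    return { should_compact: true, current_tokens: currentTokens, threshold: cfg.hard_threshold_tokens, urgency: 'hard' };
+  }
+  if (currentTokens >= cfg.soft_threshold_tokens) {
+    // Between the thresholds: compaction is only suggested (assumption).
+    return { should_compact: true, current_tokens: currentTokens, threshold: cfg.soft_threshold_tokens, urgency: 'soft' };
+  }
+  // Below both thresholds: nothing to do.
+  return { should_compact: false, current_tokens: currentTokens, threshold: cfg.soft_threshold_tokens, urgency: 'none' };
+}
+```
+
+With the production-mode values from this document (15000 / 20000), a 16000-token conversation would be classified as `'soft'` under these assumptions.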
+
+### Compaction Result (CompactionResult)
+
+```typescript
+interface CompactionResult {
+  compacted_messages: CompactableMessage[];  // Message list after compaction
+  summary: string;                           // Generated summary
+  original_count: number;                    // Original message count
+  retained_count: number;                    // Retained message count
+  flushed_memories: number;                  // Number of memories flushed
+  tokens_before_compaction: number;          // Tokens before compaction
+  tokens_after_compaction: number;           // Tokens after compaction
+}
+```
+
+---
+
+## Compaction Flow
+
+```
+┌────────────────────────────────────────────────────────────────┐
+│                       Context Compaction                       │
+├────────────────────────────────────────────────────────────────┤
+│                                                                │
+│   ┌──────────────────┐                                         │
+│   │ New message      │                                         │
+│   └────────┬─────────┘                                         │
+│            │                                                   │
+│            ▼                                                   │
+│   ┌──────────────────┐  soft_threshold  ┌────────────────────┐ │
+│   │ Token count      │─────────────────▶│ Suggest compaction │ │
+│   └────────┬─────────┘                  └────────────────────┘ │
+│            │                                                   │
+│            │ hard_threshold                                    │
+│            ▼                                                   │
+│   ┌──────────────────┐                                         │
+│   │ Force compaction │                                         │
+│   └────────┬─────────┘                                         │
+│            │                                                   │
+│            ▼                                                   │
+│   ┌──────────────────────────────────────────────────┐         │
+│   │ 1. Keep the most recent N messages               │         │
+│   │ 2. Summarize the older messages                  │         │
+│   │ 3. Optional: extract memories to Memory Store    │         │
+│   │ 4. Replace the older messages with the summary   │         │
+│   └──────────────────────────────────────────────────┘         │
+│            │                                                   │
+│            ▼                                                   │
+│   ┌──────────────────┐                                         │
+│   │ Compaction done  │                                         │
+│   └──────────────────┘                                         │
+│                                                                │
+└────────────────────────────────────────────────────────────────┘
+```
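+
+The steps in the flow above (keep the most recent N messages, summarize the older ones, replace them with the summary) can be sketched as follows. This is an illustrative sketch rather than the actual compactor: the `Message` shape, the helper name `compactMessages`, the injected `summarize` callback, and the `'system'` role used for the summary message are assumptions, and the optional memory-extraction step is left out.
+
+```typescript
+// Hypothetical minimal message shape; the real CompactableMessage may differ.
+interface Message {
+  role: string;
+  content: string;
+}
+
+// Keep the last `keepRecent` messages and replace everything older
+// with a single summary message produced by `summarize`.
+function compactMessages(
+  messages: Message[],
+  keepRecent: number,
+  summarize: (older: Message[]) => string,
+): Message[] {
+  // Nothing older than the retention window: return a copy unchanged.
+  if (messages.length <= keepRecent) {
+    return messages.slice();
+  }
+  const splitAt = messages.length - keepRecent;
+  const older = messages.slice(0, splitAt);   // messages to be summarized
+  const recent = messages.slice(splitAt);     // messages kept verbatim
+  // Assumption: the summary is carried as one 'system' message.
+  const summaryMessage: Message = { role: 'system', content: summarize(older) };
+  return [summaryMessage, ...recent];
+}
+```
+
+Passing the summarizer in as a callback keeps this sketch independent of how the summary is produced (rules or an LLM).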
+
+---
+
+## Token Estimation Algorithm
+
+### Mixed CJK + English Estimation
+
+```rust
+// Rust implementation (compactor.rs)
+pub fn estimate_tokens(text: &str) -> usize {
+    let mut tokens: f64 = 0.0;
+    for ch in text.chars() {
+        let code = ch as u32;
+        if code >= 0x4E00 && code <= 0x9FFF {
+            // CJK Unified Ideographs → 1.5 tokens
+            tokens += 1.5;
+        } else if code >= 0x3400 && code <= 0x4DBF {
+            // CJK Extension A → 1.5 tokens
+            tokens += 1.5;
+        } else if code >= 0x3000 && code <= 0x303F {
+            // CJK punctuation → 1.0 token
+            tokens += 1.0;
+        } else if ch == ' ' || ch == '\n' || ch == '\t' {
+            // Whitespace → 0.25 token
+            tokens += 0.25;
+        } else {
+            // Everything else (mainly ASCII) → 0.3 token,
+            // slightly above the common "4 chars ≈ 1 token" rule of thumb
+            tokens += 0.3;
+        }
+    }
+    tokens.ceil() as usize
+}
+```
+
+**Design principle**: Better to overestimate than to underestimate. Overestimating triggers compaction earlier than strictly necessary, but it never causes an API error from exceeding the model's limit.
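+
+To make the weights concrete, here is a TypeScript mirror of the Rust estimator with a worked example. The port itself is an assumption for illustration (the frontend calls the Rust command instead); it uses the same per-character weights and treats a surrogate pair as one character, as Rust's `chars()` does.
+
+```typescript
+// Illustrative TypeScript port of estimate_tokens (not part of the codebase).
+function estimateTokens(text: string): number {
+  let tokens = 0;
+  for (let i = 0; i < text.length; i++) {
+    const code = text.charCodeAt(i);
+    if (code >= 0xd800 && code <= 0xdbff) {
+      // High surrogate: a non-BMP character spans two UTF-16 units.
+      // Count it once with the default weight and skip its second unit.
+      tokens += 0.3;
+      i++;
+    } else if ((code >= 0x4e00 && code <= 0x9fff) || (code >= 0x3400 && code <= 0x4dbf)) {
+      tokens += 1.5;   // CJK ideographs
+    } else if (code >= 0x3000 && code <= 0x303f) {
+      tokens += 1.0;   // CJK punctuation
+    } else if (code === 0x20 || code === 0x0a || code === 0x09) {
+      tokens += 0.25;  // space, newline, tab
+    } else {
+      tokens += 0.3;   // everything else (mainly ASCII)
+    }
+  }
+  return Math.ceil(tokens);
+}
+
+// Worked example: "你好 world"
+//   2 ideographs × 1.5 + 1 space × 0.25 + 5 letters × 0.3 = 4.75 → 5
+const example = estimateTokens('你好 world'); // 5
+```
+
+Rounding up at the end is in keeping with the overestimation principle above.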
+
+---
+
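+## Summary Generation
+
+### Rule-Based Summary (current implementation)
+
+```rust
+fn generate_summary(&self, messages: &[CompactableMessage]) -> String {
+    let mut sections: Vec<String> = vec!["[Summary of the earlier conversation]".to_string()];
+
+    // `user_messages` / `assistant_messages` are presumably `messages`
+    // split by role; that filtering is omitted from this excerpt.
+
+    // 1. Extract discussion topics
+    let topics = extract_topics(user_messages);
+    sections.push(format!("Topics discussed: {}", topics.join("; ")));
+
+    // 2. Extract key conclusions
+    let conclusions = extract_conclusions(assistant_messages);
+    sections.push(format!("Key conclusions:\n- {}", conclusions.join("\n- ")));
+
+    // 3. Extract technical context (code snippets, etc.)
+    let tech_context = extract_technical_context(messages);
+    sections.push(format!("Technical context: {}", tech_context.join(", ")));
+
+    // 4. Statistics
+    sections.push(format!("({} messages compacted)", messages.len()));
+
+    sections.join("\n")
+}
+```
+
+### Summary Example
+
+```
+[Summary of the earlier conversation]
+Topics discussed: How to implement an async HTTP server in Rust; performance tuning advice
+Key conclusions:
+- Use tokio as the async runtime
+- Consider a connection pool to reduce overhead
+- Enabling HTTP/2 support is recommended for better performance
+Technical context: code snippet (rust), code snippet (toml)
+(24 messages compacted)
+```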
+
+---
+
+## Technical Implementation
+
+### Core Files
+
+| File | Purpose |
+|------|------|
+| `desktop/src-tauri/src/intelligence/compactor.rs` | Core compaction implementation in Rust |
+| `desktop/src/lib/intelligence-backend.ts` | TypeScript API wrapper |
+| `desktop/src/domains/intelligence/store.ts` | State management |
+
+### Tauri Commands
+
+```rust
+// Signatures only; function bodies are omitted.
+#[tauri::command]
+pub fn compactor_estimate_tokens(text: String) -> usize;
+
+#[tauri::command]
+pub fn compactor_estimate_messages_tokens(messages: Vec<CompactableMessage>) -> usize;
+
+#[tauri::command]
+pub fn compactor_check_threshold(
+    messages: Vec<CompactableMessage>,
+    config: Option<CompactionConfig>,
+) -> CompactionCheck;
+
+#[tauri::command]
+pub fn compactor_compact(
+    messages: Vec<CompactableMessage>,
+    agent_id: String,
+    conversation_id: Option<String>,
+    config: Option<CompactionConfig>,
+) -> CompactionResult;
+```
+
+### Frontend API
+
+```typescript
+// intelligence-backend.ts (signatures only)
+export declare const compactor: {
+  estimateTokens(text: string): Promise<number>;
+  estimateMessagesTokens(messages: CompactableMessage[]): Promise<number>;
+  checkThreshold(messages: CompactableMessage[], config?: CompactionConfig): Promise<CompactionCheck>;
+  compact(messages: CompactableMessage[], agentId: string, conversationId?: string, config?: CompactionConfig): Promise<CompactionResult>;
+};
+```
+
+---
+
+## Usage Scenarios
+
+### Scenario 1: Automatic Compaction
+
+```typescript
+// Check before sending a message
+const check = await intelligence.compactor.checkThreshold(messages);
+
+if (check.urgency === 'hard') {
+  // Force compaction
+  const result = await intelligence.compactor.compact(messages, agentId);
+  setMessages(result.compacted_messages);
+  console.log(`Compaction done: ${result.tokens_before_compaction} → ${result.tokens_after_compaction} tokens`);
+} else if (check.urgency === 'soft') {
+  // Suggest compaction to the user, or wait
+  showCompactionSuggestion();
+}
+```
+
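+### Scenario 2: Manual Compaction
+
+```typescript
+// Compaction triggered explicitly by the user
+const result = await intelligence.compactor.compact(
+  messages,
+  agentId,
+  conversationId,
+  {
+    // CompactionConfig declares no optional fields, so a full config is passed:
+    // the production-mode values from this document with two overrides.
+    soft_threshold_tokens: 12000,   // override
+    hard_threshold_tokens: 20000,
+    reserve_tokens: 4000,
+    memory_flush_enabled: true,
+    keep_recent_messages: 10,       // override
+    summary_max_tokens: 800,
+    use_llm: false,
+    llm_fallback_to_rules: true,
+  }
+);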
+```
+
+### Scenario 3: Compaction + Memory Extraction
+
+```typescript
+// Extract memories before compacting
+if (config.memory_flush_enabled) {
+  const memories = await extractMemoriesFromOldMessages(oldMessages);
+  for (const memory of memories) {
+    await intelligence.memory.store(memory);
+  }
+}
+
+// Then run the compaction
+const result = await intelligence.compactor.compact(messages, agentId);
+```
+
+---
+
+## Integration with Other Components
+
+```
+┌──────────────────────────────────────────────────────────┐
+│                    Context Compactor                     │
+├──────────────────────────────────────────────────────────┤
+│                                                          │
+│   ┌────────────────────┐     ┌────────────────────┐      │
+│   │ ChatStore          │────▶│ Token check        │      │
+│   └────────────────────┘     └────────────────────┘      │
+│             │                          │                 │
+│             │                          ▼                 │
+│   ┌────────────────────┐     ┌────────────────────┐      │
+│   │ Memory Store       │◀────│ Memory extraction  │      │
+│   └────────────────────┘     └────────────────────┘      │
+│             │                                            │
+│             │                                            │
+│             ▼                                            │
+│   ┌──────────────────────────────────────────────┐       │
+│   │              Summary generation              │       │
+│   │   - Rule-based extraction (current)          │       │
+│   │   - LLM summary (optional)                   │       │
+│   └──────────────────────────────────────────────┘       │
+│             │                                            │
+│             ▼                                            │
+│   ┌────────────────────┐                                 │
+│   │ Compacted messages │                                 │
+│   └────────────────────┘                                 │
+│                                                          │
+└──────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Configuration Examples
+
+### Development Mode (frequent compaction for testing)
+
+```typescript
+{
+  soft_threshold_tokens: 5000,
+  hard_threshold_tokens: 8000,
+  reserve_tokens: 2000,
+  memory_flush_enabled: true,
+  keep_recent_messages: 4,
+  summary_max_tokens: 400,
+  use_llm: false,
+  llm_fallback_to_rules: true,
+}
+```
+
+### Production Mode (longer context)
+
+```typescript
+{
+  soft_threshold_tokens: 15000,
+  hard_threshold_tokens: 20000,
+  reserve_tokens: 4000,
+  memory_flush_enabled: true,
+  keep_recent_messages: 6,
+  summary_max_tokens: 800,
+  use_llm: false,
+  llm_fallback_to_rules: true,
+}
+```
+
+### Large-Context Mode (32K models)
+
+```typescript
+{
+  soft_threshold_tokens: 25000,
+  hard_threshold_tokens: 30000,
+  reserve_tokens: 6000,
+  memory_flush_enabled: true,
+  keep_recent_messages: 10,
+  summary_max_tokens: 1200,
+  use_llm: true,  // Enables LLM summaries (currently has no effect: LLM summaries are not implemented yet)
+  llm_fallback_to_rules: true,
+}
+```
+
+---
+
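+## Limitations and Future Improvements
+
+### Current Limitations
+
+1. **Limited rule-based summary quality** - Rules cannot capture complex semantics and may drop important details
+2. **No incremental compaction** - Every run reprocesses all of the older messages
+3. **No compaction preview** - Users cannot preview the summary before compaction
+4. **LLM summaries not implemented** - The `use_llm: true` option exists but is not actually used
+
+### Future Improvements
+
+1. **LLM-enhanced summaries** - Use a lightweight model to generate higher-quality summaries
+2. **Incremental compaction** - Process only newly added messages and keep the previous summary
+3. **Compaction preview** - Show the summary and let users edit it
+4. **Smart retention** - Importance-based message retention policy
+5. **Compaction history** - Store compaction records so earlier states can be traced back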
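+
+Since LLM summaries are not implemented yet, the following is only a sketch of how the existing `use_llm` and `llm_fallback_to_rules` flags might interact once they are. The function name `summarizeWithFallback` and the injected `llm` / `rules` summarizers are hypothetical, and the sketch is synchronous for brevity, whereas a real LLM call would be asynchronous.
+
+```typescript
+interface SummaryOptions {
+  use_llm: boolean;
+  llm_fallback_to_rules: boolean;
+}
+
+type Summarizer = (texts: string[]) => string;
+
+// Hypothetical dispatcher: prefer the LLM when enabled, and fall back
+// to the rule-based summarizer only if the config allows it.
+function summarizeWithFallback(
+  texts: string[],
+  opts: SummaryOptions,
+  llm: Summarizer,
+  rules: Summarizer,
+): string {
+  if (!opts.use_llm) {
+    return rules(texts);
+  }
+  try {
+    return llm(texts);
+  } catch (err) {
+    if (opts.llm_fallback_to_rules) {
+      // LLM failed: degrade to the rule-based summary.
+      return rules(texts);
+    }
+    // Fallback disabled: surface the original error to the caller.
+    throw err;
+  }
+}
+```
+
+Keeping the rule-based path as the fallback means a failed LLM call would degrade summary quality instead of blocking compaction.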
+
+---
+
+## Related Documents
+
+- [00-agent-memory.md](./00-agent-memory.md) - Memory system
+- [03-reflection-engine.md](./03-reflection-engine.md) - Reflection engine
+- [04-heartbeat-engine.md](./04-heartbeat-engine.md) - Heartbeat engine