Compare commits
19 Commits
00ebf18f23...main
| Author | SHA1 | Date | |
|---|---|---|---|
| | 7b0d452845 | | |
| | 855c89e8fb | | |
| | 3eb098f020 | | |
| | c12b64150b | | |
| | 4c31471cd6 | | |
| | b60b96225d | | |
| | 06e93a21af | | |
| | 9060935401 | | |
| | 6d6673bf5b | | |
| | 15f84bf8c1 | | |
| | 9a313e3c92 | | |
| | ee5611a2f8 | | |
| | 5cf7adff69 | | |
| | 10497362bb | | |
| | d7dbdf8600 | | |
| | 8c25b20fe2 | | |
| | 87110ffdff | | |
| | 980a8135fa | | |
| | e9e7ffd609 | | |
178
CLAUDE.md
@@ -165,10 +165,25 @@ desktop/src-tauri (→ kernel, skills, hands, protocols)
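2. **Automated verification**: `cargo check` / `cargo test` / `tsc --noEmit` / `vitest run` must all pass
3. **Regression testing**: run the full test suite of every affected crate and confirm there are no regressions

#### Stage 4: Commit + sync (immediately, no backlog)
#### Stage 4: Wiki sync + commit (immediately, no backlog)

1. **Commit and push**: commit per the §11 conventions, then **`git push` immediately**
2. **Documentation sync**: check and update the related docs per §8.3, then commit and push
**Wiki sync assessment (hard gate, cannot be skipped)**

After the code change and before committing, answer each question below. If any answer is "yes", the corresponding wiki page must be updated:

| Assessment question | If "yes", update |
|----------|-------------|
| Does this change fix or introduce a bug? | The module page's "Active issues + pitfalls" section + `wiki/known-issues.md` |
| Does this change alter a module's behavior or design rationale? | The module page's "Design decisions" section |
| Does this change add or remove files, or change the directory structure? | The module page's "Key files" table |
| Does this change affect a cross-module interface (who calls whom, parameter shapes, trigger timing)? | The "Integration contracts" table of both sides involved |
| Does this change involve a constraint that must always hold? | The ⚡ invariants in the module page's "Code logic" section |
| Does this change alter a feature path (the complete frontend → backend route)? | The `wiki/feature-map.md` index table |
| Does this change alter key numbers (command count / Store count / test count, etc.)? | The `wiki/index.md` key-numbers table + `docs/TRUTH.md` |

After answering every question, whether or not anything was updated, append an entry to `wiki/log.md` and update the module page's "Change log" section (keep 5 entries).

**Commit and push**: commit per the §11 conventions, then **`git push` immediately**. See §8.3 for the detailed documentation sync rules.

**Iron rule: no "I'll commit in a bit" and no "I'll push everything at the end". Push immediately after each independent unit of work is finished.**
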
@@ -374,34 +389,44 @@ docs/
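
After every feature implementation, architecture change, or bug fix, **the following wrap-up must be carried out immediately**:

#### Step A: Documentation sync (before committing code)
#### Step A: Wiki sync (highest priority, before committing code)

Check whether the following documents need updating, and update them right away if anything changed:
> **Why the wiki comes first**: the wiki is the startup fuel for every new AI session. If the wiki is out of sync with the code, every later session works from the wrong context, and the errors accumulate and compound.

Building on the assessment table in §3.3 Stage 4, perform the concrete updates:

| Trigger | Update target | What to update |
|----------|---------|---------|
| Bug fix | The module page's "Active issues + pitfalls" section | Fixed → remove the entry; newly introduced → add an entry |
| Architecture/design change | The module page's "Design decisions" section | What changed in the WHY + the new trade-offs |
| Files added/removed/moved | The module page's "Key files" table | Update the file list |
| Cross-module interface change | The "Integration contracts" table of **both sides involved** | Direction / interface / trigger timing |
| New invariant discovered | The module page's "Code logic" section | ⚡ marker + one-sentence description |
| Feature path change | `wiki/feature-map.md` | Update the matching row of the index table |
| Key number change | `wiki/index.md` + `docs/TRUTH.md` | Update the number + the verification command |
| **Every wrap-up** | `wiki/log.md` + the module page's "Change log" | Append a log entry + keep the change log at 5 entries |

**Wiki update principles**:
- Record only what the code cannot tell you (the WHY, cross-module relationships, invariants, lessons learned)
- Keep module pages to 100-200 lines; archive the overflow to `wiki/archive/`
- Each piece of information lives on exactly one page (single source of truth); other pages only link to it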
|
||||
#### 步骤 B:其他文档同步
|
||||
|
||||
1. **CLAUDE.md** — 项目结构、技术栈、工作流程、命令变化时
|
||||
2. **CLAUDE.md §13 架构快照** — 涉及子系统变更时,更新 `<!-- ARCH-SNAPSHOT-START/END -->` 标记区域(可执行 `/sync-arch` 技能自动分析)
|
||||
2. **CLAUDE.md §13 架构快照** — 涉及子系统变更时(可执行 `/sync-arch` 技能自动分析)
|
||||
3. **docs/ARCHITECTURE_BRIEF.md** — 架构决策或关键组件变更时
|
||||
4. **docs/features/** — 功能状态变化时
|
||||
5. **docs/knowledge-base/** — 新的排查经验或配置说明
|
||||
6. **wiki/** — 编译后知识库维护(按触发规则更新对应页面,每页统一 5 节: 设计决策 / 关键文件+集成契约 / 代码逻辑 / 活跃问题+陷阱 / 变更记录):
|
||||
- 修复 bug → 更新对应模块页"活跃问题"节 + `wiki/known-issues.md` 索引
|
||||
- 架构变更 → 更新对应模块页"设计决策"节
|
||||
- 文件结构变化 → 更新对应模块页"关键文件"表
|
||||
- 跨模块接口变化 → 更新对应模块页"集成契约"表
|
||||
- 新增不变量发现 → 更新对应模块页"代码逻辑"节的 ⚡ 标记项
|
||||
- 功能链路变化 → 更新 `wiki/feature-map.md` 索引表
|
||||
- 数字变化 → 更新 `wiki/index.md` 关键数字表 + `docs/TRUTH.md`
|
||||
- 每次更新 → 在 `wiki/log.md` 追加一条记录 + 模块页"变更记录"节更新最近 5 条
|
||||
6. **docs/TRUTH.md** — 数字(命令数、Store 数、crates 数等)变化时
|
||||
|
||||
#### 步骤 B:提交(按逻辑分组)
|
||||
#### 步骤 C:提交(按逻辑分组)
|
||||
|
||||
```
|
||||
代码变更 → 一个或多个逻辑提交
|
||||
文档变更 → 独立提交(如果和代码分开更清晰)
|
||||
```
|
||||
|
||||
#### 步骤 C:推送(立即)
|
||||
#### 步骤 D:推送(立即)
|
||||
|
||||
```
|
||||
git push
|
||||
@@ -559,7 +584,7 @@ refactor(store): 统一 Store 数据获取方式
|
||||
***
|
||||
|
||||
<!-- ARCH-SNAPSHOT-START -->
|
||||
<!-- 此区域由 auto-sync 自动更新,请勿手动编辑。更新时间: 2026-04-15 -->
|
||||
<!-- 此区域由 auto-sync 自动更新,请勿手动编辑。更新时间: 2026-04-23 -->
|
||||
|
||||
## 13. 当前架构快照
|
||||
|
||||
@@ -567,51 +592,53 @@ refactor(store): 统一 Store 数据获取方式
|
||||
|
||||
| 子系统 | 状态 | 最新变更 |
|
||||
|--------|------|----------|
|
||||
| 管家模式 (Butler) | ✅ 活跃 | 04-12 行业配置4行业 + 跨会话连续性 + <butler-context> XML fencing |
|
||||
| Hermes 管线 | ✅ 活跃 | 04-12 触发信号持久化 + 经验行业维度 + 注入格式优化 |
|
||||
| 管家模式 (Butler) | ✅ 活跃 | 04-23 跨会话身份(soul.md) + 动态建议(4路并行LLM驱动) + Agent tab 移除 |
|
||||
| Hermes 管线 | ✅ 活跃 | 04-23 experience_find_relevant Tauri 命令 + ExperienceBrief + OnceLock 单例 |
|
||||
| Intelligence Heartbeat | ✅ 活跃 | 04-15 统一健康快照 (health_snapshot.rs) + HeartbeatManager 重构 + HealthPanel 前端 |
|
||||
| 聊天流 (ChatStream) | ✅ 稳定 | 04-02 ChatStore 拆分为 4 Store (stream/conversation/message/chat) |
|
||||
| 记忆管道 (Memory) | ✅ 稳定 | 04-17 E2E 验证: 存储+FTS5+TF-IDF+注入闭环,去重+跨会话注入已修复 |
|
||||
| 聊天流 (ChatStream) | ✅ 活跃 | 04-23 LLM 动态建议(替换硬编码) + 澄清卡片 UX 优化 |
|
||||
| 记忆管道 (Memory) | ✅ 活跃 | 04-23 身份信号提取(agent_name/user_name) + ProfileSignals 增强 |
|
||||
| SaaS 认证 (Auth) | ✅ 稳定 | Token池 RPM/TPM 轮换 + JWT password_version 失效机制 |
|
||||
| Pipeline DSL | ✅ 稳定 | 04-01 17 个 YAML 模板 + DAG 执行器 |
|
||||
| Hands 系统 | ✅ 稳定 | 7 注册 (6 HAND.toml + _reminder),Whiteboard/Slideshow/Speech 开发中 |
|
||||
| Pipeline DSL | ✅ 稳定 | 04-01 18 个 YAML 模板 + DAG 执行器 |
|
||||
| Hands 系统 | ✅ 稳定 | 7 注册 (6 HAND.toml + _reminder),Whiteboard/Slideshow/Speech 已删除 |
|
||||
| 技能系统 (Skills) | ✅ 稳定 | 75 个 SKILL.md + 语义路由 |
|
||||
| 中间件链 | ✅ 稳定 | 13 层 (ButlerRouter@80, Compaction@100, Memory@150, Title@180, SkillIndex@200, DanglingTool@300, ToolError@350, ToolOutputGuard@360, Guardrail@400, LoopGuard@500, SubagentLimit@550, TrajectoryRecorder@650, TokenCalibration@700) |
|
||||
| 中间件链 | ✅ 稳定 | 14 层 + 分波并行 (Evolution@78✅, ButlerRouter@80✅, Compaction@100, Memory@150✅, Title@180✅, SkillIndex@200✅, DanglingTool@300, ToolError@350, ToolOutputGuard@360, Guardrail@400, LoopGuard@500, SubagentLimit@550, TrajectoryRecorder@650, TokenCalibration@700) — ✅=parallel_safe |
|
||||
|
||||
### 关键架构模式
|
||||
|
||||
- **Hermes 管线**: 4模块闭环 — ExperienceStore(FTS5经验存取) + UserProfiler(结构化用户画像) + NlScheduleParser(中文时间→cron) + TrajectoryRecorder+Compressor(轨迹记录压缩)。通过中间件链+intelligence hooks调用
|
||||
- **管家模式**: 双模式UI (默认简洁/解锁专业) + ButlerRouter 动态行业关键词(4内置+自定义) + <butler-context> XML fencing注入 + 跨会话连续性(痛点回访+经验检索) + 触发信号持久化(VikingStorage) + 冷启动4阶段hook
|
||||
- **聊天流**: 3种实现 → GatewayClient(WebSocket) / KernelClient(Tauri Event) / SaaSRelay(SSE) + 5min超时守护。详见 [ARCHITECTURE_BRIEF.md](docs/ARCHITECTURE_BRIEF.md)
|
||||
- **管家模式**: 双模式UI (默认简洁/解锁专业) + ButlerRouter 动态行业关键词(4内置+自定义) + <butler-context> XML fencing注入 + 跨会话连续性(痛点回访+经验检索) + 触发信号持久化(VikingStorage) + 冷启动4阶段hook + 跨会话身份(soul.md) + 动态建议(4路并行LLM驱动2续问+1关怀)
|
||||
- **聊天流**: 3种实现 → GatewayClient(WebSocket) / KernelClient(Tauri Event) / SaaSRelay(SSE) + 5min超时守护。动态建议: prefetch context + generateLLMSuggestions(1追问+1行动+1关怀) 与 memory extraction 解耦。详见 [ARCHITECTURE_BRIEF.md](docs/ARCHITECTURE_BRIEF.md)
|
||||
- **客户端路由**: `getClient()` 4分支决策树 → Admin路由 / SaaS Relay(可降级到本地) / Local Kernel / External Gateway
|
||||
- **SaaS 认证**: JWT→OS keyring 存储 + HttpOnly cookie + Token池 RPM/TPM 限流轮换 + SaaS unreachable 自动降级
|
||||
- **记忆闭环**: 对话→extraction_adapter→FTS5全文+TF-IDF权重→检索→注入系统提示(E2E 04-17 验证通过,去重+跨会话注入已修复)
|
||||
- **记忆闭环**: 对话→extraction_adapter→FTS5全文+TF-IDF权重→检索→注入系统提示 + 身份信号提取(agent_name/user_name)→VikingStorage→soul.md→跨会话名字记忆
|
||||
- **LLM 驱动**: 4 Rust Driver (Anthropic/OpenAI/Gemini/Local) + 国内兼容 (DeepSeek/Qwen/Moonshot 通过 base_url)
|
||||
|
||||
### 最近变更
|
||||
|
||||
1. [04-21] Embedding 接通 + 自学习自动化 A线+B线: 记忆检索Embedding(GrowthIntegration→MemoryRetriever→SemanticScorer) + Skill路由Embedding+LLM Fallback(替换new_tf_idf_only) + evolution_bridge(SkillCandidate→SkillManifest) + generate_and_register_skill()全链路 + EvolutionMiddleware双模式(auto/suggest) + QualityGate加固(长度/标题/置信度上限)。验证: 934 tests PASS
|
||||
2. [04-21] Phase 0+1 突破之路 8 项基础链路修复: 经验积累覆盖修复(reuse_count累积) + Skill工具调用桥接(complete_with_tools) + Hand字段映射(runId) + Heartbeat痛点感知 + Browser委托消息 + 跨会话检索增强(IdentityRecall 26→43模式+弱身份fallback) + Twitter凭据持久化。验证: 912 tests PASS
|
||||
2. [04-17] 全系统 E2E 测试 129 链路: 82 PASS / 20 PARTIAL / 1 FAIL / 26 SKIP,有效通过率 79.1%。7 项 Bug 修复 (Dashboard 404/记忆去重/记忆注入/invoice_id/Prompt版本/agent隔离/行业字段)
|
||||
2. [04-16] 3 项 P0 修复 + 5 项 E2E Bug 修复 + Agent 面板刷新 + TRUTH.md 数字校准
|
||||
3. [04-15] Heartbeat 统一健康系统: health_snapshot.rs 统一收集器(LLM连接/记忆/会话/系统资源) + heartbeat.rs HeartbeatManager 重构 + HealthPanel.tsx 前端面板 + Tauri 命令 182→183 + intelligence 模块 15→16 文件 + 删除 intelligence-client/ 9 废弃文件
|
||||
4. [04-12] 行业配置+管家主动性 全栈 5 Phase: 行业数据模型+4内置配置+ButlerRouter动态关键词+触发信号+Tauri加载+Admin管理页面+跨会话连续性+XML fencing注入格式
|
||||
5. [04-09] Hermes Intelligence Pipeline 4 Chunk: ExperienceStore+Extractor, UserProfileStore+Profiler, NlScheduleParser, TrajectoryRecorder+Compressor (684 tests, 0 failed)
|
||||
6. [04-09] 管家模式6交付物完成: ButlerRouter + 冷启动 + 简洁模式UI + 桥测试 + 发布文档
|
||||
1. [04-23] 回复效率+建议生成并行化: identity prompt 缓存 + pre-hook 并行(tokio::join!) + middleware 分波并行(parallel_safe, 5层✅) + suggestion context 预取 + 建议与 memory 解耦 + prompt 重写(1追问+1行动+1关怀)
|
||||
2. [04-23] 动态建议智能化: fetchSuggestionContext 4路并行(用户画像/痛点/经验/技能匹配) + generateLLMSuggestions 混合型 prompt (2续问+1管家关怀) + experience_find_relevant Tauri 命令 + ExperienceBrief
|
||||
3. [04-23] 跨会话身份: detectAgentNameSuggestion trigger+extract 两步法(10 trigger) + ProfileSignals agent_name/user_name + soul.md 写回 + Agent tab 移除 (~280 行 dead code 清理)
|
||||
4. [04-22] Wiki 全面重构: 5节模板+集成契约+症状导航+归档压缩,净减 ~1,200 行
|
||||
4. [04-22] 跨会话记忆断裂修复 + DataMasking 中间件移除 + 搜索功能修复(多引擎+质量过滤+SSE行缓冲)
|
||||
5. [04-21] Embedding 接通 + 自学习自动化 A线+B线 + Phase 0+1 突破之路 8 项链路修复。验证: 934 tests PASS
|
||||
6. [04-20] 50 轮功能链路审计 7 项断链修复 (42/50 = 84% 通过率)
|
||||
7. [04-17] 全系统 E2E 测试 129 链路: 82 PASS / 20 PARTIAL / 1 FAIL / 26 SKIP,有效通过率 79.1%
|
||||
|
||||
<!-- ARCH-SNAPSHOT-END -->
|
||||
|
||||
<!-- ARCH-SNAPSHOT-END -->
|
||||
|
||||
<!-- ANTI-PATTERN-START -->
|
||||
<!-- 此区域由 auto-sync 自动更新,请勿手动编辑。更新时间: 2026-04-09 -->
|
||||
<!-- 此区域由 auto-sync 自动更新,请勿手动编辑。更新时间: 2026-04-23 -->
|
||||
|
||||
## 14. AI 协作注意事项
|
||||
|
||||
### 反模式警告
|
||||
|
||||
- ❌ **不要**建议新增 SaaS API 端点 — 已有 140 个,稳定化约束禁止新增
|
||||
- ❌ **不要**建议新增 SaaS API 端点 — 已有 137 个,稳定化约束禁止新增
|
||||
- ❌ **不要**忽略管家模式 — 已上线且为默认模式,所有聊天经过 ButlerRouter
|
||||
- ❌ **不要**假设 Tauri 直连 LLM — 实际通过 SaaS Token 池中转,SaaS unreachable 时降级到本地 Kernel
|
||||
- ❌ **不要**建议从零实现已有能力 — 先查 Hand(9个)/Skill(75个)/Pipeline(17模板) 现有库
|
||||
- ❌ **不要**建议从零实现已有能力 — 先查 Hand(7注册)/Skill(75个)/Pipeline(18模板) 现有库
|
||||
- ❌ **不要**在 CLAUDE.md 以外创建项目级配置或规则文件 — 单一入口原则
|
||||
|
||||
### 场景化指令
|
||||
@@ -620,6 +647,75 @@ refactor(store): 统一 Store 数据获取方式
|
||||
- 当遇到**认证相关** → 记住 Tauri 模式用 OS keyring 存 JWT,SaaS 模式用 HttpOnly cookie
|
||||
- 当遇到**新功能建议** → 先查 [TRUTH.md](docs/TRUTH.md) 确认可用能力清单,避免重复建设
|
||||
- 当遇到**记忆/上下文相关** → 记住闭环已接通: FTS5+TF-IDF+embedding,不是空壳
|
||||
- 当遇到**管家/Butler** → 管家模式是默认模式,ButlerRouter 在中间件链中做关键词分类+system prompt 增强
|
||||
- 当遇到**管家/Butler** → 管家模式是默认模式,ButlerRouter 在中间件链中做关键词分类+system prompt 增强。跨会话身份走 soul.md,动态建议走 4 路并行上下文+LLM
|
||||
|
||||
<!-- ANTI-PATTERN-END -->
|
||||
|
||||
***
|
||||
|
||||
## 15. Karpathy 编码原则
|
||||
|
||||
> 源自 Andrej Karpathy 对 LLM 编码问题的观察。偏向谨慎而非速度,简单任务可灵活判断。
|
||||
|
||||
### 15.1 Think Before Coding
|
||||
|
||||
**Don't assume. Don't hide confusion. Surface tradeoffs.**
|
||||
|
||||
- State assumptions explicitly. If uncertain, ask.
|
||||
- If multiple interpretations exist, present them — don't pick silently.
|
||||
- If a simpler approach exists, say so. Push back when warranted.
|
||||
- If something is unclear, stop. Name what's confusing. Ask.
|
||||
|
||||
### 15.2 Simplicity First
|
||||
|
||||
**Minimum code that solves the problem. Nothing speculative.**
|
||||
|
||||
- No features beyond what was asked.
|
||||
- No abstractions for single-use code.
|
||||
- No "flexibility" or "configurability" that wasn't requested.
|
||||
- No error handling for impossible scenarios.
|
||||
- If you write 200 lines and it could be 50, rewrite it.
|
||||
|
||||
Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
|
||||
|
||||
### 15.3 Surgical Changes
|
||||
|
||||
**Touch only what you must. Clean up only your own mess.**
|
||||
|
||||
When editing existing code:
|
||||
|
||||
- Don't "improve" adjacent code, comments, or formatting.
|
||||
- Don't refactor things that aren't broken.
|
||||
- Match existing style, even if you'd do it differently.
|
||||
- If you notice unrelated dead code, mention it — don't delete it.
|
||||
|
||||
When your changes create orphans:
|
||||
|
||||
- Remove imports/variables/functions that YOUR changes made unused.
|
||||
- Don't remove pre-existing dead code unless asked.
|
||||
|
||||
The test: Every changed line should trace directly to the user's request.
|
||||
|
||||
### 15.4 Goal-Driven Execution
|
||||
|
||||
**Define success criteria. Loop until verified.**
|
||||
|
||||
Transform tasks into verifiable goals:
|
||||
|
||||
- "Add validation" → "Write tests for invalid inputs, then make them pass"
|
||||
- "Fix the bug" → "Write a test that reproduces it, then make it pass"
|
||||
- "Refactor X" → "Ensure tests pass before and after"
|
||||
|
||||
For multi-step tasks, state a brief plan:
|
||||
|
||||
```
|
||||
1. [Step] → verify: [check]
|
||||
2. [Step] → verify: [check]
|
||||
3. [Step] → verify: [check]
|
||||
```
|
||||
|
||||
Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
|
||||
|
||||
---
|
||||
|
||||
**These guidelines are working if:** fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.
|
||||
|
||||
@@ -117,7 +117,9 @@ impl Kernel {
|
||||
}
|
||||
}
|
||||
|
||||
use zclaw_runtime::{AgentLoop, tool::builtin::PathValidator};
|
||||
use std::sync::Arc;
|
||||
use zclaw_runtime::{AgentLoop, LlmDriver, tool::builtin::PathValidator};
|
||||
use zclaw_runtime::driver::{RetryDriver, RetryConfig};
|
||||
|
||||
use super::Kernel;
|
||||
use super::super::MessageResponse;
|
||||
@@ -161,9 +163,12 @@ impl Kernel {
|
||||
let subagent_enabled = chat_mode.as_ref().and_then(|m| m.subagent_enabled).unwrap_or(false);
|
||||
let tools = self.create_tool_registry(subagent_enabled);
|
||||
self.skill_executor.set_tool_registry(tools.clone());
|
||||
let driver: Arc<dyn LlmDriver> = Arc::new(
|
||||
RetryDriver::new(self.driver.clone(), RetryConfig::default())
|
||||
);
|
||||
let mut loop_runner = AgentLoop::new(
|
||||
*agent_id,
|
||||
self.driver.clone(),
|
||||
driver,
|
||||
tools,
|
||||
self.memory.clone(),
|
||||
)
|
||||
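Review note: the hunk above wraps `self.driver` in `RetryDriver::new(..., RetryConfig::default())`, but the diff does not show how `RetryDriver` decides when to retry. The following is a minimal, std-only sketch of what such a wrapper could do, assuming errors carry a `retryable` flag and an optional `retry_after` hint like the classifier output elsewhere in this diff. `RetryPolicy`, `CallError`, `backoff_delay`, and `retry_with_backoff` are hypothetical names for illustration, not the `zclaw_runtime` API, and the real driver is presumably async rather than blocking.

```rust
use std::time::Duration;

/// Hypothetical retry settings; not the real `RetryConfig` from zclaw_runtime.
#[derive(Debug, Clone, Copy)]
pub struct RetryPolicy {
    pub max_attempts: u32,
    pub base_delay: Duration,
    pub max_delay: Duration,
}

/// Hypothetical error shape: whether the failure is worth retrying, plus an
/// optional server-provided wait hint.
#[derive(Debug, Clone, PartialEq)]
pub struct CallError {
    pub retryable: bool,
    pub retry_after: Option<Duration>,
    pub message: String,
}

/// Delay before retry number `attempt` (1-based): the server hint if present,
/// otherwise exponential backoff, both capped at `max_delay`.
pub fn backoff_delay(policy: &RetryPolicy, attempt: u32, hint: Option<Duration>) -> Duration {
    if let Some(d) = hint {
        return d.min(policy.max_delay);
    }
    let factor = 2u32.saturating_pow(attempt.saturating_sub(1));
    policy.base_delay.saturating_mul(factor).min(policy.max_delay)
}

/// Calls `op` until it succeeds, fails with a non-retryable error, or the
/// attempt budget is exhausted. `sleep` is injected so callers (and tests)
/// control how waiting happens.
pub fn retry_with_backoff<T>(
    policy: &RetryPolicy,
    mut op: impl FnMut() -> Result<T, CallError>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, CallError> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(e) if e.retryable && attempt < policy.max_attempts => {
                sleep(backoff_delay(policy, attempt, e.retry_after));
                attempt += 1;
            }
            Err(e) => return Err(e),
        }
    }
}
```

Injecting `sleep` keeps the sketch testable without real waiting; an async variant would await a timer at the same point.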
@@ -275,9 +280,12 @@ impl Kernel {
|
||||
let subagent_enabled = chat_mode.as_ref().and_then(|m| m.subagent_enabled).unwrap_or(false);
|
||||
let tools = self.create_tool_registry(subagent_enabled);
|
||||
self.skill_executor.set_tool_registry(tools.clone());
|
||||
let driver: Arc<dyn LlmDriver> = Arc::new(
|
||||
RetryDriver::new(self.driver.clone(), RetryConfig::default())
|
||||
);
|
||||
let mut loop_runner = AgentLoop::new(
|
||||
*agent_id,
|
||||
self.driver.clone(),
|
||||
driver,
|
||||
tools,
|
||||
self.memory.clone(),
|
||||
)
|
||||
@@ -426,6 +434,7 @@ impl Kernel {
|
||||
prompt.push_str("- Provide clear options when possible\n");
|
||||
prompt.push_str("- Include brief context about why you're asking\n");
|
||||
prompt.push_str("- After receiving clarification, proceed immediately\n");
|
||||
prompt.push_str("- CRITICAL: When calling ask_clarification, do NOT repeat the options in your text response. The options will be shown in a dedicated card above your reply. Simply greet the user and briefly explain why you need clarification — avoid phrases like \"以下信息\" or \"the following options\" that imply a list follows in your text\n");
|
||||
|
||||
prompt
|
||||
}
|
||||
|
||||
@@ -31,6 +31,8 @@ async fn seam_hand_tool_routing() {
|
||||
input_tokens: 10,
|
||||
output_tokens: 20,
|
||||
stop_reason: "tool_use".to_string(),
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
},
|
||||
])
|
||||
// Second stream: final text after tool executes
|
||||
@@ -40,6 +42,8 @@ async fn seam_hand_tool_routing() {
|
||||
input_tokens: 10,
|
||||
output_tokens: 5,
|
||||
stop_reason: "end_turn".to_string(),
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
},
|
||||
]);
|
||||
|
||||
@@ -105,6 +109,8 @@ async fn seam_hand_execution_callback() {
|
||||
input_tokens: 10,
|
||||
output_tokens: 5,
|
||||
stop_reason: "tool_use".to_string(),
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
},
|
||||
])
|
||||
.with_stream_chunks(vec![
|
||||
@@ -113,6 +119,8 @@ async fn seam_hand_execution_callback() {
|
||||
input_tokens: 5,
|
||||
output_tokens: 1,
|
||||
stop_reason: "end_turn".to_string(),
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
},
|
||||
]);
|
||||
|
||||
@@ -173,6 +181,8 @@ async fn seam_generic_tool_routing() {
|
||||
input_tokens: 10,
|
||||
output_tokens: 5,
|
||||
stop_reason: "tool_use".to_string(),
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
},
|
||||
])
|
||||
.with_stream_chunks(vec![
|
||||
@@ -181,6 +191,8 @@ async fn seam_generic_tool_routing() {
|
||||
input_tokens: 5,
|
||||
output_tokens: 3,
|
||||
stop_reason: "end_turn".to_string(),
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
},
|
||||
]);
|
||||
|
||||
|
||||
@@ -27,6 +27,8 @@ async fn smoke_hands_full_lifecycle() {
|
||||
input_tokens: 15,
|
||||
output_tokens: 10,
|
||||
stop_reason: "tool_use".to_string(),
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
},
|
||||
])
|
||||
// After hand_quiz returns, LLM generates final response
|
||||
@@ -36,6 +38,8 @@ async fn smoke_hands_full_lifecycle() {
|
||||
input_tokens: 20,
|
||||
output_tokens: 5,
|
||||
stop_reason: "end_turn".to_string(),
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
},
|
||||
]);
|
||||
|
||||
|
||||
@@ -14,6 +14,7 @@
|
||||
|
||||
use std::sync::Arc;
|
||||
use std::sync::atomic::{AtomicU64, Ordering};
|
||||
use serde_json::Value;
|
||||
use zclaw_types::{AgentId, Message, SessionId};
|
||||
|
||||
use crate::driver::{CompletionRequest, ContentBlock, LlmDriver};
|
||||
@@ -136,7 +137,7 @@ pub fn update_calibration(estimated: usize, actual: u32) {
|
||||
}
|
||||
|
||||
/// Estimate total tokens for messages with calibration applied.
|
||||
fn estimate_messages_tokens_calibrated(messages: &[Message]) -> usize {
|
||||
pub fn estimate_messages_tokens_calibrated(messages: &[Message]) -> usize {
|
||||
let raw = estimate_messages_tokens(messages);
|
||||
let factor = get_calibration_factor();
|
||||
if (factor - 1.0).abs() < f64::EPSILON {
|
||||
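Review note: this hunk makes `estimate_messages_tokens_calibrated` public, and its visible context applies a factor from `get_calibration_factor()` to the raw estimate. The bodies of `update_calibration` and `get_calibration_factor` are not in the diff, so the sketch below only illustrates one way such a calibration could work, assuming the factor is kept as `f64` bits in an `AtomicU64` (the file does import `AtomicU64`) and updated with a moving average. The names `get_factor`, `update_factor`, `calibrate`, the smoothing weight, and the clamp bounds are all assumptions.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Calibration factor stored as raw f64 bits; 0x3FF0_0000_0000_0000 is 1.0,
/// meaning "no correction".
static CALIBRATION_BITS: AtomicU64 = AtomicU64::new(0x3FF0_0000_0000_0000);

/// Weight given to each new observation (assumed value for this sketch).
const SMOOTHING: f64 = 0.25;

/// Current multiplier applied to raw token estimates.
pub fn get_factor() -> f64 {
    f64::from_bits(CALIBRATION_BITS.load(Ordering::Relaxed))
}

/// Blend the observed actual/estimated ratio into the stored factor.
pub fn update_factor(estimated: usize, actual: u32) {
    if estimated == 0 || actual == 0 {
        return; // nothing meaningful to learn from
    }
    let observed = actual as f64 / estimated as f64;
    let blended = get_factor() * (1.0 - SMOOTHING) + observed * SMOOTHING;
    // Clamp so a single outlier cannot skew later estimates (bounds are assumed).
    let clamped = blended.clamp(0.5, 2.0);
    CALIBRATION_BITS.store(clamped.to_bits(), Ordering::Relaxed);
}

/// Apply the factor to a raw estimate, skipping the math when it is exactly 1.0.
pub fn calibrate(raw: usize) -> usize {
    let factor = get_factor();
    if (factor - 1.0).abs() < f64::EPSILON {
        return raw;
    }
    (raw as f64 * factor).ceil() as usize
}
```

The load-then-store update is not atomic as a whole; for a heuristic estimate that is usually acceptable, but a compare-and-swap loop would be needed if lost updates matter.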
@@ -178,7 +179,7 @@ pub fn compact_messages(messages: Vec<Message>, keep_recent: usize) -> (Vec<Mess
|
||||
let old_messages = &messages[..split_index];
|
||||
let recent_messages = &messages[split_index..];
|
||||
|
||||
let summary = generate_summary(old_messages);
|
||||
let summary = generate_summary(old_messages, None);
|
||||
let removed_count = old_messages.len();
|
||||
|
||||
let mut compacted = Vec::with_capacity(1 + recent_messages.len());
|
||||
@@ -188,6 +189,38 @@ pub fn compact_messages(messages: Vec<Message>, keep_recent: usize) -> (Vec<Mess
|
||||
(compacted, removed_count)
|
||||
}
|
||||
|
||||
// Thresholds are compared against `str::len()`, so they count UTF-8 bytes, not characters.
const PRUNE_AGE_THRESHOLD: usize = 8;
const PRUNE_MAX_CHARS: usize = 2000;
const PRUNE_KEEP_HEAD_CHARS: usize = 500;

/// Prune old tool outputs to reduce token consumption. Runs before compaction.
/// Only prunes ToolResult messages older than PRUNE_AGE_THRESHOLD messages.
pub fn prune_tool_outputs(messages: &mut [Message]) -> usize {
|
||||
let total = messages.len();
|
||||
let mut pruned_count = 0;
|
||||
|
||||
for i in 0..total.saturating_sub(PRUNE_AGE_THRESHOLD) {
|
||||
if let Message::ToolResult { output, is_error, .. } = &mut messages[i] {
|
||||
if *is_error { continue; }
|
||||
|
||||
let text = match output {
|
||||
Value::String(ref s) => s.clone(),
|
||||
ref other => other.to_string(),
|
||||
};
|
||||
if text.len() <= PRUNE_MAX_CHARS { continue; }
|
||||
|
||||
let end = text.floor_char_boundary(PRUNE_KEEP_HEAD_CHARS.min(text.len()));
|
||||
*output = serde_json::json!({
|
||||
"_pruned": true,
|
||||
"_original_chars": text.len(),
|
||||
"head": &text[..end],
|
||||
});
|
||||
pruned_count += 1;
|
||||
}
|
||||
}
|
||||
pruned_count
|
||||
}
|
||||
|
||||
/// Check if compaction should be triggered and perform it if needed.
|
||||
///
|
||||
/// Returns the (possibly compacted) message list.
|
||||
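Review note: `prune_tool_outputs` above keeps only the head of an oversized tool output and relies on `str::floor_char_boundary` so the cut never lands inside a multi-byte character. The sketch below shows the same idea using only `str::is_char_boundary`, which may help when reading the logic in isolation. `floor_boundary` and `pruned_head` are hypothetical helper names, and both limits are byte counts, matching the `text.len()` comparison in the hunk.

```rust
/// Largest index <= `max_bytes` that falls on a UTF-8 character boundary.
pub fn floor_boundary(text: &str, max_bytes: usize) -> usize {
    let mut end = max_bytes.min(text.len());
    // Index 0 is always a boundary, so this loop terminates.
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    end
}

/// Returns `None` when `text` fits within `max_bytes`; otherwise returns the
/// head of `text`, at most `keep_bytes` long and cut on a character boundary.
pub fn pruned_head(text: &str, max_bytes: usize, keep_bytes: usize) -> Option<&str> {
    if text.len() <= max_bytes {
        return None;
    }
    Some(&text[..floor_boundary(text, keep_bytes)])
}
```

With CJK text, where each character is typically three bytes, the kept head can be up to two bytes shorter than `keep_bytes`.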
@@ -315,6 +348,18 @@ pub async fn maybe_compact_with_config(
|
||||
.iter()
|
||||
.take_while(|m| matches!(m, Message::System { .. }))
|
||||
.count();
|
||||
|
||||
// Extract previous summary from leading system messages for iterative summarization
|
||||
let previous_summary = messages.iter()
|
||||
.take(leading_system_count)
|
||||
.filter_map(|m| match m {
|
||||
Message::System { content } if content.starts_with("[以下是之前对话的摘要]") => {
|
||||
Some(content.clone())
|
||||
}
|
||||
_ => None,
|
||||
})
|
||||
.next();
|
||||
|
||||
let keep_from_end = DEFAULT_KEEP_RECENT
|
||||
.min(messages.len().saturating_sub(leading_system_count));
|
||||
let split_index = messages.len().saturating_sub(keep_from_end);
|
||||
@@ -333,14 +378,16 @@ pub async fn maybe_compact_with_config(
|
||||
let recent_messages = &messages[split_index..];
|
||||
let removed_count = old_messages.len();
|
||||
|
||||
// Step 3: Generate summary (LLM or rule-based)
|
||||
// Step 3: Generate summary (LLM or rule-based), with iterative context
|
||||
let prev_ref = previous_summary.as_deref();
|
||||
let summary = if config.use_llm {
|
||||
if let Some(driver) = driver {
|
||||
match generate_llm_summary(driver, old_messages, config.summary_max_tokens).await {
|
||||
match generate_llm_summary(driver, old_messages, prev_ref, config.summary_max_tokens).await {
|
||||
Ok(llm_summary) => {
|
||||
tracing::info!(
|
||||
"[Compaction] Generated LLM summary ({} chars)",
|
||||
llm_summary.len()
|
||||
"[Compaction] Generated LLM summary ({} chars, iterative={})",
|
||||
llm_summary.len(),
|
||||
previous_summary.is_some()
|
||||
);
|
||||
llm_summary
|
||||
}
|
||||
@@ -350,7 +397,7 @@ pub async fn maybe_compact_with_config(
|
||||
"[Compaction] LLM summary failed: {}, falling back to rules",
|
||||
e
|
||||
);
|
||||
generate_summary(old_messages)
|
||||
generate_summary(old_messages, prev_ref)
|
||||
} else {
|
||||
tracing::warn!(
|
||||
"[Compaction] LLM summary failed: {}, returning original messages",
|
||||
@@ -369,10 +416,10 @@ pub async fn maybe_compact_with_config(
|
||||
tracing::warn!(
|
||||
"[Compaction] LLM compaction requested but no driver available, using rules"
|
||||
);
|
||||
generate_summary(old_messages)
|
||||
generate_summary(old_messages, prev_ref)
|
||||
}
|
||||
} else {
|
||||
generate_summary(old_messages)
|
||||
generate_summary(old_messages, prev_ref)
|
||||
};
|
||||
|
||||
let used_llm = config.use_llm && driver.is_some();
|
||||
@@ -398,9 +445,11 @@ pub async fn maybe_compact_with_config(
|
||||
}
|
||||
|
||||
/// Generate a summary using an LLM driver.
|
||||
/// If `previous_summary` is provided, builds on it iteratively.
|
||||
async fn generate_llm_summary(
|
||||
driver: &Arc<dyn LlmDriver>,
|
||||
messages: &[Message],
|
||||
previous_summary: Option<&str>,
|
||||
max_tokens: u32,
|
||||
) -> Result<String, String> {
|
||||
let mut conversation_text = String::new();
|
||||
@@ -437,11 +486,21 @@ async fn generate_llm_summary(
|
||||
conversation_text.push_str("\n...(对话已截断)");
|
||||
}
|
||||
|
||||
let prompt = format!(
|
||||
"请用简洁的中文总结以下对话的关键信息。保留重要的讨论主题、决策、结论和待办事项。\
|
||||
输出格式为段落式摘要,不超过200字。\n\n{}",
|
||||
conversation_text
|
||||
);
|
||||
let prompt = match previous_summary {
|
||||
Some(prev) => format!(
|
||||
"你是一个对话摘要助手。\n\n\
|
||||
## 上一轮摘要\n{}\n\n\
|
||||
## 新增对话内容\n{}\n\n\
|
||||
请在上一轮摘要的基础上更新,保留所有关键决策、用户偏好和文件操作。\
|
||||
输出200字以内的中文摘要。",
|
||||
prev, conversation_text
|
||||
),
|
||||
None => format!(
|
||||
"请用简洁的中文总结以下对话的关键信息。保留重要的讨论主题、决策、结论和待办事项。\
|
||||
输出格式为段落式摘要,不超过200字。\n\n{}",
|
||||
conversation_text
|
||||
),
|
||||
};
|
||||
|
||||
let request = CompletionRequest {
|
||||
model: String::new(),
|
||||
@@ -484,13 +543,22 @@ async fn generate_llm_summary(
|
||||
}
|
||||
|
||||
/// Generate a rule-based summary of old messages.
|
||||
fn generate_summary(messages: &[Message]) -> String {
|
||||
/// If `previous_summary` is provided, carries forward key info.
|
||||
fn generate_summary(messages: &[Message], previous_summary: Option<&str>) -> String {
|
||||
if messages.is_empty() {
|
||||
return "[对话开始]".to_string();
|
||||
}
|
||||
|
||||
let mut sections: Vec<String> = vec!["[以下是之前对话的摘要]".to_string()];
|
||||
|
||||
// Carry forward previous summary if available
|
||||
if let Some(prev) = previous_summary {
|
||||
// Strip the header line from previous summary for cleaner nesting
|
||||
let prev_body = prev.strip_prefix("[以下是之前对话的摘要]\n")
|
||||
.unwrap_or(prev);
|
||||
sections.push(format!("[上轮摘要保留]: {}", truncate(prev_body, 200)));
|
||||
}
|
||||
|
||||
let mut user_count = 0;
|
||||
let mut assistant_count = 0;
|
||||
let mut topics: Vec<String> = Vec::new();
|
||||
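Review note: the hunk above makes the rule-based summary iterative by stripping the header from the previous summary and carrying a truncated copy forward. The sketch below isolates that step so it can be read and tested on its own. The `truncate` helper called in the diff is not shown, so `truncate_chars` is a hypothetical stand-in (character-based, appending `...` when it cuts), and `summary_sections` and `find_previous_summary` are illustrative names rather than functions from the crate.

```rust
/// Header that marks a compaction summary (copied from the diff).
const SUMMARY_HEADER: &str = "[以下是之前对话的摘要]";

/// Hypothetical stand-in for the `truncate` helper: keeps at most `max_chars`
/// characters and appends "..." only if something was cut.
fn truncate_chars(text: &str, max_chars: usize) -> String {
    match text.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => format!("{}...", &text[..byte_idx]),
        None => text.to_string(),
    }
}

/// Opening sections of a rule-based summary: the header, plus the body of the
/// previous summary (header line stripped, truncated) when one exists.
pub fn summary_sections(previous_summary: Option<&str>) -> Vec<String> {
    let mut sections = vec![SUMMARY_HEADER.to_string()];
    if let Some(prev) = previous_summary {
        let body = prev
            .strip_prefix("[以下是之前对话的摘要]\n")
            .unwrap_or(prev);
        sections.push(format!("[上轮摘要保留]: {}", truncate_chars(body, 200)));
    }
    sections
}

/// First leading system message that looks like an earlier summary, if any.
pub fn find_previous_summary<'a>(leading_system: &[&'a str]) -> Option<&'a str> {
    leading_system
        .iter()
        .copied()
        .find(|content| content.starts_with(SUMMARY_HEADER))
}
```

Because each round truncates the carried-forward body, information from much older rounds is gradually squeezed out; that seems to be the intended trade-off for the rule-based path, while the LLM path is asked to merge the old summary with the new messages.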
@@ -696,8 +764,21 @@ mod tests {
|
||||
Message::user("How does ownership work?"),
|
||||
Message::assistant("Ownership is Rust's memory management system"),
|
||||
];
|
||||
let summary = generate_summary(&messages);
|
||||
let summary = generate_summary(&messages, None);
|
||||
assert!(summary.contains("摘要"));
|
||||
assert!(summary.contains("2"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_generate_summary_iterative() {
|
||||
let messages = vec![
|
||||
Message::user("What is async/await?"),
|
||||
Message::assistant("Async/await is a concurrency model"),
|
||||
];
|
||||
let prev = "[以下是之前对话的摘要]\n讨论主题: Rust; 所有权\n(已压缩 4 条消息)";
|
||||
let summary = generate_summary(&messages, Some(prev));
|
||||
assert!(summary.contains("摘要"));
|
||||
assert!(summary.contains("上轮摘要保留"));
|
||||
assert!(summary.contains("所有权"));
|
||||
}
|
||||
}
|
||||
|
||||
@@ -121,6 +121,8 @@ impl LlmDriver for AnthropicDriver {
|
||||
let mut byte_stream = response.bytes_stream();
|
||||
let mut current_tool_id: Option<String> = None;
|
||||
let mut tool_input_buffer = String::new();
|
||||
let mut cache_creation_input_tokens: Option<u32> = None;
|
||||
let mut cache_read_input_tokens: Option<u32> = None;
|
||||
|
||||
while let Some(chunk_result) = byte_stream.next().await {
|
||||
let chunk = match chunk_result {
|
||||
@@ -141,6 +143,15 @@ impl LlmDriver for AnthropicDriver {
|
||||
match serde_json::from_str::<AnthropicStreamEvent>(data) {
|
||||
Ok(event) => {
|
||||
match event.event_type.as_str() {
|
||||
"message_start" => {
|
||||
// Capture cache token info from message_start event
|
||||
if let Some(msg) = event.message {
|
||||
if let Some(usage) = msg.usage {
|
||||
cache_creation_input_tokens = usage.cache_creation_input_tokens;
|
||||
cache_read_input_tokens = usage.cache_read_input_tokens;
|
||||
}
|
||||
}
|
||||
}
|
||||
"content_block_delta" => {
|
||||
if let Some(delta) = event.delta {
|
||||
if let Some(text) = delta.text {
|
||||
@@ -186,6 +197,8 @@ impl LlmDriver for AnthropicDriver {
|
||||
input_tokens: msg.usage.as_ref().map(|u| u.input_tokens).unwrap_or(0),
|
||||
output_tokens: msg.usage.as_ref().map(|u| u.output_tokens).unwrap_or(0),
|
||||
stop_reason: msg.stop_reason.unwrap_or_else(|| "end_turn".to_string()),
|
||||
cache_creation_input_tokens,
|
||||
cache_read_input_tokens,
|
||||
});
|
||||
}
|
||||
}
|
||||
@@ -298,7 +311,15 @@ impl AnthropicDriver {
|
||||
AnthropicRequest {
|
||||
model: request.model.clone(),
|
||||
max_tokens: effective_max,
|
||||
system: request.system.clone(),
|
||||
system: request.system.as_ref().map(|s| {
|
||||
vec![SystemContentBlock {
|
||||
r#type: "text".to_string(),
|
||||
text: s.clone(),
|
||||
cache_control: Some(CacheControl {
|
||||
r#type: "ephemeral".to_string(),
|
||||
}),
|
||||
}]
|
||||
}),
|
||||
messages,
|
||||
tools: if tools.is_empty() { None } else { Some(tools) },
|
||||
temperature: request.temperature,
|
||||
@@ -337,18 +358,35 @@ impl AnthropicDriver {
|
||||
input_tokens: api_response.usage.input_tokens,
|
||||
output_tokens: api_response.usage.output_tokens,
|
||||
stop_reason,
|
||||
cache_creation_input_tokens: api_response.usage.cache_creation_input_tokens,
|
||||
cache_read_input_tokens: api_response.usage.cache_read_input_tokens,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Anthropic API types
|
||||
|
||||
/// Anthropic `cache_control` marker
|
||||
#[derive(Serialize, Clone)]
|
||||
struct CacheControl {
|
||||
r#type: String, // "ephemeral"
|
||||
}
|
||||
|
||||
/// Anthropic system prompt content block (supports `cache_control`)
|
||||
#[derive(Serialize, Clone)]
|
||||
struct SystemContentBlock {
|
||||
r#type: String, // "text"
|
||||
text: String,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
cache_control: Option<CacheControl>,
|
||||
}
|
||||
|
||||
#[derive(Serialize)]
|
||||
struct AnthropicRequest {
|
||||
model: String,
|
||||
max_tokens: u32,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
system: Option<String>,
|
||||
system: Option<Vec<SystemContentBlock>>,
|
||||
messages: Vec<AnthropicMessage>,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
tools: Option<Vec<AnthropicTool>>,
|
||||
@@ -404,6 +442,10 @@ struct AnthropicContentBlock {
|
||||
struct AnthropicUsage {
|
||||
input_tokens: u32,
|
||||
output_tokens: u32,
|
||||
#[serde(default)]
|
||||
cache_creation_input_tokens: Option<u32>,
|
||||
#[serde(default)]
|
||||
cache_read_input_tokens: Option<u32>,
|
||||
}
|
||||
|
||||
// Streaming types
|
||||
@@ -458,4 +500,8 @@ struct AnthropicStreamUsage {
|
||||
input_tokens: u32,
|
||||
#[serde(default)]
|
||||
output_tokens: u32,
|
||||
#[serde(default)]
|
||||
cache_creation_input_tokens: Option<u32>,
|
||||
#[serde(default)]
|
||||
cache_read_input_tokens: Option<u32>,
|
||||
}
|
||||
|
||||
139
crates/zclaw-runtime/src/driver/error_classifier.rs
Normal file
@@ -0,0 +1,139 @@
|
||||
//! LLM error classifier. Maps an HTTP status code plus the error body to an `LlmErrorKind`.
|
||||
|
||||
use std::time::Duration;
|
||||
use zclaw_types::{LlmErrorKind, ClassifiedLlmError};
|
||||
|
||||
/// Classify an LLM error
|
||||
pub fn classify_llm_error(
|
||||
provider: &str,
|
||||
status: u16,
|
||||
body: &str,
|
||||
is_timeout: bool,
|
||||
) -> ClassifiedLlmError {
|
||||
let _ = provider; // reserved for per-provider overrides
|
||||
|
||||
if is_timeout {
|
||||
return ClassifiedLlmError {
|
||||
kind: LlmErrorKind::Timeout,
|
||||
retryable: true,
|
||||
should_compress: false,
|
||||
should_rotate_credential: false,
|
||||
retry_after: None,
|
||||
message: "请求超时".to_string(),
|
||||
};
|
||||
}
|
||||
|
||||
match status {
|
||||
401 | 403 => ClassifiedLlmError {
|
||||
kind: LlmErrorKind::Auth,
|
||||
retryable: false,
|
||||
should_compress: false,
|
||||
should_rotate_credential: true,
|
||||
retry_after: None,
|
||||
message: "认证失败,请检查 API Key".to_string(),
|
||||
},
|
||||
402 => {
|
||||
let is_quota_transient = body.contains("retry")
|
||||
|| body.contains("limit")
|
||||
|| body.contains("usage");
|
||||
ClassifiedLlmError {
|
||||
kind: if is_quota_transient { LlmErrorKind::RateLimited } else { LlmErrorKind::BillingExhausted },
|
||||
retryable: is_quota_transient,
|
||||
should_compress: false,
|
||||
should_rotate_credential: !is_quota_transient,
|
||||
retry_after: if is_quota_transient { Some(Duration::from_secs(30)) } else { None },
|
||||
message: if is_quota_transient { "使用限制,稍后重试".to_string() } else { "计费额度已耗尽".to_string() },
|
||||
}
|
||||
}
|
||||
429 => ClassifiedLlmError {
|
||||
kind: LlmErrorKind::RateLimited,
|
||||
retryable: true,
|
||||
should_compress: false,
|
||||
should_rotate_credential: true,
|
||||
retry_after: parse_retry_after(body),
|
||||
message: "速率限制".to_string(),
|
||||
},
|
||||
529 => ClassifiedLlmError {
|
||||
kind: LlmErrorKind::Overloaded,
|
||||
retryable: true,
|
||||
should_compress: false,
|
||||
should_rotate_credential: false,
|
||||
retry_after: Some(Duration::from_secs(5)),
|
||||
message: "提供商过载".to_string(),
|
||||
},
|
||||
500 | 502 => ClassifiedLlmError {
|
||||
kind: LlmErrorKind::ServerError,
|
||||
retryable: true,
|
||||
should_compress: false,
|
||||
should_rotate_credential: false,
|
||||
retry_after: None,
|
||||
message: "服务端错误".to_string(),
|
||||
},
|
||||
503 => ClassifiedLlmError {
|
||||
kind: LlmErrorKind::Overloaded,
|
||||
retryable: true,
|
||||
should_compress: false,
|
||||
should_rotate_credential: false,
|
||||
retry_after: Some(Duration::from_secs(3)),
|
||||
message: "服务暂时不可用".to_string(),
|
||||
},
|
||||
400 => {
|
||||
let is_context_overflow = body.contains("context_length")
|
||||
|| body.contains("max_tokens")
|
||||
|| body.contains("too many tokens")
|
||||
|| body.contains("prompt is too long");
|
||||
ClassifiedLlmError {
|
||||
kind: if is_context_overflow { LlmErrorKind::ContextOverflow } else { LlmErrorKind::Unknown },
|
||||
retryable: false,
|
||||
should_compress: is_context_overflow,
|
||||
should_rotate_credential: false,
|
||||
retry_after: None,
|
||||
message: if is_context_overflow {
|
||||
"上下文过长,需要压缩".to_string()
|
||||
} else {
|
||||
format!("请求错误: {}", &body[..body.len().min(200)])
|
||||
},
|
||||
}
|
||||
}
|
||||
404 => ClassifiedLlmError {
|
||||
kind: LlmErrorKind::ModelNotFound,
|
||||
retryable: false,
|
||||
should_compress: false,
|
||||
should_rotate_credential: false,
|
||||
retry_after: None,
|
||||
message: "模型不存在".to_string(),
|
||||
},
|
||||
_ => ClassifiedLlmError {
|
||||
kind: LlmErrorKind::Unknown,
|
||||
retryable: true,
|
||||
should_compress: false,
|
||||
should_rotate_credential: false,
|
||||
retry_after: None,
|
||||
message: format!("未知错误 ({}) {}", status, &body[..body.len().min(200)]),
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
fn parse_retry_after(body: &str) -> Option<Duration> {
|
||||
// Anthropic: "Please retry after X seconds"
|
||||
// OpenAI: "Please retry after Xms"
|
||||
if let Some(secs) = extract_retry_seconds(body) {
|
||||
return Some(Duration::from_secs(secs));
|
||||
}
|
||||
if let Some(ms) = extract_retry_millis(body) {
|
||||
return Some(Duration::from_millis(ms));
|
||||
}
|
||||
Some(Duration::from_secs(2))
|
||||
}
|
||||
|
||||
fn extract_retry_seconds(body: &str) -> Option<u64> {
|
||||
let re = regex::Regex::new(r"retry\s+(?:after\s+)?(\d+)\s*(?:s|sec|seconds?)").ok()?;
|
||||
let caps = re.captures(body)?;
|
||||
caps[1].parse().ok()
|
||||
}
|
||||
|
||||
fn extract_retry_millis(body: &str) -> Option<u64> {
|
||||
let re = regex::Regex::new(r"retry\s+(?:after\s+)?(\d+)\s*ms").ok()?;
|
||||
let caps = re.captures(body)?;
|
||||
caps[1].parse().ok()
|
||||
}
|
||||
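Review note on `classify_llm_error`: the two `format!` calls truncate the body with `&body[..body.len().min(200)]`. Slicing a `str` at a byte offset that falls inside a multi-byte character panics, and provider error bodies may well contain non-ASCII text. Below is a minimal sketch of a boundary-safe truncation. `truncate_on_char_boundary` is a hypothetical helper introduced here for illustration; it is not part of this diff.

```rust
/// Hypothetical helper (not in the diff): returns the longest prefix of `s`
/// that is at most `max_bytes` long and ends on a UTF-8 character boundary.
fn truncate_on_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Offset 0 is always a character boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

Under that assumption, the truncated argument could be written as `truncate_on_char_boundary(body, 200)` wherever the byte-range slice appears. The same pattern shows up in the OpenAI driver hunk, where `args` is sliced at 200 and 500 bytes.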
@@ -238,6 +238,8 @@ impl LlmDriver for GeminiDriver {
|
||||
input_tokens,
|
||||
output_tokens,
|
||||
stop_reason: stop_reason.to_string(),
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
});
|
||||
}
|
||||
}
|
||||
@@ -500,6 +502,8 @@ impl GeminiDriver {
|
||||
input_tokens,
|
||||
output_tokens,
|
||||
stop_reason,
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -238,6 +238,8 @@ impl LocalDriver {
|
||||
input_tokens,
|
||||
output_tokens,
|
||||
stop_reason,
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
}
|
||||
}
|
||||
|
||||
@@ -396,6 +398,8 @@ impl LlmDriver for LocalDriver {
|
||||
input_tokens: 0,
|
||||
output_tokens: 0,
|
||||
stop_reason: "end_turn".to_string(),
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
});
|
||||
continue;
|
||||
}
|
||||
|
||||
@@ -15,11 +15,14 @@ mod anthropic;
|
||||
mod openai;
|
||||
mod gemini;
|
||||
mod local;
|
||||
mod error_classifier;
|
||||
mod retry_driver;
|
||||
|
||||
pub use anthropic::AnthropicDriver;
|
||||
pub use openai::OpenAiDriver;
|
||||
pub use gemini::GeminiDriver;
|
||||
pub use local::LocalDriver;
|
||||
pub use retry_driver::{RetryDriver, RetryConfig};
|
||||
|
||||
/// LLM Driver trait - unified interface for all providers
|
||||
#[async_trait]
|
||||
@@ -106,6 +109,12 @@ pub struct CompletionResponse {
|
||||
pub output_tokens: u32,
|
||||
/// Stop reason
|
||||
pub stop_reason: StopReason,
|
||||
/// Cache creation input tokens (Anthropic prompt caching)
|
||||
#[serde(default)]
|
||||
pub cache_creation_input_tokens: Option<u32>,
|
||||
/// Cache read input tokens (Anthropic prompt caching)
|
||||
#[serde(default)]
|
||||
pub cache_read_input_tokens: Option<u32>,
|
||||
}
|
||||
|
||||
/// LLM driver response content block (subset of canonical zclaw_types::ContentBlock).
|
||||
|
||||
@@ -222,10 +222,13 @@ impl LlmDriver for OpenAiDriver {
|
||||
let parsed_args: serde_json::Value = if args.is_empty() {
|
||||
serde_json::json!({})
|
||||
} else {
|
||||
serde_json::from_str(args).unwrap_or_else(|e| {
|
||||
tracing::warn!("[OpenAI] Failed to parse tool args '{}': {}, using empty object", args, e);
|
||||
serde_json::json!({})
|
||||
})
|
||||
match serde_json::from_str(args) {
|
||||
Ok(v) => v,
|
||||
Err(e) => {
|
||||
tracing::error!("[OpenAI] Failed to parse tool call '{}' args: {}. Raw: {}", name, e, &args[..args.len().min(200)]);
|
||||
serde_json::json!({ "_parse_error": e.to_string(), "_raw_args": args[..args.len().min(500)].to_string() })
|
||||
}
|
||||
}
|
||||
};
|
||||
yield Ok(StreamChunk::ToolUseEnd {
|
||||
id: id.clone(),
|
||||
@@ -237,6 +240,8 @@ impl LlmDriver for OpenAiDriver {
|
||||
input_tokens: 0,
|
||||
output_tokens: 0,
|
||||
stop_reason: "end_turn".to_string(),
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
});
|
||||
continue;
|
||||
}
|
||||
@@ -638,6 +643,8 @@ impl OpenAiDriver {
|
||||
input_tokens,
|
||||
output_tokens,
|
||||
stop_reason,
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
}
|
||||
}
|
||||
|
||||
@@ -761,6 +768,8 @@ impl OpenAiDriver {
|
||||
StopReason::StopSequence => "stop",
|
||||
StopReason::Error => "error",
|
||||
}.to_string(),
|
||||
cache_creation_input_tokens: None,
|
||||
cache_read_input_tokens: None,
|
||||
});
|
||||
})
|
||||
}
|
||||
|
||||
123
crates/zclaw-runtime/src/driver/retry_driver.rs
Normal file
@@ -0,0 +1,123 @@
//! RetryDriver: a retry decorator for LlmDriver.
//! Used only on the local Kernel path; the SaaS Relay already has its own retry logic.

use std::sync::Arc;
use std::time::Duration;
use async_trait::async_trait;
use futures::Stream;
use rand::Rng;
use zclaw_types::{Result, ZclawError};

use super::{LlmDriver, CompletionRequest, CompletionResponse, StreamChunk};
use super::error_classifier::classify_llm_error;

/// Retry configuration.
#[derive(Debug, Clone)]
pub struct RetryConfig {
    pub max_attempts: u32,
    pub base_delay_secs: f64,
    pub max_delay_secs: f64,
    pub jitter_ratio: f64,
}

impl Default for RetryConfig {
    fn default() -> Self {
        Self {
            max_attempts: 3,
            base_delay_secs: 1.0,
            max_delay_secs: 8.0,
            jitter_ratio: 0.5,
        }
    }
}

/// Retry decorator.
pub struct RetryDriver {
    inner: Arc<dyn LlmDriver>,
    config: RetryConfig,
}

impl RetryDriver {
    pub fn new(inner: Arc<dyn LlmDriver>, config: RetryConfig) -> Self {
        Self { inner, config }
    }

    fn jittered_backoff(&self, attempt: u32) -> Duration {
        let base = self.config.base_delay_secs * 2_f64.powi(attempt as i32);
        let capped = base.min(self.config.max_delay_secs);
        let mut rng = rand::thread_rng();
        let jitter = capped * self.config.jitter_ratio * rng.gen::<f64>();
        Duration::from_secs_f64(capped + jitter)
    }
}

#[async_trait]
impl LlmDriver for RetryDriver {
    fn provider(&self) -> &str {
        self.inner.provider()
    }

    async fn complete(&self, request: CompletionRequest) -> Result<CompletionResponse> {
        let mut last_error: Option<ZclawError> = None;

        for attempt in 0..self.config.max_attempts {
            match self.inner.complete(request.clone()).await {
                Ok(response) => return Ok(response),
                Err(e) => {
                    let message = e.to_string();
                    let status = extract_status_from_error(&message);
                    let classified = classify_llm_error(
                        self.inner.provider(),
                        status,
                        &message,
                        message.contains("timeout") || message.contains("Timeout"),
                    );

                    if !classified.retryable {
                        return Err(e);
                    }

                    if classified.should_compress {
                        return Err(ZclawError::LlmError(
                            format!("[CONTEXT_OVERFLOW] {}", message)
                        ));
                    }

                    last_error = Some(e);

                    if attempt + 1 < self.config.max_attempts {
                        let delay = classified.retry_after
                            .unwrap_or_else(|| self.jittered_backoff(attempt));
                        tracing::warn!(
                            "[RetryDriver] Attempt {}/{} failed ({}), retrying in {:.1}s",
                            attempt + 1, self.config.max_attempts, classified.message,
                            delay.as_secs_f64()
                        );
                        tokio::time::sleep(delay).await;
                    }
                }
            }
        }

        // "retries exhausted"
        Err(last_error.unwrap_or_else(|| ZclawError::LlmError("重试耗尽".to_string())))
    }

    fn stream(
        &self,
        request: CompletionRequest,
    ) -> std::pin::Pin<Box<dyn Stream<Item = Result<StreamChunk>> + Send + '_>> {
        // The streaming path is not retried: some deltas have already been sent,
        // so a retry would duplicate output in the UI.
        self.inner.stream(request)
    }

    fn is_configured(&self) -> bool {
        self.inner.is_configured()
    }
}

fn extract_status_from_error(message: &str) -> u16 {
    let re = regex::Regex::new(r"(?:error|status)[:\s]+(\d{3})").ok();
    re.and_then(|re| re.captures(message))
        .and_then(|caps| caps[1].parse().ok())
        .unwrap_or(0)
}
|
||||
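Review note on `RetryDriver::complete`: the `retryable` check runs before the `should_compress` check, and `classify_llm_error` marks a context overflow as `retryable: false` with `should_compress: true`. Read together, the `[CONTEXT_OVERFLOW]` tagging branch looks unreachable, so the recovery path in `AgentLoop` that matches on that tag may never fire. The sketch below models the decision as a pure function with the two checks swapped. `Classified`, `Action`, `decide`, and `backoff_delay` are hypothetical names for illustration, and `unit` stands in for the random draw in `[0, 1)` so the arithmetic is deterministic.

```rust
use std::time::Duration;

/// Hypothetical, trimmed-down stand-in for `ClassifiedLlmError`.
#[derive(Debug, Clone, Copy)]
struct Classified {
    retryable: bool,
    should_compress: bool,
    retry_after: Option<Duration>,
}

#[derive(Debug, PartialEq)]
enum Action {
    /// Return the error tagged so the caller can compact and retry.
    TagContextOverflow,
    /// Give up and return the original error.
    Fail,
    /// Sleep for the given delay, then try again.
    RetryAfter(Duration),
}

/// Exponential backoff with additive jitter, same arithmetic as `jittered_backoff`.
fn backoff_delay(attempt: u32, base_secs: f64, max_secs: f64, jitter_ratio: f64, unit: f64) -> Duration {
    let capped = (base_secs * 2_f64.powi(attempt as i32)).min(max_secs);
    Duration::from_secs_f64(capped + capped * jitter_ratio * unit)
}

/// Checks `should_compress` first, so an overflow is tagged even though it is not retryable.
fn decide(c: Classified, attempt: u32, max_attempts: u32, unit: f64) -> Action {
    if c.should_compress {
        return Action::TagContextOverflow;
    }
    if !c.retryable || attempt + 1 >= max_attempts {
        return Action::Fail;
    }
    // Defaults below copy `RetryConfig::default()` from the diff.
    let delay = c.retry_after.unwrap_or_else(|| backoff_delay(attempt, 1.0, 8.0, 0.5, unit));
    Action::RetryAfter(delay)
}
```

Because the jitter is added after the cap, the effective upper bound with the default config is `max_delay_secs * (1 + jitter_ratio)`, i.e. 12 seconds rather than 8.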
@@ -4,10 +4,11 @@ use std::sync::Arc;
|
||||
use futures::StreamExt;
|
||||
use tokio::sync::mpsc;
|
||||
use zclaw_types::{AgentId, SessionId, Message, Result};
|
||||
use serde_json::Value;
|
||||
|
||||
use crate::driver::{LlmDriver, CompletionRequest, ContentBlock};
|
||||
use crate::stream::StreamChunk;
|
||||
use crate::tool::{ToolRegistry, ToolContext, SkillExecutor, HandExecutor};
|
||||
use crate::tool::{ToolRegistry, ToolContext, SkillExecutor, HandExecutor, ToolConcurrency};
|
||||
use crate::tool::builtin::PathValidator;
|
||||
use crate::growth::GrowthIntegration;
|
||||
use crate::compaction::{self, CompactionConfig};
|
||||
@@ -303,8 +304,28 @@ impl AgentLoop {
|
||||
plan_mode: self.plan_mode,
|
||||
};
|
||||
|
||||
// Call LLM
|
||||
let response = self.driver.complete(request).await?;
|
||||
// Call LLM with context-overflow recovery
|
||||
let response = match self.driver.complete(request).await {
|
||||
Ok(r) => r,
|
||||
Err(e) => {
|
||||
let err_str = e.to_string();
|
||||
if err_str.contains("[CONTEXT_OVERFLOW]") && self.compaction_threshold > 0 {
|
||||
tracing::warn!("[AgentLoop] Context overflow detected, triggering emergency compaction");
|
||||
let pruned = compaction::prune_tool_outputs(&mut messages);
|
||||
if pruned > 0 {
|
||||
tracing::info!("[AgentLoop] Emergency pruning removed {} tool outputs", pruned);
|
||||
}
|
||||
let keep_recent = messages.len().saturating_sub(messages.len() / 3);
|
||||
let (compacted, removed) = compaction::compact_messages(messages, keep_recent.max(4));
|
||||
if removed > 0 {
|
||||
tracing::info!("[AgentLoop] Emergency compaction removed {} messages", removed);
|
||||
messages = compacted;
|
||||
continue; // retry the iteration with compacted messages
|
||||
}
|
||||
}
|
||||
return Err(e);
|
||||
}
|
||||
};
|
||||
total_input_tokens += response.input_tokens;
|
||||
total_output_tokens += response.output_tokens;
|
||||
|
||||
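Review note on the emergency compaction above: `keep_recent` is computed as `len - len / 3`, floored at 4. A small sketch of that arithmetic follows; `emergency_keep_recent` is a hypothetical name used only here. It is an assumption that `compact_messages` removes nothing when `keep_recent` is at least the message count, in which case `removed` would be 0 and the original error would propagate.

```rust
/// Hypothetical helper mirroring the diff's arithmetic:
/// keep roughly the newest two thirds of the messages, but never fewer than 4.
fn emergency_keep_recent(len: usize) -> usize {
    let keep = len.saturating_sub(len / 3);
    keep.max(4)
}
```

For a 30-message history this keeps 20; for histories of 4 or fewer messages the floor means every message is kept.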
@@ -375,21 +396,22 @@ impl AgentLoop {
|
||||
let tool_context = self.create_tool_context(session_id.clone());
|
||||
let mut abort_result: Option<AgentLoopResult> = None;
|
||||
let mut clarification_result: Option<AgentLoopResult> = None;
|
||||
for (id, name, input) in tool_calls {
|
||||
// Check if loop was already aborted
|
||||
if abort_result.is_some() {
|
||||
break;
|
||||
}
|
||||
|
||||
// Phase 1: Pre-process inputs + middleware checks (serial)
|
||||
struct ToolPlan {
|
||||
idx: usize,
|
||||
id: String,
|
||||
name: String,
|
||||
input: Value,
|
||||
}
|
||||
let mut plans: Vec<ToolPlan> = Vec::new();
|
||||
for (idx, (id, name, input)) in tool_calls.into_iter().enumerate() {
|
||||
if abort_result.is_some() { break; }
|
||||
|
||||
// GLM and other models sometimes send tool calls with empty arguments `{}`
|
||||
// Inject the last user message as a fallback query so the tool can infer intent.
|
||||
let input = if input.as_object().map_or(false, |obj| obj.is_empty()) {
|
||||
if let Some(last_user_msg) = messages.iter().rev().find_map(|m| {
|
||||
if let Message::User { content } = m {
|
||||
Some(content.clone())
|
||||
} else {
|
||||
None
|
||||
}
|
||||
if let Message::User { content } = m { Some(content.clone()) } else { None }
|
||||
}) {
|
||||
tracing::info!("[AgentLoop] Tool '{}' received empty input, injecting user message as fallback query", name);
|
||||
serde_json::json!({ "_fallback_query": last_user_msg })
|
||||
@@ -400,101 +422,152 @@ impl AgentLoop {
|
||||
input
|
||||
};
|
||||
|
||||
// Check tool call safety — via middleware chain
|
||||
{
|
||||
let mw_ctx_ref = middleware::MiddlewareContext {
|
||||
let mw_ctx = middleware::MiddlewareContext {
|
||||
agent_id: self.agent_id.clone(),
|
||||
session_id: session_id.clone(),
|
||||
user_input: input.to_string(),
|
||||
system_prompt: enhanced_prompt.clone(),
|
||||
messages: messages.clone(),
|
||||
response_content: Vec::new(),
|
||||
input_tokens: total_input_tokens,
|
||||
output_tokens: total_output_tokens,
|
||||
};
|
||||
match self.middleware_chain.run_before_tool_call(&mw_ctx, &name, &input).await? {
|
||||
middleware::ToolCallDecision::Allow => {
|
||||
plans.push(ToolPlan { idx, id, name, input });
|
||||
}
|
||||
middleware::ToolCallDecision::Block(msg) => {
|
||||
tracing::warn!("[AgentLoop] Tool '{}' blocked by middleware: {}", name, msg);
|
||||
messages.push(Message::tool_result(&id, zclaw_types::ToolId::new(&name), serde_json::json!({ "error": msg }), true));
|
||||
}
|
||||
middleware::ToolCallDecision::ReplaceInput(new_input) => {
|
||||
plans.push(ToolPlan { idx, id, name, input: new_input });
|
||||
}
|
||||
middleware::ToolCallDecision::AbortLoop(reason) => {
|
||||
tracing::warn!("[AgentLoop] Loop aborted by middleware: {}", reason);
|
||||
let msg = format!("{}\n已自动终止", reason);
|
||||
self.memory.append_message(&session_id, &Message::assistant(&msg)).await?;
|
||||
abort_result = Some(AgentLoopResult {
|
||||
response: msg,
|
||||
input_tokens: total_input_tokens,
|
||||
output_tokens: total_output_tokens,
|
||||
iterations,
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Phase 2: Execute tools (parallel for ReadOnly, serial for others)
|
||||
if abort_result.is_none() && !plans.is_empty() {
|
||||
let (parallel_plans, sequential_plans): (Vec<_>, Vec<_>) = plans.iter()
|
||||
.partition(|p| {
|
||||
self.tools.get(&p.name)
|
||||
.map(|t| t.concurrency())
|
||||
.unwrap_or(ToolConcurrency::Exclusive) == ToolConcurrency::ReadOnly
|
||||
});
|
||||
|
||||
let mut results: std::collections::HashMap<usize, (String, String, serde_json::Value)> = std::collections::HashMap::new();
|
||||
|
||||
// Execute parallel (ReadOnly) tools with JoinSet (max 3 concurrent)
|
||||
if !parallel_plans.is_empty() {
|
||||
let semaphore = Arc::new(tokio::sync::Semaphore::new(3));
|
||||
let mut join_set = tokio::task::JoinSet::new();
|
||||
|
||||
for plan in &parallel_plans {
|
||||
let tool = self.tools.get(&plan.name).unwrap();
|
||||
let ctx = tool_context.clone();
|
||||
let input = plan.input.clone();
|
||||
let idx = plan.idx;
|
||||
let id = plan.id.clone();
|
||||
let name = plan.name.clone();
|
||||
let permit = semaphore.clone().acquire_owned().await.unwrap();
|
||||
|
||||
join_set.spawn(async move {
|
||||
let result = tokio::time::timeout(
|
||||
std::time::Duration::from_secs(30),
|
||||
tool.execute(input, &ctx)
|
||||
).await;
|
||||
drop(permit);
|
||||
(idx, id, name, result)
|
||||
});
|
||||
}
|
||||
|
||||
while let Some(res) = join_set.join_next().await {
|
||||
match res {
|
||||
Ok((idx, id, name, Ok(Ok(value)))) => {
|
||||
results.insert(idx, (id, name, value));
|
||||
}
|
||||
Ok((idx, id, name, Ok(Err(e)))) => {
|
||||
results.insert(idx, (id, name, serde_json::json!({ "error": e.to_string() })));
|
||||
}
|
||||
Ok((idx, id, name, Err(_))) => {
|
||||
tracing::warn!("[AgentLoop] Tool '{}' timed out after 30s (parallel)", name);
|
||||
results.insert(idx, (id, name.clone(), serde_json::json!({ "error": format!("工具 '{}' 执行超时(30秒),请重试", name) })));
|
||||
}
|
||||
Err(e) => {
|
||||
tracing::warn!("[AgentLoop] JoinError in parallel tool execution: {}", e);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Execute sequential (Exclusive/Interactive) tools
|
||||
for plan in &sequential_plans {
|
||||
let tool_result = match tokio::time::timeout(
|
||||
std::time::Duration::from_secs(30),
|
||||
self.execute_tool(&plan.name, plan.input.clone(), &tool_context),
|
||||
).await {
|
||||
Ok(Ok(result)) => result,
|
||||
Ok(Err(e)) => serde_json::json!({ "error": e.to_string() }),
|
||||
Err(_) => {
|
||||
tracing::warn!("[AgentLoop] Tool '{}' timed out after 30s", plan.name);
|
||||
serde_json::json!({ "error": format!("工具 '{}' 执行超时(30秒),请重试", plan.name) })
|
||||
}
|
||||
};
|
||||
|
||||
// Check if this is a clarification response
|
||||
if plan.name == "ask_clarification"
|
||||
&& tool_result.get("status").and_then(|v| v.as_str()) == Some("clarification_needed")
|
||||
{
|
||||
tracing::info!("[AgentLoop] Clarification requested, terminating loop");
|
||||
let question = tool_result.get("question")
|
||||
.and_then(|v| v.as_str())
|
||||
.unwrap_or("需要更多信息")
|
||||
.to_string();
|
||||
results.insert(plan.idx, (plan.id.clone(), plan.name.clone(), tool_result));
|
||||
self.memory.append_message(&session_id, &Message::assistant(&question)).await?;
|
||||
clarification_result = Some(AgentLoopResult {
|
||||
response: question,
|
||||
input_tokens: total_input_tokens,
|
||||
output_tokens: total_output_tokens,
|
||||
iterations,
|
||||
});
|
||||
break;
|
||||
}
|
||||
results.insert(plan.idx, (plan.id.clone(), plan.name.clone(), tool_result));
|
||||
}
|
||||
|
||||
// Push results in original tool_call order
|
||||
let mut sorted_indices: Vec<usize> = results.keys().copied().collect();
|
||||
sorted_indices.sort();
|
||||
for idx in sorted_indices {
|
||||
let (id, name, result) = results.remove(&idx).unwrap();
|
||||
// Run after_tool_call middleware (error counting, output guard, etc.)
|
||||
let mut mw_ctx = middleware::MiddlewareContext {
|
||||
agent_id: self.agent_id.clone(),
|
||||
session_id: session_id.clone(),
|
||||
user_input: input.to_string(),
|
||||
user_input: String::new(),
|
||||
system_prompt: enhanced_prompt.clone(),
|
||||
messages: messages.clone(),
|
||||
response_content: Vec::new(),
|
||||
input_tokens: total_input_tokens,
|
||||
output_tokens: total_output_tokens,
|
||||
};
|
||||
match self.middleware_chain.run_before_tool_call(&mw_ctx_ref, &name, &input).await? {
|
||||
middleware::ToolCallDecision::Allow => {}
|
||||
middleware::ToolCallDecision::Block(msg) => {
|
||||
tracing::warn!("[AgentLoop] Tool '{}' blocked by middleware: {}", name, msg);
|
||||
let error_output = serde_json::json!({ "error": msg });
|
||||
messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), error_output, true));
|
||||
continue;
|
||||
}
|
||||
middleware::ToolCallDecision::ReplaceInput(new_input) => {
|
||||
// Execute with replaced input (with timeout)
|
||||
let tool_result = match tokio::time::timeout(
|
||||
std::time::Duration::from_secs(30),
|
||||
self.execute_tool(&name, new_input, &tool_context),
|
||||
).await {
|
||||
Ok(Ok(result)) => result,
|
||||
Ok(Err(e)) => serde_json::json!({ "error": e.to_string() }),
|
||||
Err(_) => {
|
||||
tracing::warn!("[AgentLoop] Tool '{}' (replaced input) timed out after 30s", name);
|
||||
serde_json::json!({ "error": format!("工具 '{}' 执行超时(30秒),请重试", name) })
|
||||
}
|
||||
};
|
||||
messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), tool_result, false));
|
||||
continue;
|
||||
}
|
||||
middleware::ToolCallDecision::AbortLoop(reason) => {
|
||||
tracing::warn!("[AgentLoop] Loop aborted by middleware: {}", reason);
|
||||
let msg = format!("{}\n已自动终止", reason);
|
||||
self.memory.append_message(&session_id, &Message::assistant(&msg)).await?;
|
||||
abort_result = Some(AgentLoopResult {
|
||||
response: msg,
|
||||
input_tokens: total_input_tokens,
|
||||
output_tokens: total_output_tokens,
|
||||
iterations,
|
||||
});
|
||||
}
|
||||
if let Err(e) = self.middleware_chain.run_after_tool_call(&mut mw_ctx, &name, &result).await {
|
||||
tracing::warn!("[AgentLoop] after_tool_call middleware failed for '{}': {}", name, e);
|
||||
}
|
||||
messages.push(Message::tool_result(&id, zclaw_types::ToolId::new(&name), result, false));
|
||||
}
|
||||
|
||||
let tool_result = match tokio::time::timeout(
|
||||
std::time::Duration::from_secs(30),
|
||||
self.execute_tool(&name, input, &tool_context),
|
||||
).await {
|
||||
Ok(Ok(result)) => result,
|
||||
Ok(Err(e)) => serde_json::json!({ "error": e.to_string() }),
|
||||
Err(_) => {
|
||||
tracing::warn!("[AgentLoop] Tool '{}' timed out after 30s", name);
|
||||
serde_json::json!({ "error": format!("工具 '{}' 执行超时(30秒),请重试", name) })
|
||||
}
|
||||
};
|
||||
|
||||
// Check if this is a clarification response — terminate loop immediately
|
||||
// so the LLM waits for user input instead of continuing to generate.
|
||||
if name == "ask_clarification"
|
||||
&& tool_result.get("status").and_then(|v| v.as_str()) == Some("clarification_needed")
|
||||
{
|
||||
tracing::info!("[AgentLoop] Clarification requested, terminating loop");
|
||||
let question = tool_result.get("question")
|
||||
.and_then(|v| v.as_str())
|
||||
.unwrap_or("需要更多信息")
|
||||
.to_string();
|
||||
messages.push(Message::tool_result(
|
||||
id,
|
||||
zclaw_types::ToolId::new(&name),
|
||||
tool_result,
|
||||
false,
|
||||
));
|
||||
self.memory.append_message(&session_id, &Message::assistant(&question)).await?;
|
||||
clarification_result = Some(AgentLoopResult {
|
||||
response: question,
|
||||
input_tokens: total_input_tokens,
|
||||
output_tokens: total_output_tokens,
|
||||
iterations,
|
||||
});
|
||||
break;
|
||||
}
|
||||
|
||||
// Add tool result to messages
|
||||
messages.push(Message::tool_result(
|
||||
id,
|
||||
zclaw_types::ToolId::new(&name),
|
||||
tool_result,
|
||||
false, // is_error - we include errors in the result itself
|
||||
));
|
||||
}
|
||||
|
||||
// Continue the loop - LLM will process tool results and generate final response
|
||||
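Review note on the result ordering above: parallel and sequential results are collected into a map keyed by the original tool-call index, then pushed in sorted index order. The sketch below isolates that step. `in_original_order` is a hypothetical name, and the payload type is generic so the example does not depend on `serde_json`.

```rust
use std::collections::HashMap;

/// Hypothetical helper: drains an index-keyed result map in ascending index order.
/// Indices with no entry are skipped.
fn in_original_order<T>(mut results: HashMap<usize, T>) -> Vec<T> {
    let mut indices: Vec<usize> = results.keys().copied().collect();
    indices.sort_unstable();
    indices
        .into_iter()
        .filter_map(|i| results.remove(&i))
        .collect()
}
```

One thing the sketch makes visible: in the parallel branch a `JoinError` is logged but nothing is stored for that index, so the corresponding `tool_use` would appear to end up without a matching `tool_result`. Whether downstream providers tolerate that is not shown in this diff.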
@@ -647,6 +720,7 @@ impl AgentLoop {
|
||||
|
||||
let mut stream = driver.stream(request);
|
||||
let mut pending_tool_calls: Vec<(String, String, serde_json::Value)> = Vec::new();
|
||||
let mut completed_tool_ids: std::collections::HashSet<String> = std::collections::HashSet::new();
|
||||
let mut iteration_text = String::new();
|
||||
let mut reasoning_text = String::new(); // Track reasoning separately for API requirement
|
||||
|
||||
@@ -703,6 +777,7 @@ impl AgentLoop {
|
||||
// Update with final parsed input and emit ToolStart event
|
||||
if let Some(tool) = pending_tool_calls.iter_mut().find(|(tid, _, _)| tid == id) {
|
||||
tool.2 = input.clone();
|
||||
completed_tool_ids.insert(id.clone());
|
||||
if let Err(e) = tx.send(LoopEvent::ToolStart { name: tool.1.clone(), input: input.clone() }).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send ToolStart event: {}", e);
|
||||
}
|
||||
@@ -810,10 +885,26 @@ impl AgentLoop {
|
||||
break 'outer;
|
||||
}
|
||||
|
||||
// Skip tool processing if stream errored or timed out
|
||||
// Handle stream errors — execute complete tool calls, cancel incomplete ones
|
||||
if stream_errored {
|
||||
tracing::debug!("[AgentLoop] Stream errored, skipping tool processing and breaking");
|
||||
break 'outer;
|
||||
// Cancel incomplete tools (ToolStart sent but ToolUseEnd not received)
|
||||
let incomplete: Vec<_> = pending_tool_calls.iter()
|
||||
.filter(|(id, _, _)| !completed_tool_ids.contains(id))
|
||||
.collect();
|
||||
for (_, name, _) in &incomplete {
|
||||
tracing::warn!("[AgentLoop] Cancelling incomplete tool '{}' due to stream error", name);
|
||||
let error_output = serde_json::json!({ "error": "流式响应中断,工具调用未完成" });
|
||||
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: error_output }).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send cancellation ToolEnd event: {}", e);
|
||||
}
|
||||
}
|
||||
// Retain only complete tools for execution
|
||||
pending_tool_calls.retain(|(id, _, _)| completed_tool_ids.contains(id));
|
||||
if pending_tool_calls.is_empty() {
|
||||
tracing::debug!("[AgentLoop] Stream errored with no complete tool calls, breaking");
|
||||
break 'outer;
|
||||
}
|
||||
tracing::info!("[AgentLoop] Stream errored but executing {} complete tool calls", pending_tool_calls.len());
|
||||
}
|
||||
|
||||
tracing::debug!("[AgentLoop] Processing {} tool calls (reasoning: {} chars)", pending_tool_calls.len(), reasoning_text.len());
|
||||
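Review note on the stream-error handling above: tool calls whose `ToolUseEnd` arrived are kept for execution, while the rest are cancelled with an error `ToolEnd` event. A minimal sketch of that split, with the tool input simplified to a `String`, is shown here. `ToolCall` and `split_on_stream_error` are hypothetical names, not part of the diff.

```rust
use std::collections::HashSet;

/// (id, name, input); the input is a plain String here instead of a JSON value.
type ToolCall = (String, String, String);

/// Hypothetical helper: returns the calls that completed before the stream error,
/// plus the names of the calls that must be cancelled.
fn split_on_stream_error(
    pending: Vec<ToolCall>,
    completed_ids: &HashSet<String>,
) -> (Vec<ToolCall>, Vec<String>) {
    let (complete, incomplete): (Vec<ToolCall>, Vec<ToolCall>) = pending
        .into_iter()
        .partition(|(id, _, _)| completed_ids.contains(id));
    let cancelled = incomplete.into_iter().map(|(_, name, _)| name).collect();
    (complete, cancelled)
}
```

A single `partition` pass yields both halves, whereas the diff first collects the incomplete calls by reference and then calls `retain`; either approach preserves the original order of the surviving calls.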
@@ -830,187 +921,192 @@ impl AgentLoop {
|
||||
messages.push(Message::tool_use(id, zclaw_types::ToolId::new(name), input.clone()));
|
||||
}
|
||||
|
||||
// Execute tools
|
||||
for (id, name, input) in pending_tool_calls {
|
||||
tracing::debug!("[AgentLoop] Executing tool: name={}, input={:?}", name, input);
|
||||
// Execute tools — Phase 1: Pre-process through middleware (serial)
|
||||
struct StreamToolPlan { idx: usize, id: String, name: String, input: Value }
|
||||
let mut plans: Vec<StreamToolPlan> = Vec::new();
|
||||
let mut abort_loop = false;
|
||||
for (idx, (id, name, input)) in pending_tool_calls.into_iter().enumerate() {
|
||||
if abort_loop { break; }
|
||||
let mw_ctx = middleware::MiddlewareContext {
|
||||
agent_id: agent_id.clone(),
|
||||
session_id: session_id_clone.clone(),
|
||||
user_input: input.to_string(),
|
||||
system_prompt: enhanced_prompt.clone(),
|
||||
messages: messages.clone(),
|
||||
response_content: Vec::new(),
|
||||
input_tokens: total_input_tokens,
|
||||
output_tokens: total_output_tokens,
|
||||
};
|
||||
match middleware_chain.run_before_tool_call(&mw_ctx, &name, &input).await {
|
||||
Ok(middleware::ToolCallDecision::Allow) => {
|
||||
plans.push(StreamToolPlan { idx, id, name, input });
|
||||
}
|
||||
Ok(middleware::ToolCallDecision::Block(msg)) => {
|
||||
tracing::warn!("[AgentLoop] Tool '{}' blocked by middleware: {}", name, msg);
|
||||
let error_output = serde_json::json!({ "error": msg });
|
||||
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: error_output.clone() }).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
|
||||
}
|
||||
messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), error_output, true));
|
||||
}
|
||||
Ok(middleware::ToolCallDecision::ReplaceInput(new_input)) => {
|
||||
plans.push(StreamToolPlan { idx, id, name, input: new_input });
|
||||
}
|
||||
Ok(middleware::ToolCallDecision::AbortLoop(reason)) => {
|
||||
tracing::warn!("[AgentLoop] Loop aborted by middleware: {}", reason);
|
||||
if let Err(e) = tx.send(LoopEvent::Error(reason)).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send Error event: {}", e);
|
||||
}
|
||||
abort_loop = true;
|
||||
}
|
||||
Err(e) => {
|
||||
tracing::error!("[AgentLoop] Middleware error for tool '{}': {}", name, e);
|
||||
let error_output = serde_json::json!({ "error": e.to_string() });
|
||||
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: error_output.clone() }).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
|
||||
}
|
||||
messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), error_output, true));
|
||||
}
|
||||
}
|
||||
}
|
||||
if abort_loop { break 'outer; }
|
||||
if plans.is_empty() {
|
||||
tracing::debug!("[AgentLoop] No tools to execute after middleware filtering");
|
||||
break 'outer;
|
||||
}
|
||||
|
||||
// Check tool call safety — via middleware chain
|
||||
// Build shared tool context
|
||||
let pv = path_validator.clone().unwrap_or_else(|| {
|
||||
let home = std::env::var("USERPROFILE")
|
||||
.or_else(|_| std::env::var("HOME"))
|
||||
.unwrap_or_else(|_| ".".to_string());
|
||||
PathValidator::new().with_workspace(std::path::PathBuf::from(&home))
|
||||
});
|
||||
let working_dir = pv.workspace_root().map(|p| p.to_string_lossy().to_string());
|
||||
let tool_context = ToolContext {
|
||||
agent_id: agent_id.clone(),
|
||||
working_directory: working_dir,
|
||||
session_id: Some(session_id_clone.to_string()),
|
||||
skill_executor: skill_executor.clone(),
|
||||
hand_executor: hand_executor.clone(),
|
||||
path_validator: Some(pv),
|
||||
event_sender: Some(tx.clone()),
|
||||
};
|
||||
|
||||
// Phase 2: Execute tools (parallel for ReadOnly, serial for others)
|
||||
let (parallel_plans, sequential_plans): (Vec<_>, Vec<_>) = plans.iter()
|
||||
.partition(|p| {
|
||||
tools.get(&p.name)
|
||||
.map(|t| t.concurrency())
|
||||
.unwrap_or(ToolConcurrency::Exclusive) == ToolConcurrency::ReadOnly
|
||||
});
|
||||
|
||||
let mut results: std::collections::HashMap<usize, (String, String, serde_json::Value, bool)> = std::collections::HashMap::new();
|
||||
|
||||
// Execute parallel (ReadOnly) tools with JoinSet (max 3 concurrent)
|
||||
if !parallel_plans.is_empty() {
|
||||
let sem = Arc::new(tokio::sync::Semaphore::new(3));
|
||||
let mut join_set = tokio::task::JoinSet::new();
|
||||
for plan in &parallel_plans {
|
||||
let tool_ctx = tool_context.clone();
|
||||
let input = plan.input.clone();
|
||||
let idx = plan.idx;
|
||||
let id = plan.id.clone();
|
||||
let name = plan.name.clone();
|
||||
let tools_ref = tools.clone();
|
||||
let permit = sem.clone().acquire_owned().await.unwrap();
|
||||
join_set.spawn(async move {
|
||||
let result = if let Some(tool) = tools_ref.get(&name) {
|
||||
tokio::time::timeout(std::time::Duration::from_secs(30), tool.execute(input, &tool_ctx)).await
|
||||
} else {
|
||||
Ok(Err(zclaw_types::ZclawError::Internal(format!("Unknown tool: {}", name))))
|
||||
};
|
||||
drop(permit);
|
||||
(idx, id, name, result)
|
||||
});
|
||||
}
|
||||
while let Some(res) = join_set.join_next().await {
|
||||
match res {
|
||||
Ok((idx, id, name, Ok(Ok(value)))) => {
|
||||
results.insert(idx, (id, name, value, false));
|
||||
}
|
||||
Ok((idx, id, name, Ok(Err(e)))) => {
|
||||
results.insert(idx, (id, name, serde_json::json!({ "error": e.to_string() }), true));
|
||||
}
|
||||
Ok((idx, id, name, Err(_))) => {
|
||||
tracing::warn!("[AgentLoop] Tool '{}' timed out (parallel, 30s)", name);
|
||||
results.insert(idx, (id, name.clone(), serde_json::json!({ "error": format!("工具 '{}' 执行超时", name) }), true));
|
||||
}
|
||||
Err(e) => {
|
||||
tracing::warn!("[AgentLoop] JoinError in parallel tool execution: {}", e);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Execute sequential (Exclusive/Interactive) tools
|
||||
for plan in &sequential_plans {
|
||||
let (result, is_error) = if let Some(tool) = tools.get(&plan.name) {
|
||||
match tool.execute(plan.input.clone(), &tool_context).await {
|
||||
Ok(output) => (output, false),
|
||||
Err(e) => (serde_json::json!({ "error": e.to_string() }), true),
|
||||
}
|
||||
} else {
|
||||
(serde_json::json!({ "error": format!("Unknown tool: {}", plan.name) }), true)
|
||||
};
|
||||
|
||||
// Check clarification (only from sequential tools — ask_clarification is Interactive)
|
||||
if plan.name == "ask_clarification"
|
||||
&& result.get("status").and_then(|v| v.as_str()) == Some("clarification_needed")
|
||||
{
|
||||
let mw_ctx = middleware::MiddlewareContext {
|
||||
tracing::info!("[AgentLoop] Streaming: Clarification requested, terminating loop");
|
||||
let question = result.get("question").and_then(|v| v.as_str()).unwrap_or("需要更多信息").to_string();
|
||||
messages.push(Message::tool_result(plan.id.clone(), zclaw_types::ToolId::new(&plan.name), result, is_error));
|
||||
if let Err(e) = tx.send(LoopEvent::Delta(question.clone())).await { tracing::warn!("{}", e); }
|
||||
if let Err(e) = tx.send(LoopEvent::Complete(AgentLoopResult { response: question.clone(), input_tokens: total_input_tokens, output_tokens: total_output_tokens, iterations: iteration })).await { tracing::warn!("{}", e); }
|
||||
if let Err(e) = memory.append_message(&session_id_clone, &Message::assistant(&question)).await { tracing::warn!("{}", e); }
|
||||
break 'outer;
|
||||
}
|
||||
results.insert(plan.idx, (plan.id.clone(), plan.name.clone(), result, is_error));
|
||||
}
|
||||
|
||||
// Phase 3: after_tool_call middleware + push results in original order
|
||||
let mut sorted_indices: Vec<usize> = results.keys().copied().collect();
|
||||
sorted_indices.sort();
|
||||
for idx in sorted_indices {
|
||||
let (id, name, result, is_error) = results.remove(&idx).unwrap();
|
||||
|
||||
// Emit ToolEnd event
|
||||
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: result.clone() }).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
|
||||
}
|
||||
|
||||
// Run after_tool_call middleware
|
||||
{
|
||||
let mut mw_ctx = middleware::MiddlewareContext {
|
||||
agent_id: agent_id.clone(),
|
||||
session_id: session_id_clone.clone(),
|
||||
user_input: input.to_string(),
|
||||
user_input: String::new(),
|
||||
system_prompt: enhanced_prompt.clone(),
|
||||
messages: messages.clone(),
|
||||
response_content: Vec::new(),
|
||||
input_tokens: total_input_tokens,
|
||||
output_tokens: total_output_tokens,
|
||||
};
|
||||
match middleware_chain.run_before_tool_call(&mw_ctx, &name, &input).await {
|
||||
Ok(middleware::ToolCallDecision::Allow) => {}
|
||||
Ok(middleware::ToolCallDecision::Block(msg)) => {
|
||||
tracing::warn!("[AgentLoop] Tool '{}' blocked by middleware: {}", name, msg);
|
||||
let error_output = serde_json::json!({ "error": msg });
|
||||
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: error_output.clone() }).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
|
||||
}
|
||||
messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), error_output, true));
|
||||
continue;
|
||||
}
|
||||
Ok(middleware::ToolCallDecision::AbortLoop(reason)) => {
|
||||
tracing::warn!("[AgentLoop] Loop aborted by middleware: {}", reason);
|
||||
if let Err(e) = tx.send(LoopEvent::Error(reason)).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send Error event: {}", e);
|
||||
}
|
||||
break 'outer;
|
||||
}
|
||||
Ok(middleware::ToolCallDecision::ReplaceInput(new_input)) => {
|
||||
// Execute with replaced input (same path_validator logic below)
|
||||
let pv = path_validator.clone().unwrap_or_else(|| {
|
||||
let home = std::env::var("USERPROFILE")
|
||||
.or_else(|_| std::env::var("HOME"))
|
||||
.unwrap_or_else(|_| ".".to_string());
|
||||
PathValidator::new().with_workspace(std::path::PathBuf::from(&home))
|
||||
});
|
||||
let working_dir = pv.workspace_root()
|
||||
.map(|p| p.to_string_lossy().to_string());
|
||||
let tool_context = ToolContext {
|
||||
agent_id: agent_id.clone(),
|
||||
working_directory: working_dir,
|
||||
session_id: Some(session_id_clone.to_string()),
|
||||
skill_executor: skill_executor.clone(),
|
||||
hand_executor: hand_executor.clone(),
|
||||
path_validator: Some(pv),
|
||||
event_sender: Some(tx.clone()),
|
||||
};
|
||||
let (result, is_error) = if let Some(tool) = tools.get(&name) {
|
||||
match tool.execute(new_input, &tool_context).await {
|
||||
Ok(output) => {
|
||||
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: output.clone() }).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
|
||||
}
|
||||
(output, false)
|
||||
}
|
||||
Err(e) => {
|
||||
let error_output = serde_json::json!({ "error": e.to_string() });
|
||||
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: error_output.clone() }).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
|
||||
}
|
||||
(error_output, true)
|
||||
}
|
||||
}
|
||||
} else {
|
||||
let error_output = serde_json::json!({ "error": format!("Unknown tool: {}", name) });
|
||||
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: error_output.clone() }).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
|
||||
}
|
||||
(error_output, true)
|
||||
};
|
||||
messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), result, is_error));
|
||||
continue;
|
||||
}
|
||||
Err(e) => {
|
||||
tracing::error!("[AgentLoop] Middleware error for tool '{}': {}", name, e);
|
||||
let error_output = serde_json::json!({ "error": e.to_string() });
|
||||
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: error_output.clone() }).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
|
||||
}
|
||||
messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), error_output, true));
|
||||
continue;
|
||||
}
|
||||
if let Err(e) = middleware_chain.run_after_tool_call(&mut mw_ctx, &name, &result).await {
|
||||
tracing::warn!("[AgentLoop] after_tool_call middleware failed for '{}': {}", name, e);
|
||||
}
|
||||
}
|
||||
// Use pre-resolved path_validator (already has default fallback from create_tool_context logic)
|
||||
let pv = path_validator.clone().unwrap_or_else(|| {
|
||||
let home = std::env::var("USERPROFILE")
|
||||
.or_else(|_| std::env::var("HOME"))
|
||||
.unwrap_or_else(|_| ".".to_string());
|
||||
PathValidator::new().with_workspace(std::path::PathBuf::from(&home))
|
||||
});
|
||||
let working_dir = pv.workspace_root()
|
||||
.map(|p| p.to_string_lossy().to_string());
|
||||
let tool_context = ToolContext {
|
||||
agent_id: agent_id.clone(),
|
||||
working_directory: working_dir,
|
||||
session_id: Some(session_id_clone.to_string()),
|
||||
skill_executor: skill_executor.clone(),
|
||||
hand_executor: hand_executor.clone(),
|
||||
path_validator: Some(pv),
|
||||
event_sender: Some(tx.clone()),
|
||||
};
|
||||
|
||||
let (result, is_error) = if let Some(tool) = tools.get(&name) {
|
||||
tracing::debug!("[AgentLoop] Tool '{}' found, executing...", name);
|
||||
match tool.execute(input.clone(), &tool_context).await {
|
||||
Ok(output) => {
|
||||
tracing::debug!("[AgentLoop] Tool '{}' executed successfully: {:?}", name, output);
|
||||
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: output.clone() }).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
|
||||
}
|
||||
(output, false)
|
||||
}
|
||||
Err(e) => {
|
||||
tracing::error!("[AgentLoop] Tool '{}' execution failed: {}", name, e);
|
||||
let error_output = serde_json::json!({ "error": e.to_string() });
|
||||
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: error_output.clone() }).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
|
||||
}
|
||||
(error_output, true)
|
||||
}
|
||||
}
|
||||
} else {
|
||||
tracing::error!("[AgentLoop] Tool '{}' not found in registry", name);
|
||||
let error_output = serde_json::json!({ "error": format!("Unknown tool: {}", name) });
|
||||
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: error_output.clone() }).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
|
||||
}
|
||||
(error_output, true)
|
||||
};
|
||||
|
||||
// Check if this is a clarification response — break outer loop
|
||||
if name == "ask_clarification"
|
||||
&& result.get("status").and_then(|v| v.as_str()) == Some("clarification_needed")
|
||||
{
|
||||
tracing::info!("[AgentLoop] Streaming: Clarification requested, terminating loop");
|
||||
let question = result.get("question")
|
||||
.and_then(|v| v.as_str())
|
||||
.unwrap_or("需要更多信息")
|
||||
.to_string();
|
||||
messages.push(Message::tool_result(
|
||||
id,
|
||||
zclaw_types::ToolId::new(&name),
|
||||
result,
|
||||
is_error,
|
||||
));
|
||||
// Send the question as final delta so the user sees it
|
||||
if let Err(e) = tx.send(LoopEvent::Delta(question.clone())).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send Delta event: {}", e);
|
||||
}
|
||||
if let Err(e) = tx.send(LoopEvent::Complete(AgentLoopResult {
|
||||
response: question.clone(),
|
||||
input_tokens: total_input_tokens,
|
||||
output_tokens: total_output_tokens,
|
||||
iterations: iteration,
|
||||
})).await {
|
||||
tracing::warn!("[AgentLoop] Failed to send Complete event: {}", e);
|
||||
}
|
||||
if let Err(e) = memory.append_message(&session_id_clone, &Message::assistant(&question)).await {
|
||||
tracing::warn!("[AgentLoop] Failed to save clarification message: {}", e);
|
||||
}
|
||||
break 'outer;
|
||||
}
|
||||
|
||||
// Add tool result to message history
|
||||
tracing::debug!("[AgentLoop] Adding tool_result to history: id={}, name={}, is_error={}", id, name, is_error);
|
||||
messages.push(Message::tool_result(
|
||||
id,
|
||||
zclaw_types::ToolId::new(&name),
|
||||
result,
|
||||
is_error,
|
||||
));
|
||||
messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), result, is_error));
|
||||
}
|
||||
|
||||
tracing::debug!("[AgentLoop] Continuing to next iteration for LLM to process tool results");
|
||||
// If stream errored, we executed complete tools but cannot continue the LLM loop
|
||||
if stream_errored {
|
||||
tracing::info!("[AgentLoop] Stream was errored — executed salvageable tools, now breaking");
|
||||
break 'outer;
|
||||
}
|
||||
// Continue loop - next iteration will call LLM with tool results
|
||||
}
|
||||
});
|
||||
|
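The hunk above splits tool calls by concurrency level, runs the read-only ones concurrently, and then pushes results back in the order the model requested them. The following is a minimal std-only sketch of just the split and re-ordering steps. `Concurrency`, `Plan`, `split_plans`, and `in_request_order` are hypothetical names for illustration, not the crate's API, and the tokio `JoinSet`/`Semaphore` machinery is deliberately left out.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the diff's `ToolConcurrency`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Concurrency {
    ReadOnly,
    Exclusive,
    Interactive,
}

/// Hypothetical, simplified plan: position in the model's request plus tool name.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Plan {
    pub idx: usize,
    pub name: String,
}

/// Split plans into (parallel, sequential).
/// Tools missing from the registry fall back to `Exclusive`,
/// so an unknown tool is never scheduled concurrently.
pub fn split_plans<'a>(
    plans: &'a [Plan],
    registry: &HashMap<String, Concurrency>,
) -> (Vec<&'a Plan>, Vec<&'a Plan>) {
    plans.iter().partition(|p| {
        registry
            .get(&p.name)
            .copied()
            .unwrap_or(Concurrency::Exclusive)
            == Concurrency::ReadOnly
    })
}

/// Results are collected keyed by `idx` in completion order;
/// this returns them sorted back into request order.
pub fn in_request_order(mut results: HashMap<usize, String>) -> Vec<(usize, String)> {
    let mut keys: Vec<usize> = results.keys().copied().collect();
    keys.sort_unstable();
    keys.into_iter()
        .map(|k| {
            let v = results.remove(&k).expect("key was just read from the map");
            (k, v)
        })
        .collect()
}
```

Keying results by the original index is what lets the parallel and sequential phases fill the same map without caring which finishes first.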
||||
@@ -12,6 +12,13 @@
|
||||
//! | 200-399 | Capability | SkillIndex, Guardrail |
|
||||
//! | 400-599 | Safety | LoopGuard, Guardrail |
|
||||
//! | 600-799 | Telemetry | TokenCalibration, Tracking |
|
||||
//!
|
||||
//! # Wave parallelization
|
||||
//!
|
||||
//! `before_completion` middlewares that only modify `system_prompt` (not `messages`)
|
||||
//! can declare `parallel_safe() == true`. The chain runs consecutive parallel-safe
|
||||
//! middlewares concurrently, merging their prompt contributions. This reduces
|
||||
//! sequential latency for the context-injection phase.
|
||||
|
||||
use std::sync::Arc;
|
||||
use async_trait::async_trait;
|
||||
@@ -50,6 +57,7 @@ pub enum ToolCallDecision {
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
/// Carries the mutable state that middleware may inspect or modify.
|
||||
#[derive(Clone)]
|
||||
pub struct MiddlewareContext {
|
||||
/// The agent that owns this loop.
|
||||
pub agent_id: AgentId,
|
||||
@@ -101,6 +109,15 @@ pub trait AgentMiddleware: Send + Sync {
|
||||
500
|
||||
}
|
||||
|
||||
/// Whether `before_completion` is safe to run concurrently with other
|
||||
/// parallel-safe middlewares. Only return `true` if the middleware:
|
||||
/// - Only modifies `ctx.system_prompt` (never `ctx.messages`)
|
||||
/// - Does not depend on prompt modifications from other middlewares
|
||||
/// - Does not return `MiddlewareDecision::Stop`
|
||||
fn parallel_safe(&self) -> bool {
|
||||
false
|
||||
}
|
||||
|
||||
/// Hook executed **before** the LLM completion request is sent.
|
||||
///
|
||||
/// Use this to inject context (memory, skill index, etc.) or to
|
||||
@@ -163,15 +180,74 @@ impl MiddlewareChain {
|
||||
self.middlewares.insert(pos, mw);
|
||||
}
|
||||
|
||||
/// Run all `before_completion` hooks in order.
|
||||
/// Run all `before_completion` hooks with wave-based parallelization.
|
||||
///
|
||||
/// Consecutive `parallel_safe` middlewares run concurrently — each gets
|
||||
/// its own cloned context and appends to `system_prompt` independently.
|
||||
/// Their contributions are merged after all complete. Non-parallel-safe
|
||||
/// middlewares (and non-consecutive ones) run sequentially as before.
|
||||
pub async fn run_before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision> {
|
||||
for mw in &self.middlewares {
|
||||
match mw.before_completion(ctx).await? {
|
||||
MiddlewareDecision::Continue => {}
|
||||
MiddlewareDecision::Stop(reason) => {
|
||||
tracing::info!("[MiddlewareChain] '{}' requested stop: {}", mw.name(), reason);
|
||||
return Ok(MiddlewareDecision::Stop(reason));
|
||||
let mut idx = 0;
|
||||
while idx < self.middlewares.len() {
|
||||
// Find the extent of consecutive parallel-safe middlewares
|
||||
let wave_start = idx;
|
||||
let mut wave_end = idx;
|
||||
while wave_end < self.middlewares.len()
|
||||
&& self.middlewares[wave_end].parallel_safe()
|
||||
{
|
||||
wave_end += 1;
|
||||
}
|
||||
|
||||
if wave_end - wave_start >= 2 {
|
||||
// Run parallel wave (2+ consecutive parallel-safe middlewares)
|
||||
let base_prompt_len = ctx.system_prompt.len();
|
||||
let wave = &self.middlewares[wave_start..wave_end];
|
||||
|
||||
// Spawn concurrent tasks — each owns its cloned context + Arc ref to middleware
|
||||
let mut join_handles = Vec::with_capacity(wave.len());
|
||||
for mw in wave.iter() {
|
||||
let mut ctx_clone = ctx.clone();
|
||||
let mw_arc = Arc::clone(mw);
|
||||
join_handles.push(tokio::spawn(async move {
|
||||
let result = mw_arc.before_completion(&mut ctx_clone).await;
|
||||
(result, ctx_clone.system_prompt)
|
||||
}));
|
||||
}
|
||||
|
||||
// Await all and merge prompt contributions
|
||||
for (i, handle) in join_handles.into_iter().enumerate() {
|
||||
let (result, modified_prompt): (Result<MiddlewareDecision>, String) = handle.await
|
||||
.map_err(|e| zclaw_types::ZclawError::Internal(format!("Parallel middleware panicked: {}", e)))?;
|
||||
match result? {
|
||||
MiddlewareDecision::Continue => {}
|
||||
MiddlewareDecision::Stop(reason) => {
|
||||
tracing::info!(
|
||||
"[MiddlewareChain] '{}' requested stop: {}",
|
||||
self.middlewares[wave_start + i].name(),
|
||||
reason
|
||||
);
|
||||
return Ok(MiddlewareDecision::Stop(reason));
|
||||
}
|
||||
}
|
||||
// Merge system_prompt contribution from this clone
|
||||
if modified_prompt.len() > base_prompt_len {
|
||||
let contribution = &modified_prompt[base_prompt_len..];
|
||||
ctx.system_prompt.push_str(contribution);
|
||||
}
|
||||
}
|
||||
|
||||
idx = wave_end;
|
||||
} else {
|
||||
// Run single middleware sequentially
|
||||
let mw = &self.middlewares[idx];
|
||||
match mw.before_completion(ctx).await? {
|
||||
MiddlewareDecision::Continue => {}
|
||||
MiddlewareDecision::Stop(reason) => {
|
||||
tracing::info!("[MiddlewareChain] '{}' requested stop: {}", mw.name(), reason);
|
||||
return Ok(MiddlewareDecision::Stop(reason));
|
||||
}
|
||||
}
|
||||
idx += 1;
|
||||
}
|
||||
}
|
||||
Ok(MiddlewareDecision::Continue)
|
||||
|
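The wave logic in `run_before_completion` can be read as two small steps: group consecutive parallel-safe middlewares into waves, then merge what each cloned context appended to `system_prompt`. The sketch below isolates those two steps with plain data. `plan_waves` and `merge_prompts` are hypothetical helper names, and the `starts_with` guard is an addition of this sketch (the diff slices at `base_prompt_len` directly).

```rust
/// Group consecutive `true` flags (parallel-safe middlewares) into waves.
/// Returns half-open ranges `(start, end, parallel)`. A lone parallel-safe
/// middleware is run sequentially, matching the `>= 2` check in the diff.
pub fn plan_waves(parallel_safe: &[bool]) -> Vec<(usize, usize, bool)> {
    let mut waves = Vec::new();
    let mut idx = 0;
    while idx < parallel_safe.len() {
        let mut end = idx;
        while end < parallel_safe.len() && parallel_safe[end] {
            end += 1;
        }
        if end - idx >= 2 {
            waves.push((idx, end, true));
            idx = end;
        } else {
            waves.push((idx, idx + 1, false));
            idx += 1;
        }
    }
    waves
}

/// Merge prompts produced by clones that each started from `base`.
/// Only the suffix past `base.len()` counts as a contribution, so this
/// assumes every clone appended to the prompt and left the prefix intact.
pub fn merge_prompts(base: &str, modified: &[String]) -> String {
    let mut merged = base.to_string();
    for m in modified {
        // Guard (sketch-only): skip clones whose prefix was rewritten.
        if m.len() > base.len() && m.starts_with(base) {
            merged.push_str(&m[base.len()..]);
        }
    }
    merged
}
```

Merging in wave order keeps the final prompt deterministic even though the tasks themselves may finish in any order.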
||||
@@ -290,6 +290,8 @@ impl AgentMiddleware for ButlerRouterMiddleware {
|
||||
80
|
||||
}
|
||||
|
||||
fn parallel_safe(&self) -> bool { true }
|
||||
|
||||
async fn before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision> {
|
||||
// Only route on the first user message in a turn (not tool results)
|
||||
let user_input = &ctx.user_input;
|
||||
|
||||
@@ -1,21 +1,49 @@
|
||||
//! Compaction middleware — wraps the existing compaction module.
|
||||
//!
|
||||
//! Supports debounce (cooldown + min-round checks), async LLM compression
|
||||
//! with cached fallback, and iterative summaries that carry forward key info.
|
||||
|
||||
use async_trait::async_trait;
|
||||
use zclaw_types::Result;
|
||||
use crate::middleware::{AgentMiddleware, MiddlewareContext, MiddlewareDecision};
|
||||
use crate::compaction::{self, CompactionConfig};
|
||||
use crate::growth::GrowthIntegration;
|
||||
use crate::driver::LlmDriver;
|
||||
use std::sync::atomic::{AtomicU64, Ordering};
|
||||
use std::sync::Arc;
|
||||
use tokio::sync::RwLock;
|
||||
use zclaw_types::{Message, Result};
|
||||
use crate::compaction::{self, CompactionConfig};
|
||||
use crate::driver::LlmDriver;
|
||||
use crate::growth::GrowthIntegration;
|
||||
use crate::middleware::{AgentMiddleware, MiddlewareContext, MiddlewareDecision};
|
||||
|
||||
/// Minimum seconds between consecutive compactions.
|
||||
const COMPACTION_COOLDOWN_SECS: u64 = 30;
|
||||
/// Minimum message pairs (user+assistant) since last compaction before triggering again.
|
||||
const COMPACTION_MIN_ROUNDS: u64 = 3;
|
||||
|
||||
fn now_millis() -> u64 {
|
||||
std::time::SystemTime::now()
|
||||
.duration_since(std::time::UNIX_EPOCH)
|
||||
.unwrap_or_default()
|
||||
.as_millis() as u64
|
||||
}
|
||||
|
||||
/// Shared compaction debounce state (lock-free).
|
||||
struct CompactionState {
|
||||
last_compaction_ms: AtomicU64,
|
||||
last_compaction_msg_count: AtomicU64,
|
||||
}
|
||||
|
||||
/// Cached result from a previous async LLM compaction.
|
||||
struct AsyncCompactionCache {
|
||||
last_result: RwLock<Option<Vec<Message>>>,
|
||||
}
|
||||
|
||||
/// Middleware that compresses conversation history when it exceeds a token threshold.
|
||||
pub struct CompactionMiddleware {
|
||||
threshold: usize,
|
||||
config: CompactionConfig,
|
||||
/// Optional LLM driver for async compaction (LLM summarisation, memory flush).
|
||||
driver: Option<Arc<dyn LlmDriver>>,
|
||||
/// Optional growth integration for memory flushing during compaction.
|
||||
growth: Option<GrowthIntegration>,
|
||||
state: Arc<CompactionState>,
|
||||
cache: Arc<AsyncCompactionCache>,
|
||||
}
|
||||
|
||||
impl CompactionMiddleware {
|
||||
@@ -25,7 +53,39 @@ impl CompactionMiddleware {
|
||||
driver: Option<Arc<dyn LlmDriver>>,
|
||||
growth: Option<GrowthIntegration>,
|
||||
) -> Self {
|
||||
Self { threshold, config, driver, growth }
|
||||
Self {
|
||||
threshold,
|
||||
config,
|
||||
driver,
|
||||
growth,
|
||||
state: Arc::new(CompactionState {
|
||||
last_compaction_ms: AtomicU64::new(0),
|
||||
last_compaction_msg_count: AtomicU64::new(0),
|
||||
}),
|
||||
cache: Arc::new(AsyncCompactionCache {
|
||||
last_result: RwLock::new(None),
|
||||
}),
|
||||
}
|
||||
}
|
||||
|
||||
fn should_compact(&self, msg_count: u64) -> bool {
|
||||
let last_ms = self.state.last_compaction_ms.load(Ordering::Relaxed);
|
||||
let last_count = self.state.last_compaction_msg_count.load(Ordering::Relaxed);
|
||||
|
||||
if now_millis().saturating_sub(last_ms) < COMPACTION_COOLDOWN_SECS * 1000 {
|
||||
return false;
|
||||
}
|
||||
|
||||
if msg_count.saturating_sub(last_count) < COMPACTION_MIN_ROUNDS * 2 {
|
||||
return false;
|
||||
}
|
||||
|
||||
true
|
||||
}
|
||||
|
||||
fn record_compaction(&self, msg_count: u64) {
|
||||
self.state.last_compaction_ms.store(now_millis(), Ordering::Relaxed);
|
||||
self.state.last_compaction_msg_count.store(msg_count, Ordering::Relaxed);
|
||||
}
|
||||
}
|
||||
|
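The debounce added to `CompactionMiddleware` combines a time cooldown with a minimum number of new messages. Below is a small std-only sketch of that check, assuming the clock is passed in as a parameter so the logic can be exercised without sleeping. `Debounce` and its method names are hypothetical, and the two constants mirror `COMPACTION_COOLDOWN_SECS` and `COMPACTION_MIN_ROUNDS * 2` from the diff.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// 30 s cooldown, expressed in milliseconds.
const COOLDOWN_MS: u64 = 30_000;
/// 3 rounds of (user + assistant) messages.
const MIN_NEW_MESSAGES: u64 = 6;

/// Lock-free debounce state: when compaction last ran and how many
/// messages were present at that point.
pub struct Debounce {
    last_ms: AtomicU64,
    last_count: AtomicU64,
}

impl Debounce {
    pub fn new() -> Self {
        Self {
            last_ms: AtomicU64::new(0),
            last_count: AtomicU64::new(0),
        }
    }

    /// True only if the cooldown has elapsed AND enough new messages arrived.
    pub fn should_compact(&self, now_ms: u64, msg_count: u64) -> bool {
        let last_ms = self.last_ms.load(Ordering::Relaxed);
        let last_count = self.last_count.load(Ordering::Relaxed);
        now_ms.saturating_sub(last_ms) >= COOLDOWN_MS
            && msg_count.saturating_sub(last_count) >= MIN_NEW_MESSAGES
    }

    /// Remember the time and message count of the compaction that just ran.
    pub fn record(&self, now_ms: u64, msg_count: u64) {
        self.last_ms.store(now_ms, Ordering::Relaxed);
        self.last_count.store(msg_count, Ordering::Relaxed);
    }
}
```

In the diff, `record_compaction` is called after the messages have been compacted, so the recorded count appears to be the post-compaction length; the sketch leaves that choice to the caller.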
||||
@@ -39,6 +99,29 @@ impl AgentMiddleware for CompactionMiddleware {
|
||||
return Ok(MiddlewareDecision::Continue);
|
||||
}
|
||||
|
||||
// Step 1: Prune old tool outputs (cheap, no LLM needed)
|
||||
let pruned = compaction::prune_tool_outputs(&mut ctx.messages);
|
||||
if pruned > 0 {
|
||||
tracing::info!("[CompactionMiddleware] Pruned {} old tool outputs", pruned);
|
||||
}
|
||||
|
||||
// Step 2: Re-estimate tokens after pruning
|
||||
let tokens = compaction::estimate_messages_tokens_calibrated(&ctx.messages);
|
||||
if tokens < self.threshold {
|
||||
return Ok(MiddlewareDecision::Continue);
|
||||
}
|
||||
|
||||
// Step 3: Debounce check
|
||||
if !self.should_compact(ctx.messages.len() as u64) {
|
||||
// Still over threshold but within cooldown — use cached result if available
|
||||
if let Some(cached) = self.cache.last_result.read().await.clone() {
|
||||
tracing::debug!("[CompactionMiddleware] Cooldown active, using cached compaction result");
|
||||
ctx.messages = cached;
|
||||
}
|
||||
return Ok(MiddlewareDecision::Continue);
|
||||
}
|
||||
|
||||
// Step 4: Execute compaction
|
||||
let needs_async = self.config.use_llm || self.config.memory_flush_enabled;
|
||||
if needs_async {
|
||||
let outcome = compaction::maybe_compact_with_config(
|
||||
@@ -56,6 +139,14 @@ impl AgentMiddleware for CompactionMiddleware {
|
||||
ctx.messages = compaction::maybe_compact(ctx.messages.clone(), self.threshold);
|
||||
}
|
||||
|
||||
self.record_compaction(ctx.messages.len() as u64);
|
||||
|
||||
// Cache result for cooldown fallback
|
||||
{
|
||||
let mut cache = self.cache.last_result.write().await;
|
||||
*cache = Some(ctx.messages.clone());
|
||||
}
|
||||
|
||||
Ok(MiddlewareDecision::Continue)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -88,6 +88,8 @@ impl AgentMiddleware for EvolutionMiddleware {
|
||||
78 // before ButlerRouter(80)
|
||||
}
|
||||
|
||||
fn parallel_safe(&self) -> bool { true }
|
||||
|
||||
async fn before_completion(
|
||||
&self,
|
||||
ctx: &mut MiddlewareContext,
|
||||
|
||||
@@ -111,6 +111,7 @@ impl MemoryMiddleware {
|
||||
impl AgentMiddleware for MemoryMiddleware {
|
||||
fn name(&self) -> &str { "memory" }
|
||||
fn priority(&self) -> i32 { 150 }
|
||||
fn parallel_safe(&self) -> bool { true }
|
||||
|
||||
async fn before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision> {
|
||||
tracing::debug!(
|
||||
|
||||
@@ -40,6 +40,7 @@ impl SkillIndexMiddleware {
|
||||
impl AgentMiddleware for SkillIndexMiddleware {
|
||||
fn name(&self) -> &str { "skill_index" }
|
||||
fn priority(&self) -> i32 { 200 }
|
||||
fn parallel_safe(&self) -> bool { true }
|
||||
|
||||
async fn before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision> {
|
||||
if self.entries.is_empty() {
|
||||
|
||||
@@ -41,6 +41,7 @@ impl Default for TitleMiddleware {
|
||||
impl AgentMiddleware for TitleMiddleware {
|
||||
fn name(&self) -> &str { "title" }
|
||||
fn priority(&self) -> i32 { 180 }
|
||||
fn parallel_safe(&self) -> bool { true }
|
||||
|
||||
// All hooks default to Continue — placeholder until LLM driver is wired in.
|
||||
async fn before_completion(&self, _ctx: &mut crate::middleware::MiddlewareContext) -> zclaw_types::Result<MiddlewareDecision> {
|
||||
|
||||
@@ -13,6 +13,7 @@ use serde_json::Value;
|
||||
use zclaw_types::Result;
|
||||
use crate::driver::ContentBlock;
|
||||
use crate::middleware::{AgentMiddleware, MiddlewareContext, ToolCallDecision};
|
||||
use std::collections::HashMap;
|
||||
use std::sync::Mutex;
|
||||
|
||||
/// Middleware that intercepts tool call errors and formats recovery messages.
|
||||
@@ -23,8 +24,8 @@ pub struct ToolErrorMiddleware {
|
||||
max_error_length: usize,
|
||||
/// Maximum consecutive failures before aborting the loop.
|
||||
max_consecutive_failures: u32,
|
||||
/// Tracks consecutive tool failures.
|
||||
consecutive_failures: Mutex<u32>,
|
||||
/// Tracks consecutive tool failures per session.
|
||||
session_failures: Mutex<HashMap<String, u32>>,
|
||||
}
|
||||
|
||||
impl ToolErrorMiddleware {
|
||||
@@ -32,7 +33,7 @@ impl ToolErrorMiddleware {
|
||||
Self {
|
||||
max_error_length: 500,
|
||||
max_consecutive_failures: 3,
|
||||
consecutive_failures: Mutex::new(0),
|
||||
session_failures: Mutex::new(HashMap::new()),
|
||||
}
|
||||
}
|
||||
|
||||
@@ -66,7 +67,7 @@ impl AgentMiddleware for ToolErrorMiddleware {
|
||||
|
||||
async fn before_tool_call(
|
||||
&self,
|
||||
_ctx: &MiddlewareContext,
|
||||
ctx: &MiddlewareContext,
|
||||
tool_name: &str,
|
||||
tool_input: &Value,
|
||||
) -> Result<ToolCallDecision> {
|
||||
@@ -79,15 +80,17 @@ impl AgentMiddleware for ToolErrorMiddleware {
|
||||
return Ok(ToolCallDecision::ReplaceInput(serde_json::json!({})));
|
||||
}
|
||||
|
||||
// Check consecutive failure count — abort if too many failures
|
||||
let failures = self.consecutive_failures.lock().unwrap_or_else(|e| e.into_inner());
|
||||
if *failures >= self.max_consecutive_failures {
|
||||
// Check consecutive failure count — abort if too many failures (per session)
|
||||
let failures = self.session_failures.lock()
|
||||
.map(|m| m.get(&ctx.session_id.to_string()).copied().unwrap_or(0))
|
||||
.unwrap_or(0);
|
||||
if failures >= self.max_consecutive_failures {
|
||||
tracing::warn!(
|
||||
"[ToolErrorMiddleware] Aborting loop: {} consecutive tool failures",
|
||||
*failures
|
||||
failures
|
||||
);
|
||||
return Ok(ToolCallDecision::AbortLoop(
|
||||
format!("连续 {} 次工具调用失败,已自动终止以避免无限重试", *failures)
|
||||
format!("连续 {} 次工具调用失败,已自动终止以避免无限重试", failures)
|
||||
));
|
||||
}
|
||||
|
||||
@@ -100,11 +103,16 @@ impl AgentMiddleware for ToolErrorMiddleware {
|
||||
tool_name: &str,
|
||||
result: &Value,
|
||||
) -> Result<()> {
|
||||
let mut failures = self.consecutive_failures.lock().unwrap_or_else(|e| e.into_inner());
|
||||
|
||||
// Check if the tool result indicates an error.
|
||||
if let Some(error) = result.get("error") {
|
||||
*failures += 1;
|
||||
let session_key = ctx.session_id.to_string();
|
||||
let failures = self.session_failures.lock()
|
||||
.map(|mut m| {
|
||||
let count = m.entry(session_key.clone()).or_insert(0);
|
||||
*count += 1;
|
||||
*count
|
||||
})
|
||||
.unwrap_or(1);
|
||||
let error_msg = match error {
|
||||
Value::String(s) => s.clone(),
|
||||
other => other.to_string(),
|
||||
@@ -118,7 +126,7 @@ impl AgentMiddleware for ToolErrorMiddleware {
|
||||
|
||||
tracing::warn!(
|
||||
"[ToolErrorMiddleware] Tool '{}' failed ({}/{} consecutive): {}",
|
||||
tool_name, *failures, self.max_consecutive_failures, truncated
|
||||
tool_name, failures, self.max_consecutive_failures, truncated
|
||||
);
|
||||
|
||||
let guided_message = self.format_tool_error(tool_name, &truncated);
|
||||
@@ -126,8 +134,11 @@ impl AgentMiddleware for ToolErrorMiddleware {
|
||||
text: guided_message,
|
||||
});
|
||||
} else {
|
||||
// Success — reset consecutive failure counter
|
||||
*failures = 0;
|
||||
// Success — reset consecutive failure counter for this session
|
||||
let session_key = ctx.session_id.to_string();
|
||||
if let Ok(mut m) = self.session_failures.lock() {
|
||||
m.insert(session_key, 0);
|
||||
}
|
||||
}
|
||||
|
||||
Ok(())
|
||||
|
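The change above moves the consecutive-failure counter from one global value to one value per session, so a failing session no longer aborts unrelated ones. The following is a minimal sketch of that idea using only the standard library. `FailureTracker` and its methods are hypothetical names, and two details are choices of this sketch rather than the diff's behavior: a poisoned lock is recovered with `into_inner`, and a success removes the entry instead of storing `0`.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Tracks consecutive tool failures separately for each session key.
pub struct FailureTracker {
    max_consecutive: u32,
    per_session: Mutex<HashMap<String, u32>>,
}

impl FailureTracker {
    pub fn new(max_consecutive: u32) -> Self {
        Self {
            max_consecutive,
            per_session: Mutex::new(HashMap::new()),
        }
    }

    /// Increment the session's counter and return the new value.
    pub fn record_failure(&self, session: &str) -> u32 {
        let mut map = self.per_session.lock().unwrap_or_else(|e| e.into_inner());
        let count = map.entry(session.to_string()).or_insert(0);
        *count += 1;
        *count
    }

    /// A success ends the streak; dropping the entry also keeps the map small.
    pub fn record_success(&self, session: &str) {
        let mut map = self.per_session.lock().unwrap_or_else(|e| e.into_inner());
        map.remove(session);
    }

    /// True once the session has reached the failure limit.
    pub fn should_abort(&self, session: &str) -> bool {
        let map = self.per_session.lock().unwrap_or_else(|e| e.into_inner());
        map.get(session).copied().unwrap_or(0) >= self.max_consecutive
    }
}
```

With a limit of 3, three straight failures in session `a` trip `should_abort("a")` while session `b` with one failure is unaffected.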
||||
@@ -21,35 +21,27 @@ use crate::middleware::{AgentMiddleware, MiddlewareContext, ToolCallDecision};
|
||||
/// Maximum safe output length in characters.
|
||||
const MAX_OUTPUT_LENGTH: usize = 50_000;
|
||||
|
||||
/// Patterns that indicate sensitive information in tool output.
|
||||
const SENSITIVE_PATTERNS: &[&str] = &[
|
||||
"api_key",
|
||||
"apikey",
|
||||
"api-key",
|
||||
"secret_key",
|
||||
"secretkey",
|
||||
"access_token",
|
||||
"auth_token",
|
||||
"password",
|
||||
"private_key",
|
||||
"-----BEGIN RSA",
|
||||
"-----BEGIN PRIVATE",
|
||||
"sk-", // OpenAI API keys
|
||||
"sk_live_", // Stripe keys
|
||||
"AKIA", // AWS access keys
|
||||
/// Regex patterns that match actual secret values (not just keywords).
|
||||
/// These detect the *value format* of secrets, avoiding false positives
|
||||
/// from legitimate content that merely mentions "password" or "api_key".
|
||||
const SECRET_VALUE_PATTERNS: &[&str] = &[
|
||||
r#"sk-[a-zA-Z0-9]{20,}"#, // OpenAI API keys (sk-xxx, 20+ chars)
|
||||
r#"sk_live_[a-zA-Z0-9]{20,}"#, // Stripe live keys
|
||||
r#"sk_test_[a-zA-Z0-9]{20,}"#, // Stripe test keys
|
||||
r#"AKIA[A-Z0-9]{16}"#, // AWS access keys (exact 20 chars)
|
||||
r#"-----BEGIN (RSA |EC )?PRIVATE KEY-----"#, // PEM private keys
|
||||
r#"(?:api_?key|secret_?key|access_?token|auth_?token|password)\s*[:=]\s*["'][^"']{8,}["']"#, // key=value with actual secret
|
||||
];
|
||||
|
||||
/// Patterns that may indicate prompt injection in tool output.
|
||||
/// Keyword patterns that indicate prompt injection in tool output.
|
||||
/// These are specific enough to avoid false positives from normal content.
|
||||
const INJECTION_PATTERNS: &[&str] = &[
|
||||
"ignore previous instructions",
|
||||
"ignore all previous",
|
||||
"disregard your instructions",
|
||||
"you are now",
|
||||
"new instructions:",
|
||||
"system:",
|
||||
"[INST]",
|
||||
"</scratchpad>",
|
||||
"think step by step about",
|
||||
];
|
||||
|
||||
/// Tool output sanitization middleware.
|
||||
@@ -105,22 +97,24 @@ impl AgentMiddleware for ToolOutputGuardMiddleware {
|
||||
);
|
||||
}
|
||||
|
||||
// Rule 2: Sensitive information detection — block output containing secrets (P2-22)
|
||||
let output_lower = output_str.to_lowercase();
|
||||
for pattern in SENSITIVE_PATTERNS {
|
||||
if output_lower.contains(pattern) {
|
||||
tracing::error!(
|
||||
"[ToolOutputGuard] BLOCKED tool '{}' output: sensitive pattern '{}'",
|
||||
tool_name, pattern
|
||||
);
|
||||
return Err(zclaw_types::ZclawError::Internal(format!(
|
||||
"[ToolOutputGuard] Tool '{}' output blocked: sensitive information detected ('{}')",
|
||||
tool_name, pattern
|
||||
)));
|
||||
// Rule 2: Sensitive information detection — match actual secret values, not keywords
|
||||
for pattern in SECRET_VALUE_PATTERNS {
|
||||
if let Ok(re) = regex::Regex::new(pattern) {
|
||||
if re.is_match(&output_str) {
|
||||
tracing::error!(
|
||||
"[ToolOutputGuard] BLOCKED tool '{}' output: secret value matched pattern '{}'",
|
||||
tool_name, pattern
|
||||
);
|
||||
return Err(zclaw_types::ZclawError::Internal(format!(
|
||||
"[ToolOutputGuard] Tool '{}' output blocked: sensitive information detected",
|
||||
tool_name
|
||||
)));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Rule 3: Injection marker detection — BLOCK the output (P2-22 fix)
|
||||
// Rule 3: Injection marker detection — specific phrase matching
|
||||
let output_lower = output_str.to_lowercase();
|
||||
for pattern in INJECTION_PATTERNS {
|
||||
if output_lower.contains(pattern) {
|
||||
tracing::error!(
|
||||
|
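The switch from `SENSITIVE_PATTERNS` to `SECRET_VALUE_PATTERNS` replaces keyword matching with matching on the shape of the secret itself. To illustrate the difference without pulling in the `regex` crate, here is a hand-rolled approximation of a pattern like `sk-[a-zA-Z0-9]{20,}`. `has_prefixed_token` and `mentions_keyword` are hypothetical helpers written for this comparison only; the diff itself uses `regex::Regex`.

```rust
/// Returns true if `text` contains `prefix` immediately followed by at
/// least `min_len` ASCII alphanumeric characters.
/// Rough std-only equivalent of the regex `<prefix>[a-zA-Z0-9]{min_len,}`.
pub fn has_prefixed_token(text: &str, prefix: &str, min_len: usize) -> bool {
    text.match_indices(prefix).any(|(start, _)| {
        text[start + prefix.len()..]
            .chars()
            .take_while(|c| c.is_ascii_alphanumeric())
            .count()
            >= min_len
    })
}

/// Keyword-style check in the spirit of the removed list, for comparison:
/// it fires on any mention of the word, whether or not a secret is present.
pub fn mentions_keyword(text: &str, keyword: &str) -> bool {
    text.to_lowercase().contains(keyword)
}
```

A help text such as "Set your password in the settings page" trips the keyword check, while the value-format check only fires when something that looks like a real key is present.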
||||
@@ -24,6 +24,10 @@ pub enum StreamChunk {
|
||||
input_tokens: u32,
|
||||
output_tokens: u32,
|
||||
stop_reason: String,
|
||||
#[serde(default)]
|
||||
cache_creation_input_tokens: Option<u32>,
|
||||
#[serde(default)]
|
||||
cache_read_input_tokens: Option<u32>,
|
||||
},
|
||||
/// Error occurred
|
||||
Error { message: String },
|
||||
|
||||
@@ -55,6 +55,8 @@ impl MockLlmDriver {
             input_tokens: 10,
             output_tokens: text.len() as u32 / 4,
             stop_reason: StopReason::EndTurn,
+            cache_creation_input_tokens: None,
+            cache_read_input_tokens: None,
         });
         self
     }
@@ -74,6 +76,8 @@ impl MockLlmDriver {
             input_tokens: 10,
             output_tokens: 20,
             stop_reason: StopReason::ToolUse,
+            cache_creation_input_tokens: None,
+            cache_read_input_tokens: None,
         });
         self
     }
@@ -86,6 +90,8 @@ impl MockLlmDriver {
             input_tokens: 0,
             output_tokens: 0,
             stop_reason: StopReason::Error,
+            cache_creation_input_tokens: None,
+            cache_read_input_tokens: None,
         });
         self
     }
@@ -142,6 +148,8 @@ impl MockLlmDriver {
             input_tokens: 0,
             output_tokens: 0,
             stop_reason: StopReason::EndTurn,
+            cache_creation_input_tokens: None,
+            cache_read_input_tokens: None,
         })
     }
 }
@@ -190,6 +198,8 @@ impl LlmDriver for MockLlmDriver {
                     input_tokens: 10,
                     output_tokens: 2,
                     stop_reason: "end_turn".to_string(),
+                    cache_creation_input_tokens: None,
+                    cache_read_input_tokens: None,
                 },
             ]
         })
@@ -11,6 +11,17 @@ use crate::driver::ToolDefinition;
 use crate::loop_runner::LoopEvent;
 use crate::tool::builtin::PathValidator;

+/// Tool concurrency safety level
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum ToolConcurrency {
+    /// Read-only operations, always safe to parallelize (file_read, web_fetch, etc.)
+    ReadOnly,
+    /// Exclusive operations, must be serial (file_write, shell_exec, etc.)
+    Exclusive,
+    /// Interactive operations, never parallelize (ask_clarification, etc.)
+    Interactive,
+}
+
 /// Tool trait for implementing agent tools
 #[async_trait]
 pub trait Tool: Send + Sync {
@@ -25,6 +36,11 @@ pub trait Tool: Send + Sync {

     /// Execute the tool
     async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value>;
+
+    /// Tool concurrency safety level. Default: ReadOnly.
+    fn concurrency(&self) -> ToolConcurrency {
+        ToolConcurrency::ReadOnly
+    }
 }

 /// Skill executor trait for runtime skill execution
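The diff only introduces the `ToolConcurrency` classification; the code that consumes it is not shown here. As a sketch of how a loop runner might use it, the planner below groups consecutive `ReadOnly` calls into one batch that could run in parallel and gives every `Exclusive` or `Interactive` call a batch of its own. `plan_batches` and its batching policy are assumptions made for illustration, not the project's scheduler.

```rust
/// Mirrors the `ToolConcurrency` enum added in the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolConcurrency {
    ReadOnly,
    Exclusive,
    Interactive,
}

/// Hypothetical planner: returns batches of call indices in execution order.
/// Consecutive `ReadOnly` calls share a batch (safe to run concurrently);
/// any other call is isolated in a single-element batch so it runs serially.
pub fn plan_batches(levels: &[ToolConcurrency]) -> Vec<Vec<usize>> {
    let mut batches: Vec<Vec<usize>> = Vec::new();
    let mut prev_read_only = false;
    for (i, level) in levels.iter().enumerate() {
        let read_only = *level == ToolConcurrency::ReadOnly;
        if read_only && prev_read_only {
            // Extend the current parallel batch.
            batches
                .last_mut()
                .expect("a batch exists whenever prev_read_only is true")
                .push(i);
        } else {
            // Start a new batch (parallel-capable or serial).
            batches.push(vec![i]);
        }
        prev_read_only = read_only;
    }
    batches
}
```

Because the trait's default is `ReadOnly`, any tool that does not override `concurrency()` would be treated as parallel-safe under a policy like this, which is why the side-effecting tools later in the diff override it explicitly.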
@@ -9,7 +9,7 @@ use async_trait::async_trait;
 use serde_json::{json, Value};
 use zclaw_types::{Result, ZclawError};

-use crate::tool::{Tool, ToolContext};
+use crate::tool::{Tool, ToolContext, ToolConcurrency};

 /// Clarification type — categorizes the reason for asking.
 #[derive(Debug, Clone, PartialEq)]
@@ -96,6 +96,10 @@ impl Tool for AskClarificationTool {
         })
     }

+    fn concurrency(&self) -> ToolConcurrency {
+        ToolConcurrency::Interactive
+    }
+
     async fn execute(&self, input: Value, _context: &ToolContext) -> Result<Value> {
         let question = input["question"].as_str()
             .ok_or_else(|| ZclawError::InvalidInput("Missing 'question' parameter".into()))?;
@@ -4,7 +4,7 @@ use async_trait::async_trait;
 use serde_json::{json, Value};
 use zclaw_types::{Result, ZclawError};

-use crate::tool::{Tool, ToolContext};
+use crate::tool::{Tool, ToolContext, ToolConcurrency};

 pub struct ExecuteSkillTool;

@@ -42,6 +42,10 @@ impl Tool for ExecuteSkillTool {
         })
     }

+    fn concurrency(&self) -> ToolConcurrency {
+        ToolConcurrency::Exclusive
+    }
+
     async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value> {
         let skill_id = input["skill_id"].as_str()
             .ok_or_else(|| ZclawError::InvalidInput("Missing 'skill_id' parameter".into()))?;
@@ -6,7 +6,7 @@ use zclaw_types::{Result, ZclawError};
 use std::fs;
 use std::io::Write;

-use crate::tool::{Tool, ToolContext};
+use crate::tool::{Tool, ToolContext, ToolConcurrency};
 use super::path_validator::PathValidator;

 pub struct FileWriteTool;
@@ -55,6 +55,10 @@ impl Tool for FileWriteTool {
         })
     }

+    fn concurrency(&self) -> ToolConcurrency {
+        ToolConcurrency::Exclusive
+    }
+
     async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value> {
         let path = input["path"].as_str()
             .ok_or_else(|| ZclawError::InvalidInput("Missing 'path' parameter".into()))?;
@@ -8,7 +8,7 @@ use serde_json::Value;
 use std::sync::Arc;
 use zclaw_types::Result;

-use crate::tool::{Tool, ToolContext};
+use crate::tool::{Tool, ToolContext, ToolConcurrency};

 /// Wraps an MCP tool adapter into the `Tool` trait.
 ///
@@ -42,6 +42,10 @@ impl Tool for McpToolWrapper {
         self.adapter.input_schema().clone()
     }

+    fn concurrency(&self) -> ToolConcurrency {
+        ToolConcurrency::Exclusive
+    }
+
     async fn execute(&self, input: Value, _context: &ToolContext) -> Result<Value> {
         self.adapter.execute(input).await
     }
@@ -97,6 +97,17 @@ fn default_blocked_paths() -> Vec<PathBuf> {
     ]
 }

+/// Normalize Windows UNC path prefix for consistent comparison.
+/// `\\?\C:\Users\...` → `C:\Users\...`
+fn normalize_windows_path(path: &Path) -> std::borrow::Cow<'_, Path> {
+    let s = path.to_string_lossy();
+    if s.starts_with(r"\\?\") {
+        std::borrow::Cow::Owned(PathBuf::from(&s[4..]))
+    } else {
+        std::borrow::Cow::Borrowed(path)
+    }
+}
+
 /// Expand tilde in path to home directory
 fn expand_tilde(path: &str) -> PathBuf {
     if path.starts_with('~') {
@@ -154,9 +165,16 @@ impl PathValidator {
         }
     }

-    /// Set the workspace root directory
+    /// Set the workspace root directory.
+    /// Canonicalizes the path to ensure consistent comparison on Windows
+    /// (where canonicalize() returns `\\?\C:\...` UNC paths).
     pub fn with_workspace(mut self, workspace: PathBuf) -> Self {
-        self.workspace_root = Some(workspace);
+        let canonical = if workspace.exists() {
+            workspace.canonicalize().unwrap_or(workspace)
+        } else {
+            workspace
+        };
+        self.workspace_root = Some(canonical);
         self
     }

@@ -230,7 +248,14 @@ impl PathValidator {
     fn resolve_and_validate(&self, path: &str) -> Result<PathBuf> {
         // Expand tilde
         let expanded = expand_tilde(path);
-        let path_buf = PathBuf::from(&expanded);
+        let mut path_buf = PathBuf::from(&expanded);
+
+        // If relative path and workspace is configured, resolve against workspace
+        if path_buf.is_relative() {
+            if let Some(ref workspace) = self.workspace_root {
+                path_buf = workspace.join(&path_buf);
+            }
+        }

         // Check for path traversal
         self.check_path_traversal(&path_buf)?;
@@ -280,10 +305,14 @@ impl PathValidator {
         Ok(())
     }

-    /// Check if path is in blocked list
+    /// Check if path is in blocked list.
+    /// Normalizes Windows UNC prefix (`\\?\`) for consistent comparison.
     fn check_blocked(&self, path: &Path) -> Result<()> {
+        // Strip Windows UNC prefix for consistent matching
+        let normalized = normalize_windows_path(path);
         for blocked in &self.config.blocked_paths {
-            if path.starts_with(blocked) || path == blocked {
+            let blocked_norm = normalize_windows_path(blocked);
+            if normalized.starts_with(&*blocked_norm) || normalized == blocked_norm {
                 return Err(ZclawError::InvalidInput(format!(
                     "Access to this path is blocked: {}",
                     path.display()
@@ -303,11 +332,15 @@ impl PathValidator {
     /// - This prevents accidental exposure of the entire filesystem
     ///   when the validator is misconfigured or used without setup
     fn check_allowed(&self, path: &Path) -> Result<()> {
+        let path_norm = normalize_windows_path(path);
+
         // If no allowed paths specified, check workspace
         if self.config.allowed_paths.is_empty() {
             if let Some(ref workspace) = self.workspace_root {
                 // Workspace is configured - validate path is within it
-                if !path.starts_with(workspace) {
+                // Both sides are canonicalized (workspace via with_workspace, path via resolve_and_validate)
+                let ws_norm = normalize_windows_path(workspace);
+                if !path_norm.starts_with(&*ws_norm) {
                     return Err(ZclawError::InvalidInput(format!(
                         "Path outside workspace: {} (workspace: {})",
                         path.display(),
@@ -329,7 +362,8 @@ impl PathValidator {

         // Check against allowed paths
         for allowed in &self.config.allowed_paths {
-            if path.starts_with(allowed) {
+            let allowed_norm = normalize_windows_path(allowed);
+            if path_norm.starts_with(&*allowed_norm) {
                 return Ok(());
             }
         }
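The path-validator hunks above all follow one idea: strip the `\\?\` prefix from both sides before a prefix comparison, so that a canonicalized path and a plain configured path still match. The sketch below shows that idea on plain strings so it behaves identically on every platform; `strip_verbatim_prefix` and `is_within` are hypothetical helpers, not functions from the diff, and the sketch deliberately ignores the `\\?\UNC\server\share` form.

```rust
/// Strips the Windows `\\?\` prefix from a path string, if present.
/// String-based so the behaviour does not depend on the host platform.
pub fn strip_verbatim_prefix(path: &str) -> &str {
    path.strip_prefix(r"\\?\").unwrap_or(path)
}

/// Component-wise "is `path` inside `root`" check on normalized strings.
/// Assumes backslash separators and no `\\?\UNC\...` paths.
pub fn is_within(path: &str, root: &str) -> bool {
    let path = strip_verbatim_prefix(path);
    let root = strip_verbatim_prefix(root).trim_end_matches('\\');
    match path.strip_prefix(root) {
        // Either an exact match, or the remainder starts a new path component.
        Some(rest) => rest.is_empty() || rest.starts_with('\\'),
        None => false,
    }
}
```

The component boundary check matters: a raw string prefix test would accept `C:\work\project2` as being inside `C:\work\proj`. `Path::starts_with`, which the diff uses, already compares whole components, so the diff's version gets this for free.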
@@ -8,7 +8,7 @@ use std::process::{Command, Stdio};
 use std::time::{Duration, Instant};
 use zclaw_types::{Result, ZclawError};

-use crate::tool::{Tool, ToolContext};
+use crate::tool::{Tool, ToolContext, ToolConcurrency};

 /// Parse a command string into program and arguments using proper shell quoting
 fn parse_command(command: &str) -> Result<(String, Vec<String>)> {
@@ -175,6 +175,10 @@ impl Tool for ShellExecTool {
         })
     }

+    fn concurrency(&self) -> ToolConcurrency {
+        ToolConcurrency::Exclusive
+    }
+
     async fn execute(&self, input: Value, _context: &ToolContext) -> Result<Value> {
         let command = input["command"].as_str()
             .ok_or_else(|| ZclawError::InvalidInput("Missing 'command' parameter".into()))?;
@@ -11,7 +11,7 @@ use zclaw_memory::MemoryStore;

 use crate::driver::LlmDriver;
 use crate::loop_runner::{AgentLoop, LoopEvent};
-use crate::tool::{Tool, ToolContext, ToolRegistry};
+use crate::tool::{Tool, ToolContext, ToolRegistry, ToolConcurrency};
 use crate::tool::builtin::register_builtin_tools;
 use std::sync::Arc;

@@ -91,6 +91,10 @@ impl Tool for TaskTool {
         })
     }

+    fn concurrency(&self) -> ToolConcurrency {
+        ToolConcurrency::Exclusive
+    }
+
     async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value> {
         let description = input["description"].as_str()
             .ok_or_else(|| ZclawError::InvalidInput("Missing 'description' parameter".into()))?;
@@ -7,7 +7,7 @@ use async_trait::async_trait;
 use serde_json::{json, Value};
 use zclaw_types::Result;

-use crate::tool::{Tool, ToolContext};
+use crate::tool::{Tool, ToolContext, ToolConcurrency};

 /// Wrapper that exposes a Hand as a Tool in the agent's tool registry.
 ///
@@ -78,6 +78,10 @@ impl Tool for HandTool {
         self.input_schema.clone()
     }

+    fn concurrency(&self) -> ToolConcurrency {
+        ToolConcurrency::Exclusive
+    }
+
     async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value> {
         // Delegate to the HandExecutor (bridged from HandRegistry via kernel).
         // If no hand_executor is available (e.g., standalone runtime without kernel),
@@ -223,6 +223,33 @@ impl Serialize for ZclawError {
 /// Result type alias for ZCLAW operations
 pub type Result<T> = std::result::Result<T, ZclawError>;

+/// Fine-grained classification of LLM call errors, used to guide retry and recovery strategies
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(rename_all = "snake_case")]
+pub enum LlmErrorKind {
+    Auth,
+    AuthPermanent,
+    BillingExhausted,
+    RateLimited,
+    Overloaded,
+    ServerError,
+    Timeout,
+    ContextOverflow,
+    ModelNotFound,
+    Unknown,
+}
+
+/// A classified LLM error, with recovery hints attached
+#[derive(Debug, Clone)]
+pub struct ClassifiedLlmError {
+    pub kind: LlmErrorKind,
+    pub retryable: bool,
+    pub should_compress: bool,
+    pub should_rotate_credential: bool,
+    pub retry_after: Option<std::time::Duration>,
+    pub message: String,
+}
+
 #[cfg(test)]
 mod tests {
     use super::*;
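The hunk above adds only the data types; the classifier that fills them in is not part of this diff. To make the intent of the recovery flags concrete, here is a minimal sketch that derives a `ClassifiedLlmError` from an HTTP status and message. The `classify` function, the status-code mapping, and the decision to treat `ContextOverflow` as retryable after compression are all assumptions for illustration; the serde derives are omitted to keep the block dependency-free.

```rust
use std::time::Duration;

/// Mirrors `LlmErrorKind` from the diff (serde derives omitted here).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LlmErrorKind {
    Auth,
    AuthPermanent,
    BillingExhausted,
    RateLimited,
    Overloaded,
    ServerError,
    Timeout,
    ContextOverflow,
    ModelNotFound,
    Unknown,
}

/// Mirrors `ClassifiedLlmError` from the diff.
#[derive(Debug, Clone)]
pub struct ClassifiedLlmError {
    pub kind: LlmErrorKind,
    pub retryable: bool,
    pub should_compress: bool,
    pub should_rotate_credential: bool,
    pub retry_after: Option<Duration>,
    pub message: String,
}

/// Hypothetical classifier: the mapping below is an illustration only.
pub fn classify(status: u16, message: &str, retry_after_secs: Option<u64>) -> ClassifiedLlmError {
    use LlmErrorKind::*;
    let lower = message.to_lowercase();
    let kind = match status {
        // Assumed heuristic: oversized requests mention "context" in the message.
        400 | 413 if lower.contains("context") => ContextOverflow,
        401 => Auth,
        402 => BillingExhausted,
        403 => AuthPermanent,
        404 => ModelNotFound,
        408 | 504 => Timeout,
        429 => RateLimited,
        503 | 529 => Overloaded,
        500..=599 => ServerError,
        _ => Unknown,
    };
    ClassifiedLlmError {
        kind,
        // Transient failures can be retried; overflow is retried after compression.
        retryable: matches!(kind, RateLimited | Overloaded | ServerError | Timeout | ContextOverflow),
        should_compress: kind == ContextOverflow,
        should_rotate_credential: matches!(kind, Auth | BillingExhausted),
        // Only honour a server-provided delay for throttling-style errors.
        retry_after: if matches!(kind, RateLimited | Overloaded) {
            retry_after_secs.map(Duration::from_secs)
        } else {
            None
        },
        message: message.to_string(),
    }
}
```

Keeping the flags on the struct, rather than recomputing them from `kind` at each call site, lets the retry loop stay a simple series of `if` checks on one value.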
@@ -16,6 +16,21 @@ use zclaw_types::Result;
 use super::pain_aggregator::PainPoint;
 use super::solution_generator::Proposal;

+/// Brief summary of a stored experience, for suggestion context enrichment.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct ExperienceBrief {
+    pub pain_pattern: String,
+    pub solution_summary: String,
+    pub reuse_count: u32,
+}
+
+static EXPERIENCE_EXTRACTOR: std::sync::OnceLock<std::sync::Arc<ExperienceExtractor>> = std::sync::OnceLock::new();
+
+/// Get the global ExperienceExtractor singleton (if initialized).
+pub(crate) fn get_experience_extractor() -> Option<std::sync::Arc<ExperienceExtractor>> {
+    EXPERIENCE_EXTRACTOR.get().cloned()
+}
+
 // ---------------------------------------------------------------------------
 // Shared completion status
 // ---------------------------------------------------------------------------
@@ -263,6 +278,36 @@ fn xml_escape(s: &str) -> String {
         .replace('>', "&gt;")
 }

+/// Initialize the global ExperienceExtractor singleton.
+/// Called once during app startup, after viking storage is ready.
+pub async fn init_experience_extractor() -> Result<()> {
+    let sqlite_storage = crate::viking_commands::get_storage().await
+        .map_err(|e| zclaw_types::ZclawError::StorageError(e))?;
+    let viking = std::sync::Arc::new(zclaw_growth::VikingAdapter::new(sqlite_storage));
+    let store = std::sync::Arc::new(ExperienceStore::new(viking));
+    let extractor = std::sync::Arc::new(ExperienceExtractor::new(store));
+    EXPERIENCE_EXTRACTOR.set(extractor)
+        .map_err(|_| zclaw_types::ZclawError::StorageError("ExperienceExtractor already initialized".into()))?;
+    Ok(())
+}
+
+/// Find experiences relevant to the current conversation for suggestion enrichment.
+#[tauri::command]
+pub async fn experience_find_relevant(
+    agent_id: String,
+    query: String,
+) -> std::result::Result<Vec<ExperienceBrief>, String> {
+    let extractor = get_experience_extractor()
+        .ok_or("ExperienceExtractor not initialized".to_string())?;
+    let experiences = extractor.find_relevant_experiences(&agent_id, &query).await;
+    Ok(experiences.into_iter().take(3).map(|e| ExperienceBrief {
+        pain_pattern: e.pain_pattern,
+        solution_summary: e.solution_steps.join(";")
+            .chars().take(100).collect(),
+        reuse_count: e.reuse_count,
+    }).collect())
+}
+
 // ---------------------------------------------------------------------------
 // Tests
 // ---------------------------------------------------------------------------
@@ -407,4 +452,17 @@ mod tests {
         assert_eq!(truncate("hello", 10), "hello");
         assert_eq!(truncate("这是一个很长的字符串用于测试截断", 10).chars().count(), 11); // 10 + …
     }
+
+    #[test]
+    fn test_experience_brief_serialization() {
+        let brief = super::ExperienceBrief {
+            pain_pattern: "报表生成慢".to_string(),
+            solution_summary: "使用 researcher 技能自动收集".to_string(),
+            reuse_count: 3,
+        };
+        let json = serde_json::to_string(&brief).unwrap();
+        let parsed: super::ExperienceBrief = serde_json::from_str(&json).unwrap();
+        assert_eq!(parsed.pain_pattern, "报表生成慢");
+        assert_eq!(parsed.reuse_count, 3);
+    }
 }
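Two small mechanics in the experience hunks above are worth seeing in isolation: the set-once global built on `std::sync::OnceLock`, and the character-based truncation used for `solution_summary`. The sketch below reproduces both with only the standard library. `Extractor`, `init_extractor`, `get_extractor`, and `summarize` are hypothetical stand-ins for the real types and functions, which depend on storage that is not shown here.

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for `ExperienceExtractor`.
#[derive(Debug)]
pub struct Extractor {
    pub name: String,
}

static EXTRACTOR: OnceLock<Arc<Extractor>> = OnceLock::new();

/// First call wins; later calls report an error instead of replacing the value.
pub fn init_extractor(name: &str) -> Result<(), String> {
    EXTRACTOR
        .set(Arc::new(Extractor { name: name.to_string() }))
        .map_err(|_| "extractor already initialized".to_string())
}

/// Returns `None` until `init_extractor` has succeeded once.
pub fn get_extractor() -> Option<Arc<Extractor>> {
    EXTRACTOR.get().cloned()
}

/// Joins steps with `;` and keeps at most `max_chars` characters.
/// Truncating by `chars()` never splits a multi-byte character,
/// which byte slicing such as `&s[..100]` could do for CJK text.
pub fn summarize(steps: &[&str], max_chars: usize) -> String {
    steps.join(";").chars().take(max_chars).collect()
}
```

Returning `Option` from the getter is what allows the Tauri command in the diff to fail with a readable message when startup initialization did not succeed, rather than panicking.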
@@ -7,8 +7,10 @@

 use tracing::{debug, warn};

+use std::collections::HashMap;
 use std::sync::Arc;
 use tauri::Emitter;
+use tokio::sync::RwLock;
 use zclaw_growth::VikingStorage;

 use crate::intelligence::identity::IdentityManagerState;
@@ -16,6 +18,36 @@ use crate::intelligence::heartbeat::HeartbeatEngineState;
 use crate::intelligence::reflection::{MemoryEntryForAnalysis, ReflectionEngineState};
 use zclaw_runtime::driver::LlmDriver;

+// ---------------------------------------------------------------------------
+// Identity prompt cache — avoids mutex + disk I/O on every request
+// ---------------------------------------------------------------------------
+
+struct CachedIdentity {
+    prompt: String,
+    #[allow(dead_code)] // Reserved for future TTL-based cache validation
+    soul_hash: u64,
+}
+
+static IDENTITY_CACHE: std::sync::LazyLock<RwLock<HashMap<String, CachedIdentity>>> =
+    std::sync::LazyLock::new(|| RwLock::new(HashMap::new()));
+
+/// Invalidate cached identity prompt for a given agent (call when soul.md changes).
+pub fn invalidate_identity_cache(agent_id: &str) {
+    let cache = &*IDENTITY_CACHE;
+    // Non-blocking: spawn a task to remove the entry
+    if let Ok(mut guard) = cache.try_write() {
+        guard.remove(agent_id);
+    }
+}
+
+/// Simple hash for cache invalidation — uses string content hash.
+fn content_hash(s: &str) -> u64 {
+    use std::hash::{Hash, Hasher};
+    let mut hasher = std::collections::hash_map::DefaultHasher::new();
+    s.hash(&mut hasher);
+    hasher.finish()
+}
+
 /// Run pre-conversation intelligence hooks
 ///
 /// Builds identity-enhanced system prompt (SOUL.md + instructions) and
@@ -29,10 +61,29 @@ pub async fn pre_conversation_hook(
     _user_message: &str,
     identity_state: &IdentityManagerState,
 ) -> Result<String, String> {
-    // Build identity-enhanced system prompt (SOUL.md + instructions)
-    // Memory context is injected by MemoryMiddleware in the kernel middleware chain,
-    // not here, to avoid duplicate injection.
-    let enhanced_prompt = match build_identity_prompt(agent_id, "", identity_state).await {
+    // Check identity prompt cache first (avoids mutex + disk I/O)
+    let cache = &*IDENTITY_CACHE;
+    {
+        let guard = cache.read().await;
+        if let Some(cached) = guard.get(agent_id) {
+            // Cache hit — still need continuity context, but skip identity build
+            let continuity_context = build_continuity_context(agent_id, _user_message).await;
+            let mut result = cached.prompt.clone();
+            if !continuity_context.is_empty() {
+                result.push_str(&continuity_context);
+            }
+            debug!("[intelligence_hooks] Identity cache HIT for agent {}", agent_id);
+            return Ok(result);
+        }
+    }
+
+    // Cache miss — build identity prompt and continuity context in parallel
+    let (identity_result, continuity_context) = tokio::join!(
+        build_identity_prompt_cached(agent_id, "", identity_state, cache),
+        build_continuity_context(agent_id, _user_message)
+    );
+
+    let enhanced_prompt = match identity_result {
         Ok(prompt) => prompt,
         Err(e) => {
             warn!(
@@ -43,9 +94,6 @@ pub async fn pre_conversation_hook(
         }
     };

-    // Cross-session continuity: check for unresolved pain points and recent experiences
-    let continuity_context = build_continuity_context(agent_id, _user_message).await;
-
     let mut result = enhanced_prompt;
     if !continuity_context.is_empty() {
         result.push_str(&continuity_context);
@@ -240,6 +288,8 @@ pub async fn post_conversation_hook(
                 warn!("[intelligence_hooks] Failed to update soul with agent name: {}", e);
             } else {
                 debug!("[intelligence_hooks] Updated agent name to '{}' in soul", name);
+                // Invalidate cache since soul.md changed
+                invalidate_identity_cache(agent_id);
             }
         }
         drop(manager);
@@ -340,21 +390,34 @@ async fn build_memory_context(
     Ok(context)
 }

-/// Build identity-enhanced system prompt
-async fn build_identity_prompt(
+/// Build identity-enhanced system prompt and cache the result.
+async fn build_identity_prompt_cached(
     agent_id: &str,
     memory_context: &str,
     identity_state: &IdentityManagerState,
+    cache: &RwLock<HashMap<String, CachedIdentity>>,
 ) -> Result<String, String> {
-    // IdentityManagerState is Arc<tokio::sync::Mutex<AgentIdentityManager>>
-    // tokio::sync::Mutex::lock() returns MutexGuard directly
     let mut manager = identity_state.lock().await;

+    // Read current soul content for hashing
+    let soul_content = manager.get_file(agent_id, crate::intelligence::identity::IdentityFile::Soul);
+    let soul_hash = content_hash(&soul_content);
+
     let prompt = manager.build_system_prompt(
         agent_id,
         if memory_context.is_empty() { None } else { Some(memory_context) },
     ).await;

+    // Cache the result
+    drop(manager); // Release lock before acquiring write guard
+    {
+        let mut guard = cache.write().await;
+        guard.insert(agent_id.to_string(), CachedIdentity {
+            prompt: prompt.clone(),
+            soul_hash,
+        });
+    }
+
     Ok(prompt)
 }

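In the hunks above the cache is keyed by agent ID only and relies on an explicit `invalidate_identity_cache` call; `soul_hash` is stored but marked as reserved. Note also that the invalidation uses `try_write`, which returns an error when the lock is currently held, so under contention the stale entry would stay in place. As a sketch of what hash-based validation could look like, the cache below compares the stored hash with the hash of the current source text on every lookup. `PromptCache` and `get_or_build` are hypothetical, and the sketch uses the blocking `std::sync::RwLock` instead of tokio's so it runs without an async runtime.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

struct CachedPrompt {
    prompt: String,
    source_hash: u64,
}

/// Same hashing approach as `content_hash` in the diff.
fn content_hash(s: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    s.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical cache that rebuilds an entry when its source text changes.
pub struct PromptCache {
    entries: RwLock<HashMap<String, CachedPrompt>>,
}

impl PromptCache {
    pub fn new() -> Self {
        Self { entries: RwLock::new(HashMap::new()) }
    }

    /// Returns `(prompt, was_cache_hit)`.
    pub fn get_or_build(
        &self,
        agent_id: &str,
        source: &str,
        build: impl FnOnce(&str) -> String,
    ) -> (String, bool) {
        let hash = content_hash(source);
        {
            // Scope the read guard so it is released before taking the write lock.
            let guard = self.entries.read().unwrap();
            if let Some(entry) = guard.get(agent_id) {
                if entry.source_hash == hash {
                    return (entry.prompt.clone(), true);
                }
            }
        }
        let prompt = build(source);
        self.entries.write().unwrap().insert(
            agent_id.to_string(),
            CachedPrompt { prompt: prompt.clone(), source_hash: hash },
        );
        (prompt, false)
    }

    /// Blocking removal: unlike `try_write`, this cannot be skipped under contention.
    pub fn invalidate(&self, agent_id: &str) {
        self.entries.write().unwrap().remove(agent_id);
    }
}
```

The trade-off is that validating by hash requires the caller to have the source text in hand, which reintroduces the read the diff's cache is meant to avoid; explicit invalidation is cheaper but depends on every writer remembering to call it.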
@@ -212,6 +212,12 @@ pub fn run() {
                 if let Err(e) = rt.block_on(intelligence::pain_aggregator::init_pain_storage(pool)) {
                     tracing::error!("[PainStorage] Init failed: {}, pain points will not persist", e);
                 }
+
+                // Initialize experience extractor for suggestion enrichment.
+                // Graceful degradation: failure does not block app startup.
+                if let Err(e) = rt.block_on(intelligence::experience::init_experience_extractor()) {
+                    tracing::warn!("[ExperienceExtractor] Init failed: {}, suggestion context will be empty", e);
+                }
             }

             Ok(())
@@ -435,6 +441,8 @@ pub fn run() {
             intelligence::pain_aggregator::butler_update_proposal_status,
             // Industry config loader
             viking_commands::viking_load_industry_keywords,
+            // Experience finder for suggestion enrichment
+            intelligence::experience::experience_find_relevant,
         ])
         .run(tauri::generate_context!())
         .expect("error while running tauri application");
@@ -665,6 +665,28 @@ function stripToolNarration(content: string): string {
   return result || content;
 }

+/**
+ * Strip dangling clarification references from text when ask_clarification tool was called.
+ * When the LLM calls ask_clarification, it often ends its text with phrases like
+ * "比如:" / "以下信息" / "以下选项" that reference the tool output — but the tool output
+ * is rendered in a separate ClarificationCard, so these become confusing dead-end sentences.
+ */
+function stripDanglingClarificationRef(text: string, hasClarificationTool: boolean): string {
+  if (!hasClarificationTool || !text) return text;
+  // Match trailing dangling references in Chinese and English
+  const patterns = [
+    /[,,]\s*可以(?:提供以下|告诉我更多细节,)?(?:信息|选项|方向|细节|分类|类型)[::]\s*$/,
+    /[,,]\s*比如[::]\s*$/,
+    /[,,]\s*(?:例如|譬如|如以下)[::]\s*$/,
+    /,\s*(?:for example|such as|like|the following)[::]?\s*$/i,
+  ];
+  for (const pat of patterns) {
+    const stripped = text.replace(pat, '');
+    if (stripped !== text) return stripped;
+  }
+  return text;
+}
+
 function MessageBubble({ message, onRetry }: { message: Message; setInput?: (text: string) => void; onRetry?: () => void }) {
   if (message.role === 'tool') {
     return null;
@@ -749,7 +771,10 @@ function MessageBubble({ message, onRetry }: { message: Message; setInput?: (tex
               ? (isUser
                 ? message.content
                 : <StreamingText
-                    content={stripToolNarration(message.content)}
+                    content={stripDanglingClarificationRef(
+                      stripToolNarration(message.content),
+                      toolCallSteps?.some(s => s.toolName === 'ask_clarification') ?? false,
+                    )}
                     isStreaming={!!message.streaming}
                     className="text-gray-700 dark:text-gray-200"
                   />
@@ -6,9 +6,10 @@ import {
   Image as ImageIcon,
   Download,
   Copy,
-  ChevronLeft,
+  ChevronDown,
   File,
 } from 'lucide-react';
+import { MarkdownRenderer } from './MarkdownRenderer';

 // ---------------------------------------------------------------------------
 // Types
@@ -76,6 +77,7 @@ export function ArtifactPanel({
   className = '',
 }: ArtifactPanelProps) {
   const [viewMode, setViewMode] = useState<'preview' | 'code'>('preview');
+  const [fileMenuOpen, setFileMenuOpen] = useState(false);
   const selected = useMemo(
     () => artifacts.find((a) => a.id === selectedId),
     [artifacts, selectedId]
@@ -135,22 +137,59 @@ export function ArtifactPanel({

   return (
     <div className={`h-full flex flex-col ${className}`}>
-      {/* File header */}
+      {/* File header with inline file selector */}
       <div className="px-4 py-2 border-b border-gray-200 dark:border-gray-700 flex items-center gap-2 flex-shrink-0">
-        <button
-          onClick={() => onSelect('')}
-          className="p-1 rounded hover:bg-gray-100 dark:hover:bg-gray-700 text-gray-400 hover:text-gray-600 dark:hover:text-gray-200 transition-colors"
-          title="返回文件列表"
-        >
-          <ChevronLeft className="w-4 h-4" />
-        </button>
-        <Icon className="w-4 h-4 text-orange-500 flex-shrink-0" />
-        <span className="text-sm font-medium text-gray-700 dark:text-gray-200 truncate flex-1">
-          {selected.name}
-        </span>
+        <div className="relative">
+          <button
+            onClick={() => setFileMenuOpen(!fileMenuOpen)}
+            className="flex items-center gap-1.5 text-sm font-medium text-gray-700 dark:text-gray-200 truncate hover:text-orange-500 transition-colors"
+            title="切换文件"
+          >
+            <Icon className="w-4 h-4 text-orange-500 flex-shrink-0" />
+            <span className="truncate max-w-[120px]">{selected.name}</span>
+            {artifacts.length > 1 && (
+              <ChevronDown className={`w-3.5 h-3.5 text-gray-400 transition-transform ${fileMenuOpen ? 'rotate-180' : ''}`} />
+            )}
+          </button>
+
+          {/* File selector dropdown */}
+          {fileMenuOpen && artifacts.length > 1 && (
+            <>
+              <div className="fixed inset-0 z-10" onClick={() => setFileMenuOpen(false)} />
+              <div className="absolute top-full left-0 mt-1 w-56 bg-white dark:bg-gray-800 border border-gray-200 dark:border-gray-700 rounded-lg shadow-lg z-20 py-1 max-h-60 overflow-y-auto">
+                {artifacts.map((artifact) => {
+                  const ItemIcon = getFileIcon(artifact.type);
+                  return (
+                    <button
+                      key={artifact.id}
+                      onClick={() => { onSelect(artifact.id); setFileMenuOpen(false); }}
+                      className={`w-full flex items-center gap-2 px-3 py-2 text-left text-sm hover:bg-gray-50 dark:hover:bg-gray-700 transition-colors ${
+                        artifact.id === selected.id ? 'bg-orange-50 dark:bg-orange-900/20 text-orange-700 dark:text-orange-300' : 'text-gray-700 dark:text-gray-200'
+                      }`}
+                    >
+                      <ItemIcon className="w-4 h-4 flex-shrink-0" />
+                      <span className="truncate flex-1">{artifact.name}</span>
+                      <span className={`text-[10px] px-1 py-0.5 rounded ${getTypeColor(artifact.type)}`}>
+                        {getTypeLabel(artifact.type)}
+                      </span>
+                    </button>
+                  );
+                })}
+              </div>
+            </>
+          )}
+        </div>
+
+        <div className="flex-1" />
+
         <span className={`text-[10px] px-1.5 py-0.5 rounded font-medium ${getTypeColor(selected.type)}`}>
           {getTypeLabel(selected.type)}
         </span>
+        {selected.language && (
+          <span className="text-[10px] text-gray-400 dark:text-gray-500">
+            {selected.language}
+          </span>
+        )}
       </div>

       {/* View mode toggle */}
@@ -180,19 +219,7 @@ export function ArtifactPanel({
       {/* Content area */}
       <div className="flex-1 overflow-y-auto custom-scrollbar p-4">
         {viewMode === 'preview' ? (
-          <div className="prose prose-sm dark:prose-invert max-w-none">
-            {selected.type === 'markdown' ? (
-              <MarkdownPreview content={selected.content} />
-            ) : selected.type === 'code' ? (
-              <pre className="bg-gray-50 dark:bg-gray-800 rounded-lg p-3 text-xs font-mono overflow-x-auto text-gray-700 dark:text-gray-200">
-                {selected.content}
-              </pre>
-            ) : (
-              <pre className="whitespace-pre-wrap text-sm text-gray-700 dark:text-gray-200">
-                {selected.content}
-              </pre>
-            )}
-          </div>
+          <ArtifactContentPreview artifact={selected} />
         ) : (
           <pre className="bg-gray-50 dark:bg-gray-800 rounded-lg p-3 text-xs font-mono overflow-x-auto text-gray-700 dark:text-gray-200 leading-relaxed">
             {selected.content}
@@ -217,6 +244,37 @@ export function ArtifactPanel({
   );
 }

+// ---------------------------------------------------------------------------
+// ArtifactContentPreview — renders artifact based on type
+// ---------------------------------------------------------------------------
+
+function ArtifactContentPreview({ artifact }: { artifact: ArtifactFile }) {
+  if (artifact.type === 'markdown') {
+    return <MarkdownRenderer content={artifact.content} />;
+  }
+
+  if (artifact.type === 'code') {
+    return (
+      <div className="relative">
+        {artifact.language && (
+          <div className="absolute top-2 right-2 text-[10px] text-gray-400 dark:text-gray-500 bg-gray-100 dark:bg-gray-700 px-1.5 py-0.5 rounded">
+            {artifact.language}
+          </div>
+        )}
+        <pre className="bg-gray-50 dark:bg-gray-900 rounded-lg p-4 text-xs font-mono overflow-x-auto text-gray-700 dark:text-gray-200 leading-relaxed border border-gray-200 dark:border-gray-700">
+          {artifact.content}
+        </pre>
+      </div>
+    );
+  }
+
+  return (
+    <pre className="whitespace-pre-wrap text-sm text-gray-700 dark:text-gray-200">
+      {artifact.content}
+    </pre>
+  );
+}
+
 // ---------------------------------------------------------------------------
 // ActionButton
 // ---------------------------------------------------------------------------
@@ -243,50 +301,6 @@ function ActionButton({ icon, label, onClick }: { icon: React.ReactNode; label:
   );
 }

-// ---------------------------------------------------------------------------
-// Simple Markdown preview (no external deps)
-// ---------------------------------------------------------------------------
-
-function MarkdownPreview({ content }: { content: string }) {
-  // Basic markdown rendering: headings, bold, code blocks, lists
-  const lines = content.split('\n');
-
-  return (
-    <div className="space-y-2">
-      {lines.map((line, i) => {
-        // Heading
-        if (line.startsWith('### ')) {
-          return <h3 key={i} className="text-sm font-bold text-gray-800 dark:text-gray-100 mt-3">{line.slice(4)}</h3>;
-        }
-        if (line.startsWith('## ')) {
-          return <h2 key={i} className="text-base font-bold text-gray-800 dark:text-gray-100 mt-4">{line.slice(3)}</h2>;
-        }
-        if (line.startsWith('# ')) {
-          return <h1 key={i} className="text-lg font-bold text-gray-800 dark:text-gray-100">{line.slice(2)}</h1>;
-        }
-        // Code block (simplified)
-        if (line.startsWith('```')) return null;
-        // List item
-        if (line.startsWith('- ') || line.startsWith('* ')) {
-          return <li key={i} className="text-sm text-gray-700 dark:text-gray-300 ml-4">{renderInline(line.slice(2))}</li>;
-        }
-        // Empty line
-        if (!line.trim()) return <div key={i} className="h-2" />;
-        // Regular paragraph
-        return <p key={i} className="text-sm text-gray-700 dark:text-gray-300 leading-relaxed">{renderInline(line)}</p>;
-      })}
-    </div>
-  );
-}
-
-function renderInline(text: string): React.ReactNode {
-  // Bold
-  const parts = text.split(/\*\*(.*?)\*\*/g);
-  return parts.map((part, i) =>
-    i % 2 === 1 ? <strong key={i} className="font-semibold">{part}</strong> : part
-  );
-}
-
 // ---------------------------------------------------------------------------
 // Download helper
 // ---------------------------------------------------------------------------
123
desktop/src/components/ai/MarkdownRenderer.tsx
Normal file
@@ -0,0 +1,123 @@
|
||||
/**
 * MarkdownRenderer — shared Markdown rendering with styled components.
 *
 * Extracted from StreamingText.tsx so ArtifactPanel and other consumers
 * can reuse the same rich rendering (GFM tables, syntax blocks, etc.)
 * without duplicating the component overrides.
 */

import ReactMarkdown from 'react-markdown';
import remarkGfm from 'remark-gfm';
import type { Components } from 'react-markdown';

// ---------------------------------------------------------------------------
// Shared component overrides for react-markdown
// ---------------------------------------------------------------------------

export const markdownComponents: Components = {
  pre({ children }) {
    return (
      <pre className="bg-gray-50 dark:bg-gray-900 rounded-lg p-4 overflow-x-auto text-sm leading-relaxed border border-gray-200 dark:border-gray-700 my-3">
        {children}
      </pre>
    );
  },
  code({ className, children, ...props }) {
    const isBlock = className?.startsWith('language-');
    if (isBlock) {
      return (
        <code className={`${className || ''} text-gray-800 dark:text-gray-200`} {...props}>
          {children}
        </code>
      );
    }
    return (
      <code className="bg-gray-100 dark:bg-gray-800 text-gray-700 dark:text-gray-300 px-1.5 py-0.5 rounded text-[0.9em] font-mono" {...props}>
        {children}
      </code>
    );
  },
  table({ children }) {
    return (
      <div className="overflow-x-auto my-3 -mx-1">
        <table className="min-w-full border-collapse border border-gray-200 dark:border-gray-700 rounded-lg text-sm">
          {children}
        </table>
      </div>
    );
  },
  thead({ children }) {
    return <thead className="bg-gray-50 dark:bg-gray-800/50">{children}</thead>;
  },
  th({ children }) {
    return (
      <th className="border border-gray-200 dark:border-gray-700 px-3 py-2 text-left font-semibold text-gray-700 dark:text-gray-300">
        {children}
      </th>
    );
  },
  td({ children }) {
    return (
      <td className="border border-gray-200 dark:border-gray-700 px-3 py-2 text-gray-600 dark:text-gray-400">
        {children}
      </td>
    );
  },
  ul({ children }) {
    return <ul className="list-disc list-outside ml-5 my-2 space-y-1">{children}</ul>;
  },
  ol({ children }) {
    return <ol className="list-decimal list-outside ml-5 my-2 space-y-1">{children}</ol>;
  },
  li({ children }) {
    return <li className="leading-relaxed">{children}</li>;
  },
  h1({ children }) {
    return <h1 className="text-xl font-bold mt-5 mb-3 text-gray-900 dark:text-gray-100 first:mt-0">{children}</h1>;
  },
  h2({ children }) {
    return <h2 className="text-lg font-bold mt-4 mb-2 text-gray-900 dark:text-gray-100 first:mt-0">{children}</h2>;
  },
  h3({ children }) {
    return <h3 className="text-base font-semibold mt-3 mb-2 text-gray-900 dark:text-gray-100 first:mt-0">{children}</h3>;
  },
  blockquote({ children }) {
    return (
      <blockquote className="border-l-4 border-gray-300 dark:border-gray-600 pl-4 py-1 my-3 text-gray-600 dark:text-gray-400 italic bg-gray-50 dark:bg-gray-800/30 rounded-r-lg">
        {children}
      </blockquote>
    );
  },
  p({ children }) {
    return <p className="my-2 leading-relaxed first:mt-0 last:mb-0">{children}</p>;
  },
  a({ href, children }) {
    return (
      <a href={href} target="_blank" rel="noopener noreferrer" className="text-blue-600 dark:text-blue-400 underline hover:text-blue-800 dark:hover:text-blue-300">
        {children}
      </a>
    );
  },
  hr() {
    return <hr className="my-4 border-gray-200 dark:border-gray-700" />;
  },
};

// ---------------------------------------------------------------------------
// Convenience wrapper
// ---------------------------------------------------------------------------

interface MarkdownRendererProps {
  content: string;
  className?: string;
}

export function MarkdownRenderer({ content, className = '' }: MarkdownRendererProps) {
  return (
    <div className={`prose-sm prose-gray dark:prose-invert max-w-none ${className}`}>
      <ReactMarkdown remarkPlugins={[remarkGfm]} components={markdownComponents}>
        {content}
      </ReactMarkdown>
    </div>
  );
}
|
||||
@@ -1,7 +1,5 @@
|
||||
import { useMemo, useRef, useEffect, useState } from 'react';
|
||||
import ReactMarkdown from 'react-markdown';
|
||||
import remarkGfm from 'remark-gfm';
|
||||
import type { Components } from 'react-markdown';
|
||||
import { MarkdownRenderer } from './MarkdownRenderer';
|
||||
|
||||
/**
|
||||
* Streaming text with word-by-word reveal animation.
|
||||
@@ -18,111 +16,6 @@ interface StreamingTextProps {
|
||||
asMarkdown?: boolean;
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Markdown component overrides for rich rendering
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
const markdownComponents: Components = {
|
||||
// Code blocks (```...```)
|
||||
pre({ children }) {
|
||||
return (
|
||||
<pre className="bg-gray-50 dark:bg-gray-900 rounded-lg p-4 overflow-x-auto text-sm leading-relaxed border border-gray-200 dark:border-gray-700 my-3">
|
||||
{children}
|
||||
</pre>
|
||||
);
|
||||
},
|
||||
// Inline code (`...`)
|
||||
code({ className, children, ...props }) {
|
||||
// If it has a language class, it's inside a code block — render as block
|
||||
const isBlock = className?.startsWith('language-');
|
||||
if (isBlock) {
|
||||
return (
|
||||
<code className={`${className || ''} text-gray-800 dark:text-gray-200`} {...props}>
|
||||
{children}
|
||||
</code>
|
||||
);
|
||||
}
|
||||
return (
|
||||
<code className="bg-gray-100 dark:bg-gray-800 text-gray-700 dark:text-gray-300 px-1.5 py-0.5 rounded text-[0.9em] font-mono" {...props}>
|
||||
{children}
|
||||
</code>
|
||||
);
|
||||
},
|
||||
// Tables
|
||||
table({ children }) {
|
||||
return (
|
||||
<div className="overflow-x-auto my-3 -mx-1">
|
||||
<table className="min-w-full border-collapse border border-gray-200 dark:border-gray-700 rounded-lg text-sm">
|
||||
{children}
|
||||
</table>
|
||||
</div>
|
||||
);
|
||||
},
|
||||
thead({ children }) {
|
||||
return <thead className="bg-gray-50 dark:bg-gray-800/50">{children}</thead>;
|
||||
},
|
||||
th({ children }) {
|
||||
return (
|
||||
<th className="border border-gray-200 dark:border-gray-700 px-3 py-2 text-left font-semibold text-gray-700 dark:text-gray-300">
|
||||
{children}
|
||||
</th>
|
||||
);
|
||||
},
|
||||
td({ children }) {
|
||||
return (
|
||||
<td className="border border-gray-200 dark:border-gray-700 px-3 py-2 text-gray-600 dark:text-gray-400">
|
||||
{children}
|
||||
</td>
|
||||
);
|
||||
},
|
||||
// Unordered lists
|
||||
ul({ children }) {
|
||||
return <ul className="list-disc list-outside ml-5 my-2 space-y-1">{children}</ul>;
|
||||
},
|
||||
// Ordered lists
|
||||
ol({ children }) {
|
||||
return <ol className="list-decimal list-outside ml-5 my-2 space-y-1">{children}</ol>;
|
||||
},
|
||||
// List items
|
||||
li({ children }) {
|
||||
return <li className="leading-relaxed">{children}</li>;
|
||||
},
|
||||
// Headings
|
||||
h1({ children }) {
|
||||
return <h1 className="text-xl font-bold mt-5 mb-3 text-gray-900 dark:text-gray-100 first:mt-0">{children}</h1>;
|
||||
},
|
||||
h2({ children }) {
|
||||
return <h2 className="text-lg font-bold mt-4 mb-2 text-gray-900 dark:text-gray-100 first:mt-0">{children}</h2>;
|
||||
},
|
||||
h3({ children }) {
|
||||
return <h3 className="text-base font-semibold mt-3 mb-2 text-gray-900 dark:text-gray-100 first:mt-0">{children}</h3>;
|
||||
},
|
||||
// Blockquotes
|
||||
blockquote({ children }) {
|
||||
return (
|
||||
<blockquote className="border-l-4 border-gray-300 dark:border-gray-600 pl-4 py-1 my-3 text-gray-600 dark:text-gray-400 italic bg-gray-50 dark:bg-gray-800/30 rounded-r-lg">
|
||||
{children}
|
||||
</blockquote>
|
||||
);
|
||||
},
|
||||
// Paragraphs
|
||||
p({ children }) {
|
||||
return <p className="my-2 leading-relaxed first:mt-0 last:mb-0">{children}</p>;
|
||||
},
|
||||
// Links
|
||||
a({ href, children }) {
|
||||
return (
|
||||
<a href={href} target="_blank" rel="noopener noreferrer" className="text-blue-600 dark:text-blue-400 underline hover:text-blue-800 dark:hover:text-blue-300">
|
||||
{children}
|
||||
</a>
|
||||
);
|
||||
},
|
||||
// Horizontal rules
|
||||
hr() {
|
||||
return <hr className="my-4 border-gray-200 dark:border-gray-700" />;
|
||||
},
|
||||
};
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Token splitter for streaming animation
|
||||
// ---------------------------------------------------------------------------
|
||||
@@ -176,13 +69,7 @@ export function StreamingText({
|
||||
}: StreamingTextProps) {
|
||||
// For completed messages, use full markdown rendering with styled components
|
||||
if (!isStreaming && asMarkdown) {
|
||||
return (
|
||||
<div className={`prose-sm prose-gray dark:prose-invert max-w-none ${className}`}>
|
||||
<ReactMarkdown remarkPlugins={[remarkGfm]} components={markdownComponents}>
|
||||
{content}
|
||||
</ReactMarkdown>
|
||||
</div>
|
||||
);
|
||||
return <MarkdownRenderer content={content} className={className} />;
|
||||
}
|
||||
|
||||
// For streaming messages, use token-by-token animation
|
||||
|
||||
@@ -166,7 +166,8 @@ interface ToolStepRowProps {
|
||||
}
|
||||
|
||||
function ToolStepRow({ step, isActive, showConnector }: ToolStepRowProps) {
|
||||
const [expanded, setExpanded] = useState(false);
|
||||
// Clarification cards default to expanded so users see options immediately
|
||||
const [expanded, setExpanded] = useState(step.toolName === 'ask_clarification');
|
||||
const Icon = getToolIcon(step.toolName);
|
||||
const label = getToolLabel(step.toolName);
|
||||
const isRunning = step.status === 'running';
|
||||
|
||||
@@ -8,4 +8,5 @@ export { SuggestionChips } from './SuggestionChips';
|
||||
export { ResizableChatLayout } from './ResizableChatLayout';
|
||||
export { ToolCallChain, type ToolCallStep } from './ToolCallChain';
|
||||
export { ArtifactPanel, type ArtifactFile } from './ArtifactPanel';
|
||||
export { MarkdownRenderer, markdownComponents } from './MarkdownRenderer';
|
||||
export { TokenMeter } from './TokenMeter';
|
||||
|
||||
@@ -696,13 +696,14 @@ export class GatewayClient {
|
||||
break;
|
||||
|
||||
case 'tool_call':
|
||||
// Tool call event
|
||||
// Tool call start: onTool(name, input, '') — empty output signals start
|
||||
if (callbacks.onTool && data.tool) {
|
||||
callbacks.onTool(data.tool, JSON.stringify(data.input || {}), data.output || '');
|
||||
callbacks.onTool(data.tool, JSON.stringify(data.input || {}), '');
|
||||
}
|
||||
break;
|
||||
|
||||
case 'tool_result':
|
||||
// Tool call end: onTool(name, '', output) — empty input signals end
|
||||
if (callbacks.onTool && data.tool) {
|
||||
callbacks.onTool(data.tool, '', String(data.result || data.output || ''));
|
||||
}
|
||||
|
||||
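The `tool_call` / `tool_result` hunk above encodes the phase of a tool call in which argument is empty: a start carries the serialized input and an empty output, an end carries an empty input and the output. A minimal sketch of how a callback might tell the two apart follows. `ToolEvent` and `classifyToolEvent` are hypothetical names for illustration and are not part of `GatewayClient`; the sketch also assumes, as the hunk suggests, that a start always carries at least `'{}'` as input.

```typescript
// Hypothetical consumer of the onTool(name, input, output) convention.
type ToolEvent =
  | { kind: 'start'; tool: string; input: string }
  | { kind: 'end'; tool: string; output: string };

function classifyToolEvent(tool: string, input: string, output: string): ToolEvent {
  // A start has a non-empty input (at least '{}') and an empty output.
  if (output === '' && input !== '') {
    return { kind: 'start', tool, input };
  }
  // Everything else is treated as an end, including a tool that
  // finished with an empty result (input '' and output '').
  return { kind: 'end', tool, output };
}

const started = classifyToolEvent('file_write', '{"path":"a.md"}', '');
const finished = classifyToolEvent('file_write', '', '{"path":"a.md"}');
console.log(started.kind, finished.kind);
```

Checking the input as well as the output keeps an empty tool result from being misread as a second start.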
@@ -646,18 +646,25 @@ const HARDCODED_PROMPTS: Record<string, { system: string; user: (arg: string) =>
|
||||
},
|
||||
|
||||
suggestions: {
|
||||
system: `你是对话分析助手。根据最近的对话内容,生成 3 个用户可能想继续探讨的问题。
|
||||
system: `你是 ZCLAW 的管家助手,需要站在用户角度思考他们真正需要什么,生成 3 个个性化建议。
|
||||
|
||||
要求:
|
||||
- 每个问题必须与对话内容直接相关,具体且有针对性
|
||||
- 帮助用户深入理解、实际操作或拓展思路
|
||||
- 每个问题不超过 30 个中文字符
|
||||
- 不要重复对话中已讨论过的内容
|
||||
- 使用与用户相同的语言
|
||||
## 生成规则
|
||||
1. 第 1 条 — 深入追问:基于当前话题,提出一个有洞察力的追问,帮助用户深入探索
|
||||
2. 第 2 条 — 实用行动:建议一个具体的、可操作的下一步(调用技能、执行工具、查看数据等)
|
||||
3. 第 3 条 — 管家关怀:
|
||||
- 如果有未解决痛点 → 回访建议,如"上次提到的X,后来解决了吗?"
|
||||
- 如果有相关经验 → 引导复用,如"上次用X方法解决了类似问题,要再试试吗?"
|
||||
- 如果有匹配技能 → 推荐使用,如"试试 [技能名] 来处理这个"
|
||||
- 如果没有提供痛点/经验/技能信息 → 给出一个启发性的思考角度
|
||||
4. 每个不超过 30 个中文字符
|
||||
5. 不要重复对话中已讨论过的内容
|
||||
6. 不要生成空泛的建议(如"继续分析"、"换个角度")
|
||||
7. 默认使用中文,不要混入英文词汇(如"workflow"用"工作流"、"report"用"报表"),除非用户在对话中明确使用英文
|
||||
8. 建议会被用户直接点击发送,因此不要包含任何称谓(如"领导"、"老板"、"老师"等),用无主语的问句或陈述句
|
||||
|
||||
只输出 JSON 数组,包含恰好 3 个字符串。不要输出任何其他内容。
|
||||
示例:["如何在生产环境中部署?", "这个方案的成本如何?", "有没有更简单的替代方案?"]`,
|
||||
user: (context: string) => `以下是对话中最近的消息:\n\n${context}\n\n请生成 3 个后续问题。`,
|
||||
示例:["科室绩效分析可以按哪些维度拆解?", "用研究技能查一下相关文献?", "上次提到的排班冲突问题,需要继续想解决方案吗?"]`,
|
||||
user: (context: string) => `以下是对话中最近的消息:\n\n${context}\n\n请生成 3 个后续建议(1 深入追问 + 1 实用行动 + 1 管家关怀)。`,
|
||||
},
|
||||
};
|
||||
|
||||
|
||||
131
desktop/src/lib/suggestion-context.ts
Normal file
@@ -0,0 +1,131 @@
|
||||
/**
 * Suggestion context enrichment — fetches intelligence data for personalized suggestions.
 * All fetches are optional; failures silently degrade to empty context.
 */

import { invoke } from '@tauri-apps/api/core';
import { createLogger } from './logger';

const log = createLogger('SuggestionContext');

const CONTEXT_FETCH_TIMEOUT = 500;

/** Pain point from butler intelligence layer. */
interface PainPoint {
  summary: string;
  category: string;
  confidence: number;
  status: string;
  occurrence_count: number;
}

/** Brief experience from the experience store. */
interface ExperienceBrief {
  pain_pattern: string;
  solution_summary: string;
  reuse_count: number;
}

/** Pipeline/skill match candidate. */
interface PipelineCandidateInfo {
  id: string;
  display_name: string;
  description: string;
  category: string | null;
  match_reason: string | null;
}

/** Route intent response (only NoMatch variant has suggestions). */
interface RouteResultResponse {
  type: 'Matched' | 'Ambiguous' | 'NoMatch' | 'NeedMoreInfo';
  suggestions?: PipelineCandidateInfo[];
}

/** Aggregated suggestion context from all intelligence sources. */
export interface SuggestionContext {
  userProfile: string;
  painPoints: string;
  experiences: string;
  skillMatch: string;
}

function isTauriAvailable(): boolean {
  return typeof window !== 'undefined' && '__TAURI_INTERNALS__' in window;
}

function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T | null> {
  return Promise.race([
    promise,
    new Promise<null>(resolve => setTimeout(() => resolve(null), ms)),
  ]);
}

async function fetchUserProfile(agentId: string): Promise<string> {
  const profile = await invoke<string>('identity_get_file', {
    agentId,
    file: 'userprofile',
  });
  if (!profile || profile.trim().length === 0) return '';
  const text = profile.trim();
  return text.length > 200 ? text.slice(0, 200) : text;
}

async function fetchPainPoints(agentId: string): Promise<string> {
  const points = await invoke<PainPoint[]>('butler_list_pain_points', { agentId });
  if (!Array.isArray(points) || points.length === 0) return '';

  const active = points
    .filter(p => p.confidence >= 0.5 && p.status !== 'Solved' && p.status !== 'Dismissed')
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, 3);

  if (active.length === 0) return '';
  return active
    .map((p, i) => `${i + 1}. [${p.category}] ${p.summary}(出现${p.occurrence_count}次)`)
    .join('\n');
}

async function fetchExperiences(agentId: string, query: string): Promise<string> {
  const experiences = await invoke<ExperienceBrief[]>('experience_find_relevant', {
    agentId,
    query,
  });
  if (!Array.isArray(experiences) || experiences.length === 0) return '';

  return experiences.slice(0, 2)
    .map(e => `上次解决"${e.pain_pattern}"的方法:${e.solution_summary}(已复用${e.reuse_count}次)`)
    .join('\n');
}

async function fetchSkillMatch(userInput: string): Promise<string> {
  const result = await invoke<RouteResultResponse>('route_intent', { userInput });
  const suggestions = result?.suggestions;
  if (!Array.isArray(suggestions) || suggestions.length === 0) return '';

  const best = suggestions[0];
  return `你可能需要:${best.display_name} — ${best.description}`;
}

const EMPTY_CONTEXT: SuggestionContext = { userProfile: '', painPoints: '', experiences: '', skillMatch: '' };

/**
 * Fetch all intelligence context in parallel for suggestion enrichment.
 * Returns empty strings for any source that fails — never throws.
 */
export async function fetchSuggestionContext(
  agentId: string,
  lastUserMessage: string,
): Promise<SuggestionContext> {
  if (!isTauriAvailable()) {
    return EMPTY_CONTEXT;
  }

  const [userProfile, painPoints, experiences, skillMatch] = await Promise.all([
    withTimeout(fetchUserProfile(agentId).catch(e => { log.warn('User profile fetch failed:', e); return ''; }), CONTEXT_FETCH_TIMEOUT),
    withTimeout(fetchPainPoints(agentId).catch(e => { log.warn('Pain points fetch failed:', e); return ''; }), CONTEXT_FETCH_TIMEOUT),
    withTimeout(fetchExperiences(agentId, lastUserMessage).catch(e => { log.warn('Experiences fetch failed:', e); return ''; }), CONTEXT_FETCH_TIMEOUT),
    withTimeout(fetchSkillMatch(lastUserMessage).catch(e => { log.warn('Skill match fetch failed:', e); return ''; }), CONTEXT_FETCH_TIMEOUT),
  ]);

  return { userProfile: userProfile ?? '', painPoints: painPoints ?? '', experiences: experiences ?? '', skillMatch: userProfile === undefined ? '' : skillMatch ?? '' };
}
|
||||
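`suggestion-context.ts` combines a per-source timeout with a per-source `catch`, so one slow or failing backend call cannot delay or break the others. The sketch below folds both into a single helper and also clears the timer once the promise settles. It is a variant for illustration, not the file's own `withTimeout`; `settleWithin` and `demo` are hypothetical names, and the delays are arbitrary.

```typescript
// Resolve with `fallback` if `promise` rejects or takes longer than `ms`.
function settleWithin<T>(promise: Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>(resolve => {
    const timer = setTimeout(() => resolve(fallback), ms);
    promise.then(
      value => { clearTimeout(timer); resolve(value); },
      () => { clearTimeout(timer); resolve(fallback); },
    );
  });
}

// Three sources fetched in parallel: one fast, one too slow, one failing.
async function demo(): Promise<string[]> {
  const fast = Promise.resolve('profile text');
  const slow = new Promise<string>(resolve => setTimeout(() => resolve('late'), 200));
  const failing = Promise.reject<string>(new Error('backend unavailable'));
  return Promise.all([
    settleWithin(fast, 50, ''),
    settleWithin(slow, 50, ''),
    settleWithin(failing, 50, ''),
  ]);
}
```

Because every branch resolves rather than rejects, `Promise.all` here can never short-circuit on a single failure, which is the property the "never throws" contract relies on.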
@@ -1,13 +1,13 @@
|
||||
/**
|
||||
* ArtifactStore — manages the artifact panel state.
|
||||
* ArtifactStore — manages the artifact panel state with IndexedDB persistence.
|
||||
*
|
||||
* Extracted from chatStore.ts as part of the structured refactor.
|
||||
* This store has zero external dependencies — the simplest slice to extract.
|
||||
*
|
||||
* @see docs/superpowers/specs/2026-04-02-chatstore-refactor-design.md §3.5
|
||||
* Uses zustand/middleware persist + idb-storage for persistence across refreshes.
|
||||
*/
|
||||
|
||||
import { create } from 'zustand';
|
||||
import { persist, createJSONStorage } from 'zustand/middleware';
|
||||
import { createIdbStorageAdapter } from '../../lib/idb-storage';
|
||||
import type { ArtifactFile } from '../../components/ai/ArtifactPanel';
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
@@ -33,22 +33,33 @@ export interface ArtifactState {
|
||||
// Store
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export const useArtifactStore = create<ArtifactState>()((set) => ({
|
||||
artifacts: [],
|
||||
selectedArtifactId: null,
|
||||
artifactPanelOpen: false,
|
||||
export const useArtifactStore = create<ArtifactState>()(
|
||||
persist(
|
||||
(set) => ({
|
||||
artifacts: [],
|
||||
selectedArtifactId: null,
|
||||
artifactPanelOpen: false,
|
||||
|
||||
addArtifact: (artifact: ArtifactFile) =>
|
||||
set((state) => ({
|
||||
artifacts: [...state.artifacts, artifact],
|
||||
selectedArtifactId: artifact.id,
|
||||
artifactPanelOpen: true,
|
||||
})),
|
||||
addArtifact: (artifact: ArtifactFile) =>
|
||||
set((state) => ({
|
||||
artifacts: [...state.artifacts, artifact],
|
||||
selectedArtifactId: artifact.id,
|
||||
artifactPanelOpen: true,
|
||||
})),
|
||||
|
||||
selectArtifact: (id: string | null) => set({ selectedArtifactId: id }),
|
||||
selectArtifact: (id: string | null) => set({ selectedArtifactId: id }),
|
||||
|
||||
setArtifactPanelOpen: (open: boolean) => set({ artifactPanelOpen: open }),
|
||||
setArtifactPanelOpen: (open: boolean) => set({ artifactPanelOpen: open }),
|
||||
|
||||
clearArtifacts: () =>
|
||||
set({ artifacts: [], selectedArtifactId: null, artifactPanelOpen: false }),
|
||||
}));
|
||||
clearArtifacts: () =>
|
||||
set({ artifacts: [], selectedArtifactId: null, artifactPanelOpen: false }),
|
||||
}),
|
||||
{
|
||||
name: 'zclaw-artifact-storage',
|
||||
storage: createJSONStorage(() => createIdbStorageAdapter()),
|
||||
partialize: (state) => ({
|
||||
artifacts: state.artifacts,
|
||||
}),
|
||||
},
|
||||
),
|
||||
);
|
||||
|
||||
@@ -34,11 +34,16 @@ import {
|
||||
} from './conversationStore';
|
||||
import { useMessageStore } from './messageStore';
|
||||
import { useArtifactStore } from './artifactStore';
|
||||
import { llmSuggest } from '../../lib/llm-service';
|
||||
import { llmSuggest, LLM_PROMPTS } from '../../lib/llm-service';
|
||||
import { detectNameSuggestion, detectAgentNameSuggestion } from '../../lib/cold-start-mapper';
|
||||
import { fetchSuggestionContext, type SuggestionContext } from '../../lib/suggestion-context';
|
||||
|
||||
const log = createLogger('StreamStore');
|
||||
|
||||
// Module-level prefetch for suggestion context — started during streaming,
|
||||
// consumed on stream completion. Saves ~0.5-1s vs fetching after stream ends.
|
||||
let _activeSuggestionContextPrefetch: Promise<SuggestionContext> | null = null;
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Error formatting — convert raw LLM/API errors to user-friendly messages
|
||||
// ---------------------------------------------------------------------------
|
||||
@@ -214,6 +219,67 @@ class DeltaBuffer {
|
||||
}
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
// Artifact creation from tool output (shared between sendMessage & agent stream)
// ---------------------------------------------------------------------------

const ARTIFACT_TYPE_MAP: Record<string, 'code' | 'markdown' | 'text' | 'table' | 'image'> = {
  ts: 'code', tsx: 'code', js: 'code', jsx: 'code',
  py: 'code', rs: 'code', go: 'code', java: 'code',
  md: 'markdown', txt: 'text', json: 'code',
  html: 'code', css: 'code', sql: 'code', sh: 'code',
  yaml: 'code', yml: 'code', toml: 'code', xml: 'code',
  csv: 'table', svg: 'image',
};

const ARTIFACT_LANG_MAP: Record<string, string> = {
  ts: 'typescript', tsx: 'typescript', js: 'javascript', jsx: 'javascript',
  py: 'python', rs: 'rust', go: 'go', java: 'java',
  html: 'html', css: 'css', sql: 'sql', sh: 'bash',
  json: 'json', yaml: 'yaml', yml: 'yaml', toml: 'toml',
  xml: 'xml', csv: 'csv', md: 'markdown', txt: 'text',
};

/** Attempt to create an artifact from a completed tool call. */
function tryCreateArtifactFromToolOutput(toolName: string, toolInput: string, toolOutput: string): void {
  if (!toolOutput) return;

  const toolsWithArtifacts = ['file_write', 'write_file', 'str_replace', 'str_replace_editor'];
  if (!toolsWithArtifacts.includes(toolName)) return;

  try {
    const parsed = JSON.parse(toolOutput);
    const filePath = parsed?.path || parsed?.file_path || '';
    let content = parsed?.content || '';

    // For str_replace tools, content may be in input
    if (!content && toolInput) {
      try {
        const inputParsed = JSON.parse(toolInput);
        content = inputParsed?.new_text || inputParsed?.content || '';
      } catch { /* ignore */ }
    }

    if (!filePath || !content) return;

    // Deduplicate: skip if an artifact with the same file name already exists
    // (only the base name is compared, not the full path)
    const existing = useArtifactStore.getState().artifacts;
    if (existing.some(a => a.name === filePath.split('/').pop())) return;

    const fileName = filePath.split('/').pop() || filePath;
    const ext = fileName.split('.').pop()?.toLowerCase() || '';

    useArtifactStore.getState().addArtifact({
      id: `artifact_${Date.now()}`,
      name: fileName,
      content: typeof content === 'string' ? content : JSON.stringify(content, null, 2),
      type: ARTIFACT_TYPE_MAP[ext] || 'text',
      language: ARTIFACT_LANG_MAP[ext],
      createdAt: new Date(),
    });
  } catch { /* non-critical: artifact creation from tool output */ }
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Stream event handlers (extracted from sendMessage)
|
||||
// ---------------------------------------------------------------------------
|
||||
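The artifact helper in this hunk derives a display name, an extension, and an artifact type from the path a tool reports. A minimal sketch of that derivation follows, using a trimmed-down copy of the extension table; `ArtifactKind`, `KIND_BY_EXT`, and `describeArtifact` are hypothetical names introduced for illustration only.

```typescript
type ArtifactKind = 'code' | 'markdown' | 'text' | 'table' | 'image';

// Trimmed-down extension table; unknown extensions fall back to 'text'.
const KIND_BY_EXT: Record<string, ArtifactKind> = {
  ts: 'code', py: 'code', json: 'code',
  md: 'markdown', txt: 'text', csv: 'table', svg: 'image',
};

function describeArtifact(filePath: string): { name: string; ext: string; kind: ArtifactKind } {
  // Base name is the last '/'-separated segment; fall back to the whole path.
  const name = filePath.split('/').pop() || filePath;
  // Extension is the last '.'-separated segment, lower-cased.
  // A name without a dot yields the whole name, which then misses the table.
  const ext = name.split('.').pop()?.toLowerCase() || '';
  return { name, ext, kind: KIND_BY_EXT[ext] || 'text' };
}

console.log(describeArtifact('reports/q3/summary.CSV'));
console.log(describeArtifact('src/a/index.ts').name === describeArtifact('src/b/index.ts').name);
```

The second call illustrates a consequence of comparing base names: two files named `index.ts` in different directories map to the same artifact name, so a name-based duplicate check would treat them as one artifact.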
@@ -236,38 +302,8 @@ function createToolHandler(assistantId: string, chat: ChatStoreAccess) {
|
||||
})
|
||||
);
|
||||
|
||||
// Auto-create artifact when file_write tool produces output
|
||||
if (tool === 'file_write') {
|
||||
try {
|
||||
const parsed = JSON.parse(output);
|
||||
const filePath = parsed?.path || parsed?.file_path || '';
|
||||
const content = parsed?.content || '';
|
||||
if (filePath && content) {
|
||||
const fileName = filePath.split('/').pop() || filePath;
|
||||
const ext = fileName.split('.').pop()?.toLowerCase() || '';
|
||||
const typeMap: Record<string, 'code' | 'markdown' | 'text'> = {
|
||||
ts: 'code', tsx: 'code', js: 'code', jsx: 'code',
|
||||
py: 'code', rs: 'code', go: 'code', java: 'code',
|
||||
md: 'markdown', txt: 'text', json: 'code',
|
||||
html: 'code', css: 'code', sql: 'code', sh: 'code',
|
||||
};
|
||||
const langMap: Record<string, string> = {
|
||||
ts: 'typescript', tsx: 'typescript', js: 'javascript', jsx: 'javascript',
|
||||
py: 'python', rs: 'rust', go: 'go', java: 'java',
|
||||
html: 'html', css: 'css', sql: 'sql', sh: 'bash', json: 'json',
|
||||
};
|
||||
useArtifactStore.getState().addArtifact({
|
||||
id: `artifact_${Date.now()}`,
|
||||
name: fileName,
|
||||
content: typeof content === 'string' ? content : JSON.stringify(content, null, 2),
|
||||
type: typeMap[ext] || 'text',
|
||||
language: langMap[ext],
|
||||
createdAt: new Date(),
|
||||
sourceStepId: assistantId,
|
||||
});
|
||||
}
|
||||
} catch { /* non-critical: artifact creation from tool output */ }
|
||||
}
|
||||
// Auto-create artifact from tool output
|
||||
tryCreateArtifactFromToolOutput(tool, input, output);
|
||||
} else {
|
||||
// toolStart: create new running step
|
||||
const step: ToolCallStep = {
|
||||
@@ -399,36 +435,50 @@ function createCompleteHandler(
|
||||
}
|
||||
}
|
||||
|
||||
// Async memory extraction (independent — failures don't block name detection)
|
||||
// Decoupled: suggestion generation runs immediately with prefetched context,
|
||||
// memory extraction + reflection run independently in background.
|
||||
const filtered = msgs
|
||||
.filter(m => m.role === 'user' || m.role === 'assistant')
|
||||
.map(m => ({ role: m.role, content: m.content }));
|
||||
const convId = useConversationStore.getState().currentConversationId;
|
||||
getMemoryExtractor().extractFromConversation(filtered, agentId, convId ?? undefined)
|
||||
.catch(err => log.warn('Memory extraction failed:', err));
|
||||
|
||||
intelligenceClient.reflection.recordConversation().catch(err => {
|
||||
log.warn('Recording conversation failed:', err);
|
||||
});
|
||||
intelligenceClient.reflection.shouldReflect().then(shouldReflect => {
|
||||
if (shouldReflect) {
|
||||
intelligenceClient.reflection.reflect(agentId, []).catch(err => {
|
||||
log.warn('Reflection failed:', err);
|
||||
});
|
||||
}
|
||||
});
|
||||
|
||||
// Follow-up suggestions (async LLM call with keyword fallback)
|
||||
// Build conversation messages for suggestions
|
||||
const latestMsgs = chat.getMessages() || [];
|
||||
const conversationMessages = latestMsgs
|
||||
.filter(m => m.role === 'user' || m.role === 'assistant')
|
||||
.filter(m => !m.streaming)
|
||||
.map(m => ({ role: m.role, content: m.content }));
|
||||
|
||||
generateLLMSuggestions(conversationMessages, set).catch(err => {
|
||||
log.warn('Suggestion generation error:', err);
|
||||
set({ suggestionsLoading: false });
|
||||
});
|
||||
// Consume prefetched context (started in sendMessage during streaming)
|
||||
const prefetchPromise = _activeSuggestionContextPrefetch;
|
||||
_activeSuggestionContextPrefetch = null;
|
||||
|
||||
// Fire suggestion generation immediately — don't wait for memory extraction
|
||||
const fireSuggestions = (ctx?: SuggestionContext) => {
|
||||
generateLLMSuggestions(conversationMessages, set, ctx).catch(err => {
|
||||
log.warn('Suggestion generation error:', err);
|
||||
set({ suggestionsLoading: false });
|
||||
});
|
||||
};
|
||||
if (prefetchPromise) {
|
||||
prefetchPromise.then(fireSuggestions).catch(() => fireSuggestions());
|
||||
} else {
|
||||
fireSuggestions();
|
||||
}
|
||||
|
||||
// Background tasks run independently — never block suggestions
|
||||
getMemoryExtractor().extractFromConversation(filtered, agentId, convId ?? undefined)
|
||||
.catch(err => log.warn('Memory extraction failed:', err));
|
||||
intelligenceClient.reflection.recordConversation()
|
||||
.catch(err => log.warn('Recording conversation failed:', err))
|
||||
.then(() => intelligenceClient.reflection.shouldReflect())
|
||||
.then(shouldReflect => {
|
||||
if (shouldReflect) {
|
||||
intelligenceClient.reflection.reflect(agentId, []).catch(err => {
|
||||
log.warn('Reflection failed:', err);
|
||||
});
|
||||
}
|
||||
}).catch(() => {});
|
||||
};
|
||||
}
|
||||
|
||||
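The completion handler in this hunk takes the module-level prefetch promise, clears the slot, and then fires suggestion generation with or without context. A minimal sketch of that consume-once pattern follows, under the assumption that the prefetched value is a plain string; `activePrefetch`, `startPrefetch`, and `consumePrefetch` are hypothetical names, not the store's own identifiers.

```typescript
// Module-level slot holding at most one in-flight prefetch.
let activePrefetch: Promise<string> | null = null;

// Kick off the load early (for example while a response is still streaming).
function startPrefetch(load: () => Promise<string>): void {
  activePrefetch = load();
}

// Take the pending prefetch, if any, and clear the slot so it is used once.
function consumePrefetch(): Promise<string | undefined> {
  const pending = activePrefetch;
  activePrefetch = null;
  if (!pending) return Promise.resolve(undefined);
  // A failed prefetch degrades to "no context" instead of rejecting.
  return pending.catch(() => undefined);
}

startPrefetch(() => Promise.resolve('pain points: scheduling conflicts'));
consumePrefetch().then(ctx => console.log('first consumer got:', ctx));
consumePrefetch().then(ctx => console.log('second consumer got:', ctx));
```

Clearing the slot before awaiting means a later stream completion cannot pick up context that was prefetched for an earlier message.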
@@ -559,15 +609,32 @@ function parseSuggestionResponse(raw: string): string[] {
|
||||
async function generateLLMSuggestions(
|
||||
messages: Array<{ role: string; content: string }>,
|
||||
set: (partial: Partial<StreamState>) => void,
|
||||
context?: SuggestionContext,
|
||||
): Promise<void> {
|
||||
set({ suggestionsLoading: true });
|
||||
|
||||
try {
|
||||
const recentMessages = messages.slice(-6);
|
||||
const context = recentMessages
|
||||
.map(m => `${m.role === 'user' ? '用户' : '助手'}: ${m.content}`)
|
||||
const recentMessages = messages.slice(-20);
|
||||
const conversationContext = recentMessages
|
||||
.map(m => `${m.role === 'user' ? '用户' : '助手'}: ${m.content.slice(0, 200)}`)
|
||||
.join('\n\n');
|
||||
|
||||
// Build dynamic user message with intelligence context
|
||||
const ctx = context ?? { userProfile: '', painPoints: '', experiences: '', skillMatch: '' };
|
||||
const hasContext = ctx.userProfile || ctx.painPoints || ctx.experiences || ctx.skillMatch;
|
||||
let userMessage: string;
|
||||
if (hasContext) {
|
||||
const sections: string[] = ['以下是用户的背景信息,请在生成建议时参考:\n'];
|
||||
if (ctx.userProfile) sections.push(`## 用户画像\n${ctx.userProfile}`);
|
||||
if (ctx.painPoints) sections.push(`## 活跃痛点\n${ctx.painPoints}`);
|
||||
if (ctx.experiences) sections.push(`## 相关经验\n${ctx.experiences}`);
|
||||
if (ctx.skillMatch) sections.push(`## 可用技能\n${ctx.skillMatch}`);
|
||||
sections.push(`\n最近对话:\n${conversationContext}`);
|
||||
userMessage = sections.join('\n\n');
|
||||
} else {
|
||||
userMessage = `以下是对话中最近的消息:\n\n${conversationContext}\n\n请生成 3 个后续问题。`;
|
||||
}
|
||||
|
||||
const connectionMode = typeof localStorage !== 'undefined'
|
||||
? localStorage.getItem('zclaw-connection-mode')
|
||||
: null;
|
||||
@@ -575,9 +642,9 @@ async function generateLLMSuggestions(
|
||||
let raw: string;
|
||||
|
||||
if (connectionMode === 'saas') {
|
||||
raw = await llmSuggestViaSaaS(context);
|
||||
raw = await llmSuggestViaSaaS(userMessage);
|
||||
} else {
|
||||
raw = await llmSuggest(context);
|
||||
raw = await llmSuggest(userMessage);
|
||||
}
|
||||
|
||||
const suggestions = parseSuggestionResponse(raw);
|
||||
@@ -601,7 +668,7 @@ async function generateLLMSuggestions(
|
||||
* with non-streaming requests. Collects the full response from SSE deltas,
|
||||
* then parses the suggestion JSON from the accumulated text.
|
||||
*/
|
||||
async function llmSuggestViaSaaS(context: string): Promise<string> {
|
||||
async function llmSuggestViaSaaS(userMessage: string): Promise<string> {
|
||||
const { saasClient } = await import('../../lib/saas-client');
|
||||
const { useConversationStore } = await import('./conversationStore');
|
||||
const { useSaaSStore } = await import('../saasStore');
|
||||
@@ -611,9 +678,6 @@ async function llmSuggestViaSaaS(context: string): Promise<string> {
|
||||
const model = currentModel || (availableModels.length > 0 ? availableModels[0]?.id : undefined);
|
||||
if (!model) throw new Error('No model available for suggestions');
|
||||
|
||||
// Delay to avoid concurrent relay requests with memory extraction
|
||||
await new Promise(r => setTimeout(r, 2000));
|
||||
|
||||
const controller = new AbortController();
|
||||
const timeoutId = setTimeout(() => controller.abort(), 60000);
|
||||
|
||||
@@ -623,7 +687,7 @@ async function llmSuggestViaSaaS(context: string): Promise<string> {
|
||||
model,
|
||||
messages: [
|
||||
{ role: 'system', content: LLM_PROMPTS_SYSTEM },
|
||||
{ role: 'user', content: `以下是对话中最近的消息:\n\n${context}\n\n请生成 3 个后续问题。` },
|
||||
{ role: 'user', content: userMessage },
|
||||
],
|
||||
max_tokens: 500,
|
||||
temperature: 0.7,
|
||||
@@ -664,17 +728,7 @@ async function llmSuggestViaSaaS(context: string): Promise<string> {
|
||||
}
|
||||
}
|
||||
|
||||
const LLM_PROMPTS_SYSTEM = `你是对话分析助手。根据最近的对话内容,生成 3 个用户可能想继续探讨的问题。
|
||||
|
||||
要求:
|
||||
- 每个问题必须与对话内容直接相关,具体且有针对性
|
||||
- 帮助用户深入理解、实际操作或拓展思路
|
||||
- 每个问题不超过 30 个中文字符
|
||||
- 不要重复对话中已讨论过的内容
|
||||
- 使用与用户相同的语言
|
||||
|
||||
只输出 JSON 数组,包含恰好 3 个字符串。不要输出任何其他内容。
|
||||
示例:["如何在生产环境中部署?", "这个方案的成本如何?", "有没有更简单的替代方案?"]`;
|
||||
const LLM_PROMPTS_SYSTEM = LLM_PROMPTS.suggestions.system;
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// ChatStore injection (avoids circular imports)
|
||||
@@ -786,6 +840,9 @@ export const useStreamStore = create<StreamState>()(
|
||||
});
|
||||
set({ isStreaming: true, activeRunId: null });
|
||||
|
||||
// Prefetch suggestion context during streaming — saves ~0.5-1s post-stream
|
||||
_activeSuggestionContextPrefetch = fetchSuggestionContext(agentId, content);
|
||||
|
||||
// Delta buffer — batches updates at ~60fps
|
||||
const buffer = new DeltaBuffer(assistantId, _chat);
|
||||
|
||||
@@ -1001,6 +1058,13 @@ export const useStreamStore = create<StreamState>()(
|
||||
return { ...m, toolSteps: steps };
|
||||
})
|
||||
);
|
||||
|
||||
// Auto-create artifact from tool output (agent stream path)
|
||||
tryCreateArtifactFromToolOutput(
|
||||
delta.tool || 'unknown',
|
||||
delta.toolInput || '',
|
||||
delta.toolOutput,
|
||||
);
|
||||
} else {
|
||||
// toolStart: create new running step
|
||||
const step: ToolCallStep = {
|
||||
@@ -1059,10 +1123,20 @@ export const useStreamStore = create<StreamState>()(
|
||||
.filter(m => !m.streaming)
|
||||
.map(m => ({ role: m.role, content: m.content }));
|
||||
|
||||
generateLLMSuggestions(conversationMessages, set).catch(err => {
|
||||
log.warn('Suggestion generation error:', err);
|
||||
set({ suggestionsLoading: false });
|
||||
});
|
||||
// Path B: use prefetched context for agent stream — fixes zero-personalization
|
||||
const prefetchPromise = _activeSuggestionContextPrefetch;
|
||||
_activeSuggestionContextPrefetch = null;
|
||||
const fireSuggestions = (ctx?: SuggestionContext) => {
|
||||
generateLLMSuggestions(conversationMessages, set, ctx).catch(err => {
|
||||
log.warn('Suggestion generation error:', err);
|
||||
set({ suggestionsLoading: false });
|
||||
});
|
||||
};
|
||||
if (prefetchPromise) {
|
||||
prefetchPromise.then(fireSuggestions).catch(() => fireSuggestions());
|
||||
} else {
|
||||
fireSuggestions();
|
||||
}
|
||||
}
|
||||
}
|
||||
} else if (delta.stream === 'hand') {
|
||||
|
||||
309
docs/references/artifact-system-reference.md
Normal file
@@ -0,0 +1,309 @@
|
||||
# Artifact System Reference

> A survey of how DeerFlow and Hermes Agent implement their artifact/output panels, as a reference for refactoring the ZCLAW artifact system.
> Analysis date: 2026-04-24
|
||||
|
||||
---
|
||||
|
||||
## 1. DeerFlow Artifact System

DeerFlow has a complete full-stack artifact pipeline and is the primary reference.
|
||||
|
||||
### 1.1 End-to-End Data Flow

```
Agent tool call (write_file / str_replace / present_files)
    ↓
Backend: ThreadState.artifacts (LangGraph annotated list, deduplicated by the merge_artifacts reducer)
    ↓ file written to: {base_dir}/threads/{thread_id}/user-data/outputs/
    ↓ virtual path: /mnt/user-data/outputs/filename.ext
    ↓
Backend API: GET /api/threads/{thread_id}/artifacts/{virtual_path}
    ↓ MIME detection / .skill ZIP extraction / download vs inline
    ↓
Frontend: thread.values.artifacts (string[]) → ArtifactsProvider context
    ↓
ChatBox (ResizablePanelGroup) → chat (60%) | artifact panel (40%)
    ↓
ArtifactFileDetail → CodeMirror (code) / Streamdown (Markdown) / iframe (HTML)
```
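
The deduplicating reducer named in the data flow can be pictured with a short sketch. This is a minimal illustration, not DeerFlow's actual code: the signature and the order-preserving behavior are assumptions, and only the name `merge_artifacts` and the idea of deduplication come from the text.

```python
from typing import Optional


def merge_artifacts(
    existing: Optional[list[str]], new: Optional[list[str]]
) -> list[str]:
    """Hypothetical reducer: append new artifact paths, dropping duplicates.

    Order of first appearance is preserved (an assumption, not confirmed
    by the source).
    """
    merged: list[str] = []
    seen: set[str] = set()
    for path in (existing or []) + (new or []):
        if path not in seen:
            seen.add(path)
            merged.append(path)
    return merged
```

A reducer of this shape lets several tools report the same output file without the artifact list growing duplicate entries.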
|
||||
|
||||
### 1.2 Key Files

#### Frontend core

| File | Responsibility |
|------|----------------|
| `frontend/src/core/artifacts/utils.ts` | URL building, artifact list extraction, path parsing |
| `frontend/src/core/artifacts/loader.ts` | Fetches artifact text from the backend API; extracts content directly from tool call args |
| `frontend/src/core/artifacts/hooks.ts` | TanStack React Query hook with a 5-minute cache |
| `frontend/src/components/workspace/artifacts/context.tsx` | ArtifactsProvider + useArtifacts(): manages the list, selection, open/close state, and auto-select |
| `frontend/src/components/workspace/artifacts/artifact-file-detail.tsx` | Artifact detail view: header (file selector + code/preview toggle) + CodeEditor/Preview |
| `frontend/src/components/workspace/artifacts/artifact-file-list.tsx` | Card-style list view; each card has an icon, name, extension, and download/install buttons |
| `frontend/src/components/workspace/artifacts/artifact-trigger.tsx` | Header trigger button, shown only when artifacts exist |

#### Frontend rendering

| File | Responsibility |
|------|----------------|
| `frontend/src/components/workspace/code-editor.tsx` | Read-only CodeMirror editor with syntax highlighting for CSS/HTML/JS/JSON/MD/Python |
| `frontend/src/components/ai-elements/code-block.tsx` | Shiki syntax-highlighted code block, dual theme (light/dark) |
| `frontend/src/components/ai-elements/web-preview.tsx` | iframe web preview with address bar and navigation buttons |
| `frontend/src/components/workspace/messages/markdown-content.tsx` | Streamdown Markdown rendering (GFM + Math + Raw HTML + KaTeX) |
| `frontend/src/core/utils/files.tsx` | 140+ extension-to-language mappings; file icon and type detection |

#### Backend

| File | Responsibility |
|------|----------------|
| `backend/.../thread_state.py` | ThreadState.artifacts list + merge_artifacts deduplicating reducer |
| `backend/.../present_file_tool.py` | present_files tool: normalizes paths and returns Command(update) |
| `backend/.../paths.py` | Path management: threads/{id}/user-data/{workspace,uploads,outputs} |
| `backend/app/gateway/routers/artifacts.py` | FastAPI route: GET artifact file, MIME detection, security handling |
|
||||
|
||||
### 1.3 Supported Content Types

| Type | Rendering |
|------|-----------|
| Code files (140+ extensions) | Read-only CodeMirror + syntax highlighting |
| Markdown (.md) | Streamdown (GFM + Math + KaTeX + Raw HTML) |
| HTML (.html/.htm) | Sandboxed `<iframe>` (srcDoc) |
| Images (.png/.jpg/.svg/.webp) | `<img>` tag; non-code files use an iframe |
| .skill archives | ZIP extraction, SKILL.md rendered as Markdown |
| Binary files (PDF etc.) | Backend inline Content-Disposition |
| Text files (.txt/.csv/.log) | CodeMirror plain-text mode |
|
||||
|
||||
### 1.4 Persistence Architecture

**Disk storage:**
```
{DEER_FLOW_HOME}/threads/{thread_id}/user-data/outputs/
```

**State persistence:** the artifacts list is part of the LangGraph ThreadState and is persisted automatically by the checkpoint system.

**Frontend cache:** TanStack React Query with a 5-minute stale time.
|
||||
|
||||
### 1.5 UI/UX Design Patterns

#### Split layout (chat-box.tsx)
- Horizontal split via `react-resizable-panels`
- Closed state: chat=100%, artifacts=0%
- Open state: chat=60%, artifacts=40%
- 300ms CSS transition

#### Auto-open + auto-select
- When a `write_file` / `str_replace` tool call is detected, the panel opens automatically and the file is selected
- The `autoOpen` / `autoSelect` flags prevent the panel from reopening after the user closes it manually

#### Code/preview toggle
- HTML/Markdown default to Preview; everything else defaults to Code
- Preview uses Streamdown (MD) or an iframe (HTML)

#### Header action bar
- File selector dropdown (switch files without returning to the list)
- Copy / download / open in new window / close

#### In-chat embedding
- `present_files` tool call → a card grid rendered inside the chat stream
- Clicking a card → opens that file in the side panel

#### Dual-path scheme
1. **Real file path**: fetched from the backend API and cached by React Query
2. **`write-file:` virtual path**: content is extracted directly from the tool call args, with no backend request, and supports streaming display
|
||||
|
||||
### 1.6 Provider Hierarchy

```
ArtifactsProvider → provides the useArtifacts() context
  ChatBox → ResizablePanelGroup
    Panel(chat) → MessageList → ToolCall auto-opens the artifact panel
    Panel(artifacts) → ArtifactFileDetail → useArtifactContent() hook
```
|
||||
|
||||
---
|
||||
|
||||
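## 2. Hermes Agent Artifact Mechanism

> **Conclusion: Hermes Agent has no artifact panel, no web frontend, and no split layout.** It is a terminal CLI tool, and all output is rendered inline in the terminal. Its large-output handling is nonetheless worth borrowing.

### 2.1 Project Positioning

Hermes Agent is a **Python CLI/TUI agent** (similar to Claude Code). It runs through a prompt_toolkit TUI and also supports gateways to IM platforms such as Telegram, Discord, Slack, and WhatsApp.

**There is no React/Next.js/web UI.** It exposes an OpenAI-compatible API so that third-party UIs such as Open WebUI or LobeChat can connect.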
|
||||
|
||||
### 2.2 Large-Output Handling (3 Layers of Defense)

This is the only mechanism that comes close to "artifact management", and it is worth borrowing.

**File: `tools/tool_result_storage.py`**

| Layer | Mechanism | Description |
|-------|-----------|-------------|
| Layer 1 | Per-tool truncation | Each tool limits its own output length |
| Layer 2 | `maybe_persist_tool_result` | A single result above the threshold is written to a sandbox temp file and replaced in context by a `<persisted-output>` preview block |
| Layer 3 | `enforce_turn_budget` | When the whole turn exceeds 200K characters, the largest results are spilled to disk |

Core logic:
```python
# Above the threshold: write the full content to a file and replace it in context with a preview
remote_path = f"{storage_dir}/{tool_use_id}.txt"
_write_to_sandbox(content, remote_path, env)
return _build_persisted_message(preview, has_more, len(content), remote_path)
# Later, the agent can read the full content with read_file + offset/limit
```
|
||||
|
||||
### 2.3 Budget Configuration

**File: `tools/budget_config.py`**

| Parameter | Default |
|-----------|---------|
| `DEFAULT_RESULT_SIZE_CHARS` | 100,000 (per-tool threshold) |
| `DEFAULT_TURN_BUDGET_CHARS` | 200,000 (per-turn cap) |
| `DEFAULT_PREVIEW_SIZE_CHARS` | 1,500 (inline preview length) |
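
How layers 2 and 3 interact with these budgets can be sketched as follows. This is an illustration under stated assumptions rather than Hermes' implementation: the `ToolResult` class, `persist`, and `apply_budgets` are hypothetical names, an in-memory dict stands in for the sandbox filesystem, and only the three numeric defaults come from the table.

```python
from dataclasses import dataclass
from typing import Optional

RESULT_SIZE_CHARS = 100_000  # per-tool threshold (from the table)
TURN_BUDGET_CHARS = 200_000  # per-turn cap (from the table)
PREVIEW_SIZE_CHARS = 1_500   # inline preview length (from the table)


@dataclass
class ToolResult:
    tool_use_id: str
    content: str
    persisted_path: Optional[str] = None


def persist(result: ToolResult, store: dict[str, str]) -> ToolResult:
    """Move the full content into `store`; keep only a preview inline."""
    path = f"{result.tool_use_id}.txt"
    store[path] = result.content
    preview = result.content[:PREVIEW_SIZE_CHARS]
    return ToolResult(result.tool_use_id, preview, persisted_path=path)


def apply_budgets(
    results: list[ToolResult], store: dict[str, str]
) -> list[ToolResult]:
    # Layer 2: spill any single result above the per-tool threshold.
    out = [
        persist(r, store) if len(r.content) > RESULT_SIZE_CHARS else r
        for r in results
    ]
    # Layer 3: while the turn is over budget, spill the largest inline result.
    while sum(len(r.content) for r in out) > TURN_BUDGET_CHARS:
        inline = [i for i, r in enumerate(out) if r.persisted_path is None]
        if not inline:
            break  # everything is already persisted; nothing left to shrink
        largest = max(inline, key=lambda i: len(out[i].content))
        out[largest] = persist(out[largest], store)
    return out
```

Spilling the largest result first keeps as many small results inline as possible, which matches the "largest few overflow to disk" behavior described for layer 3.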
|
||||
|
||||
### 2.4 CLI Rendering

**File: `agent/display.py`**

- **Tool progress**: KawaiiSpinner animation + a one-line summary
- **File edits**: inline colored unified diff (write_file / patch tools)
- **Final response**: wrapped in a Rich Panel border, with switchable theme colors (7 skins)

### 2.5 Session Persistence

**File: `hermes_state.py`**

SQLite (`~/.hermes/state.db`) + FTS5 full-text search:
- sessions table: metadata, model config, token counts, cost, title
- messages table: role, content, tool_call_id, reasoning, timestamp

### 2.6 Ideas Worth Borrowing

| Idea | Value |
|------|-------|
| Spilling large output to disk + inline preview | Addresses context window overflow |
| 3 layers of progressive defense | Relevant to the ZCLAW middleware chain |
| Configurable budgets | Tunable thresholds, different strategies per scenario |
|
||||
|
||||
---
|
||||
|
||||
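## 3. Comparison: ZCLAW Today vs. the Reference Designs

### 3.1 Current Gaps

| Dimension | DeerFlow | ZCLAW today | Gap |
|-----------|----------|-------------|-----|
| Data source | 3 tools (present_files/write_file/str_replace) register artifacts actively | Only streamStore parsing filePath from tool output | Very narrow; almost never triggers |
| Persistence | Disk files + LangGraph checkpoint | In-memory Zustand only | Lost on refresh |
| Rendering: code | Read-only CodeMirror + syntax highlighting (140+ languages) | Plain `<pre>` tag, no highlighting | No highlighting |
| Rendering: Markdown | Streamdown (GFM+Math+KaTeX+RawHTML) | Hand-written 30-line regex renderer | Headings/bold/lists only |
| Rendering: HTML | Sandboxed iframe | Not supported | Missing |
| Rendering: images | `<img>` + iframe | Type declared but not implemented | Missing |
| Rendering: tables | GFM tables | Plain-text `<pre>` | Missing |
| Panel layout | react-resizable-panels 60/40 | react-resizable-panels 65/35 | Exists, reusable |
| Auto-open | Triggered by write_file/str_replace | Opens on addArtifact | Exists |
| File selection | Dropdown without leaving the detail view | Must return to the list to choose | Poor UX |
| In-chat embedding | present_files → card grid | None | Missing |
| Caching | React Query 5 min | None | Missing |
| Dual path | Real path + write-file: virtual path | Runtime memory only | Missing |
| Right-panel overlap | Single panel | ArtifactPanel and the RightPanel "Files" tab have overlapping responsibilities | Architectural issue |

### 3.2 Summary of Core Gaps

**In priority order:**

1. **P0 Broken data source**: artifacts have almost no source, which is the most fundamental problem
2. **P0 No persistence**: artifacts are lost on refresh
3. **P1 Incomplete Markdown rendering**: a 30-line regex vs. a full GFM renderer
4. **P1 No syntax highlighting for code**: plain `<pre>` vs. CodeMirror/Shiki
5. **P2 Overlapping responsibilities across two panels**: ArtifactPanel vs. the RightPanel "Files" tab
6. **P2 No file switching inside the detail view**: the user must return to the list to switch files
7. **P3 No in-chat artifact cards**
8. **P3 No HTML/image/table rendering**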
|
||||
|
||||
### 3.3 Recommended Options

#### Option A: Minimum viable (fill gaps on the existing architecture)

Patch the existing ArtifactPanel + artifactStore:

- **Data source**: extend tool output parsing in streamStore to cover more tool types
- **Persistence**: add IndexedDB writes to artifactStore (reusing the messageStore pattern)
- **Markdown**: replace the hand-written renderer with `react-markdown` + `remark-gfm`
- **Code highlighting**: adopt `shiki` or `highlight.js`
- **Merge panels**: fold the RightPanel "Files" tab into ArtifactPanel and delete the RightPanel files tab

**Effort**: ~2-3 days

#### Option B: Refactor along DeerFlow's lines (recommended)

Borrow DeerFlow's architecture, adapted to ZCLAW's local Tauri architecture:

| DeerFlow component | ZCLAW adaptation |
|--------------------|------------------|
| FastAPI artifact routes | Tauri commands `artifact_list` / `artifact_read` / `artifact_serve` |
| On-disk outputs/ directory | `{workspace}/artifacts/{session_key}/` |
| LangGraph checkpoint | SQLite (existing zclaw-memory) |
| React Query cache | TanStack Query or Zustand + stale cache |
| Read-only CodeMirror | Adopt @uiw/react-codemirror |
| Streamdown MD | react-markdown + remark-gfm + rehype-katex |
| iframe HTML preview | Tauri webview window (security isolation) |

**Core change list:**

1. **Rust side** (zclaw-kernel):
   - Add `artifact_create` / `artifact_list` / `artifact_read` Tauri commands
   - Write artifacts to `{workspace}/artifacts/{session_key}/`
   - Trigger artifact registration from the ToolEnd event in the middleware chain

2. **Frontend store**:
   - Add IndexedDB persistence to artifactStore
   - Decouple artifact creation from streamStore into a standalone hook

3. **Frontend components**:
   - Replace MarkdownPreview with react-markdown + GFM
   - Adopt CodeMirror/shiki for code highlighting
   - Add a file dropdown switcher to the detail view
   - Merge or remove the RightPanel "Files" tab

**Effort**: ~5-7 days

#### Option C: Borrow Hermes' defenses (add-on)

Whichever of A or B is chosen, Hermes' large-output defenses can be layered on top:

- Add overflow detection to the ToolOutputGuard layer of the middleware chain
- Automatically persist over-threshold artifacts to disk and replace them in context with a `<persisted-output>` preview
- Let the agent read the full content back via read_file
|
||||
|
||||
---
|
||||
|
||||
## 4. Key Dependency Reference

| Library | Purpose | Used by DeerFlow | Recommended |
|---------|---------|------------------|-------------|
| react-markdown | Markdown rendering | ✅ (Streamdown) | ✅ |
| remark-gfm | GFM tables/strikethrough/task lists | ✅ | ✅ |
| rehype-katex | Math formula rendering | ✅ | As needed |
| @uiw/react-codemirror | Code editor/highlighting | ✅ | ✅ |
| shiki | Static code highlighting | ✅ (code blocks in chat) | ✅ |
| react-resizable-panels | Split layout | ✅ | Already present |
| @tanstack/react-query | Data caching | ✅ | Optional |

---

## 5. File Index

| Reference project | Key path |
|-------------------|----------|
| DeerFlow frontend | `G:/deerflow/frontend/src/components/workspace/artifacts/` |
| DeerFlow frontend utilities | `G:/deerflow/frontend/src/core/artifacts/` |
| DeerFlow layout | `G:/deerflow/frontend/src/components/workspace/chats/chat-box.tsx` |
| DeerFlow code editor | `G:/deerflow/frontend/src/components/workspace/code-editor.tsx` |
| DeerFlow backend routes | `G:/deerflow/backend/app/gateway/routers/artifacts.py` |
| DeerFlow backend tools | `G:/deerflow/backend/packages/harness/deerflow/tools/builtins/present_file_tool.py` |
| Hermes output management | `G:/hermes-agent-main/tools/tool_result_storage.py` |
| Hermes budget config | `G:/hermes-agent-main/tools/budget_config.py` |
|
||||
212
docs/references/deerflow-toolcall-reference.md
Normal file
@@ -0,0 +1,212 @@
|
||||
# DeerFlow Tool Call System Reference

> A survey of DeerFlow's complete tool call flow, as a reference for troubleshooting ZCLAW tool call problems.
> Analysis date: 2026-04-24
|
||||
|
||||
---
|
||||
|
||||
## 1. End-to-End Data Flow

```
User message
  → FastAPI Gateway (/api/threads/{id}/runs/stream)
  → services.start_run() → asyncio.create_task(run_agent(...))
  → LangGraph Agent Graph (create_agent)
  → LLM Model (ChatOpenAI / Claude)
  → AIMessage (with a tool_calls list)
  → 14-layer middleware chain
  → ToolNode (LangGraph built-in, routes by tool_call.name)
  → ToolMessage (execution result)
  → LLM called again (continuing with the ToolMessage)
  → StreamBridge.publish() → asyncio.Queue
  → SSE → frontend useStream hook
  → React component rendering
```
|
||||
|
||||
## 2. Tool Registration and Execution

### 2.1 Registration Entry Point

**File**: `G:/deerflow/backend/packages/harness/deerflow/tools/tools.py` (`get_available_tools()`)

Tools come from four sources:

| Source | How loaded | Example |
|--------|------------|---------|
| Config tools | YAML config + reflective import (`module:variable`) | `deerflow.sandbox.tools:bash_tool` |
| Builtin tools | Hard-coded imports | `present_file_tool`, `ask_clarification_tool` |
| MCP tools | `MultiServerMCPClient` fetches them from the MCP server cache | Third-party MCP tools |
| ACP tools | Built dynamically by `build_invoke_acp_agent_tool()` | External agent invocation |

### 2.2 Sandbox Tools

**File**: `G:/deerflow/backend/packages/harness/deerflow/sandbox/tools.py`

| Tool | Function |
|------|----------|
| `bash` | Run commands in the sandbox |
| `ls` | List a directory |
| `read_file` | Read a file |
| `write_file` | Write a file (auto-opens the artifact panel) |
| `str_replace` | String replacement (auto-opens the artifact panel) |

### 2.3 Builtin Tools

**File**: `G:/deerflow/backend/packages/harness/deerflow/tools/builtins/`

| Tool | Function |
|------|----------|
| `ask_clarification` | Ask the user a clarifying question (interrupts execution and waits for a reply) |
| `present_file` | Present files to the user (triggers artifact cards) |
| `setup_agent` | Create a custom agent |
| `task_tool` | Delegate a task to a sub-agent |
| `view_image` | View an image (vision models only) |
| `tool_search` | Deferred tool search (MCP tools exposed on demand) |
|
||||
|
||||
## 3. Middleware Chain (14 Layers)

**File**: `G:/deerflow/backend/packages/harness/deerflow/agents/lead_agent/agent.py` (`_build_middlewares()`)

Key middlewares related to tool calls:

### 3.1 DanglingToolCallMiddleware

**File**: `dangling_tool_call_middleware.py`

In `wrap_model_call`, it detects AIMessages in the history whose tool calls lack a ToolMessage and automatically injects a placeholder ToolMessage:
```python
ToolMessage(
    content="[Tool call was interrupted and did not return a result.]",
    tool_call_id=tc_id,
    name=tc.get("name", "unknown"),
    status="error",
)
```
|
||||
|
||||
### 3.2 ToolErrorHandlingMiddleware

**File**: `tool_error_handling_middleware.py`

In `wrap_tool_call`, it catches tool execution exceptions and converts them into an error ToolMessage instead of letting the whole run crash.

### 3.3 LoopDetectionMiddleware

**File**: `loop_detection_middleware.py`

In `after_model`, it detects repeated tool calls:
- At a threshold of 3 repetitions → injects a warning HumanMessage
- At a threshold of 5 repetitions → clears tool_calls outright, forcing the LLM to produce a text answer
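
The two thresholds can be illustrated with a small counter. This is a sketch, not DeerFlow's middleware: the `LoopDetector` class is hypothetical, and it assumes that "repeated" means the same tool name and arguments seen again anywhere in the run; the real middleware may define repetition differently.

```python
import json
from collections import Counter

WARN_THRESHOLD = 3  # from the text: inject a warning message
HARD_LIMIT = 5      # from the text: clear tool_calls and force a text answer


class LoopDetector:
    """Hypothetical helper that counts identical tool-call batches."""

    def __init__(self) -> None:
        self._counts = Counter()

    def check(self, tool_calls: list[dict]) -> str:
        """Return "ok", "warn", or "stop" for this batch of tool calls."""
        if not tool_calls:
            return "ok"
        # Identical names and arguments serialize to the same key.
        key = json.dumps(
            [(c["name"], c.get("args", {})) for c in tool_calls],
            sort_keys=True,
        )
        self._counts[key] += 1
        seen = self._counts[key]
        if seen >= HARD_LIMIT:
            return "stop"
        if seen >= WARN_THRESHOLD:
            return "warn"
        return "ok"
```

A caller would inject the warning message on `"warn"` and drop the pending tool calls on `"stop"`.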
|
||||
|
||||
### 3.4 DeferredToolFilterMiddleware

**File**: `deferred_tool_filter_middleware.py`

In `wrap_model_call`, it filters out the schemas of deferred MCP tools and exposes them only after the LLM has discovered them via `tool_search`.

### 3.5 ClarificationMiddleware

Intercepts `ask_clarification` tool calls, interrupting execution to wait for the user's reply.

### 3.6 SubagentLimitMiddleware

Truncates excessive parallel sub-agent calls.
|
||||
|
||||
## 4. Returning Tool Results

### 4.1 Format

LangChain's `ToolMessage`, which contains:
- `content`: the execution result text
- `tool_call_id`: matches the tool_call ID in the AIMessage
- `name`: the tool name
- `status`: `"error"` or omitted

### 4.2 Special Tools

`present_file_tool` returns a `Command` rather than a plain string, updating both the `artifacts` and `messages` state channels at once.
|
||||
|
||||
## 5. Frontend Tool Call Display

### 5.1 Message Grouping

**File**: `G:/deerflow/frontend/src/core/messages/utils.ts` (`groupMessages()`)

| Group type | Trigger | Display |
|------------|---------|---------|
| `assistant:processing` | AI message contains tool_calls or reasoning | MessageGroup (collapsed) |
| `assistant` | AI message has text and no tool_calls | MessageListItem (bubble) |
| `assistant:present-files` | Contains a present_files tool call | ArtifactFileList |
| `assistant:clarification` | ask_clarification result | MarkdownContent |
| `assistant:subagent` | Contains a task tool call | SubtaskCard |

### 5.2 Tool Status Inference

The frontend has **no explicit state machine**. Status is inferred from the message sequence:
- An AI message has tool_calls but no matching ToolMessage → still executing
- A ToolMessage appears → execution finished
- The `assistant:processing` group is wrapped in the `ChainOfThought` collapsible component
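
The inference rule can be written down in a few lines. The sketch below is in Python for brevity (the real frontend is TypeScript) and is only an illustration: the message dictionaries with `type`, `tool_calls`, and `tool_call_id` keys are an assumed shape modeled on the ToolMessage fields described earlier, and `pending_tool_calls` is a hypothetical helper.

```python
def pending_tool_calls(messages: list[dict]) -> list[str]:
    """Return ids of tool calls that have no matching tool message yet."""
    answered = {
        m["tool_call_id"] for m in messages if m.get("type") == "tool"
    }
    pending: list[str] = []
    for m in messages:
        if m.get("type") != "ai":
            continue
        for call in m.get("tool_calls", []):
            if call["id"] not in answered:
                pending.append(call["id"])  # still executing
    return pending
```

A tool call is rendered as "running" while its id is in the returned list and as "done" once the matching tool message arrives.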
|
||||
|
||||
### 5.3 Tool Call UI

**File**: `message-group.tsx`, lines 186-423

Different icons and content are rendered per tool name:
- `bash` → terminal icon + command code block
- `read_file`/`write_file`/`str_replace` → file icon + path link (clicking opens the artifact panel)
- `web_search` → search icon + result links
- Default → wrench icon + tool name
|
||||
|
||||
## 6. Tool Calls During Streaming

### 6.1 Architecture

```
agent.astream(stream_mode=["values"])
  → StreamBridge (asyncio.Queue per run, maxsize=256)
  → sse_consumer() → SSE frames → frontend
```

### 6.2 Key Characteristics

- Tool calls **do not interrupt** the stream. LangGraph routes between agent_node and tool_node automatically
- Every state change emits a complete `values` snapshot; the frontend deduplicates via `seen_ids`
- A heartbeat every 15 seconds keeps the SSE connection alive

### 6.3 Event Sequence Seen by the Frontend

1. `values` event: an AIMessage containing `tool_calls`
2. `values` event: a ToolMessage (the tool result)
3. `values` event: the LLM's final answer based on the tool result

The whole process is continuous and never breaks the SSE connection.
|
||||
|
||||
## 7. Comparison with ZCLAW (Tool Calls)

| Dimension | DeerFlow | ZCLAW |
|-----------|----------|-------|
| Framework | LangGraph (graph-based) | In-house loop_runner (loop) |
| Tool lifecycle | Managed automatically by LangGraph ToolNode | Manual ToolRegistry + loop_runner |
| after_tool_call middleware | ✅ wrap_tool_call hook fully in place | ❌ Not called in either streaming or non-streaming mode |
| Parallel tool execution | Handled automatically by LangGraph | Non-streaming uses JoinSet; streaming is fully serial |
| Dangling repair | DanglingToolCallMiddleware | DanglingToolMiddleware (present) |
| Error recovery | ToolErrorHandlingMiddleware (exception → ToolMessage) | ToolErrorMiddleware (counter) |
| Loop detection | LoopDetectionMiddleware (warn at 3 / hard stop at 5) | LoopGuardMiddleware (present) |
| Frontend state | Inferred from the message sequence | Explicit ToolCallStep state machine |
| MCP tools | Deferred registration + on-demand exposure via tool_search | Registered in full |
|
||||
|
||||
## 8. Key File Index

| Function | DeerFlow file |
|----------|---------------|
| Agent factory | `backend/packages/harness/deerflow/agents/lead_agent/agent.py` |
| Middleware assembly | `backend/packages/harness/deerflow/agents/factory.py` |
| Tool registration | `backend/packages/harness/deerflow/tools/tools.py` |
| Sandbox tools | `backend/packages/harness/deerflow/sandbox/tools.py` |
| Builtin tools | `backend/packages/harness/deerflow/tools/builtins/` |
| Error handling middleware | `agents/middlewares/tool_error_handling_middleware.py` |
| Dangling repair middleware | `agents/middlewares/dangling_tool_call_middleware.py` |
| Loop detection middleware | `agents/middlewares/loop_detection_middleware.py` |
| Deferred filter middleware | `agents/middlewares/deferred_tool_filter_middleware.py` |
| Streaming bridge | `runtime/stream_bridge/memory.py` |
| Frontend message grouping | `frontend/src/core/messages/utils.ts` |
| Frontend tool call component | `frontend/src/components/workspace/messages/message-group.tsx` |
|
||||
141
docs/references/zclaw-toolcall-issues.md
Normal file
@@ -0,0 +1,141 @@
|
||||
# ZCLAW Tool Call Issue Analysis

> Investigating ZCLAW tool call problems by comparison with the DeerFlow tool call system.
> Analysis date: 2026-04-24
> Updated: 2026-04-24 (P0 and P0-stream_errored fixed)
|
||||
|
||||
---
|
||||
|
||||
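## 1. Issues Found

### P0: `after_tool_call` middleware is never called (✅ fixed 2026-04-24)

**File**: `crates/zclaw-runtime/src/loop_runner.rs`

In `run()` (non-streaming, lines 400-558) and `run_streaming` (streaming, lines 893-1070), the tool result is pushed straight into the message history as `Message::tool_result` after execution, **without calling `middleware_chain.run_after_tool_call()`**.

**Impact**:
- The error counting and recovery-message logic in `ToolErrorMiddleware.after_tool_call` has no effect
- The sensitive-data detection in `ToolOutputGuardMiddleware.after_tool_call` has no effect
- Tool errors can only propagate through the tool's own error return; the middleware-layer protection exists in name only

**DeerFlow comparison**: `ToolErrorHandlingMiddleware` fully wraps every tool execution through the `wrap_tool_call` hook.

### P0: `stream_errored` skips all tool execution (✅ fixed 2026-04-24)

**File**: `crates/zclaw-runtime/src/loop_runner.rs`, lines 872-876

In streaming mode, any error in the LLM stream (network timeout, API error, driver error) sets `stream_errored = true` and then exits the loop directly with `break 'outer`, **skipping every tool call that has already been parsed**.

**Impact**:
- The ToolStart event has already been sent to the frontend (the user sees an "executing" button), but the tool never actually runs
- The ToolEnd event is never sent → the frontend tool status is stuck at "executing"
- Tool calls that were received in full (ToolUseEnd) are discarded as well

**Fix**: distinguish complete tools (ToolUseEnd received) from incomplete tools (only ToolUseStart/Delta received). Complete tools run as usual; incomplete tools get a cancellation ToolEnd event.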
|
||||
|
||||
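### P1: Tools run fully serially in streaming mode (✅ fixed 2026-04-24)

**File**: `loop_runner.rs`, streaming-mode tool execution section

Non-streaming mode runs ReadOnly tools in parallel with `JoinSet` + `Semaphore(3)`, but streaming mode runs all tools serially in a simple `for` loop.

**Fix**: streaming mode now executes in three phases: Phase 1 middleware pre-check (serial) → Phase 2 partitioned parallel + serial execution → Phase 3 after_tool_call + results pushed in sorted order.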
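
The core of Phase 2 and the result ordering in Phase 3 can be sketched with a bounded-parallel runner. This is a language-neutral illustration in Python, not the Rust `loop_runner` code: `run_tools`, `execute`, and `is_read_only` are hypothetical names, the middleware pre-check and `after_tool_call` steps are omitted, and only the limit of 3 concurrent read-only tools comes from the text.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def run_tools(
    calls: list[Any],
    execute: Callable[[Any], Awaitable[Any]],
    is_read_only: Callable[[Any], bool],
    max_parallel: int = 3,
) -> list[Any]:
    """Run read-only calls in parallel, the rest serially; keep input order."""
    results: list[Any] = [None] * len(calls)
    limit = asyncio.Semaphore(max_parallel)

    async def run_one(index: int) -> None:
        async with limit:  # at most max_parallel tools in flight
            results[index] = await execute(calls[index])

    parallel = [i for i, c in enumerate(calls) if is_read_only(c)]
    serial = [i for i, c in enumerate(calls) if not is_read_only(c)]

    # Parallel partition first, then side-effecting tools one at a time.
    await asyncio.gather(*(run_one(i) for i in parallel))
    for i in serial:
        results[i] = await execute(calls[i])

    # Results sit at their original indices, so they are pushed in call order.
    return results
```

Writing each result into its original slot is what keeps the pushed tool results aligned with the order of the LLM's tool calls, regardless of which tool finishes first.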
|
||||
|
||||
### P2: OpenAI driver silently replaces tool arguments (✅ fixed 2026-04-24)

**File**: `crates/zclaw-runtime/src/driver/openai.rs`, lines 222-228

```rust
let parsed_args = if args.is_empty() {
    serde_json::json!({})
} else {
    serde_json::from_str(args).unwrap_or_else(|e| {
        tracing::warn!("Failed to parse tool args '{}': {}", args, e);
        serde_json::json!({})
    })
};
```

When JSON parsing fails, the arguments are silently replaced with `{}`. Combined with the empty-argument handling in loop_runner.rs (lines 412-423), this injects `_fallback_query` in place of the real arguments.

**Fix**: on a parse failure, return `_parse_error` + `_raw_args` fields so that the tool and the LLM can notice the argument problem and correct it themselves.
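
The shape of the fix can be sketched in a few lines. This is a Python illustration rather than the Rust driver code: `parse_tool_args` is a hypothetical helper, and rejecting non-object JSON is an extra assumption not stated in the text. Only the `_parse_error` and `_raw_args` field names come from the fix description.

```python
import json


def parse_tool_args(raw: str) -> dict:
    """Parse tool arguments, surfacing failures instead of hiding them."""
    if not raw:
        return {}
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as err:
        # Keep the raw text so the LLM can see what it sent and retry.
        return {"_parse_error": str(err), "_raw_args": raw}
    if not isinstance(parsed, dict):
        # Assumption: tool arguments are expected to be a JSON object.
        return {
            "_parse_error": "tool arguments must be a JSON object",
            "_raw_args": raw,
        }
    return parsed
```

Returning the raw text alongside the error gives the model enough information to repair its own call on the next turn.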
|
||||
|
||||
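### P2: ToolOutputGuard is too aggressive (✅ fixed 2026-04-24)

**File**: `crates/zclaw-runtime/src/middleware/tool_output_guard.rs`, line 109

Sensitive patterns are matched with `to_lowercase()`, so legitimate content containing strings such as "password" or "system:" is blocked by mistake.

**Fix**: switch to `regex` patterns that precisely match real secret formats (such as `sk-[a-zA-Z0-9]{20,}`, `AKIA[A-Z0-9]{16}`, and `key=value` patterns), so legitimate content that merely contains a keyword is no longer blocked. Overly broad injection-detection patterns such as "system:" were removed.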
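
The difference between keyword matching and format matching can be shown with a small sketch. It is written in Python for illustration and is not the Rust middleware: `find_secrets` is a hypothetical helper, the first two patterns are taken from the fix description, and the `key=value` pattern is an illustrative guess at what such a rule might look like.

```python
import re

SECRET_PATTERNS = [
    re.compile(r"sk-[a-zA-Z0-9]{20,}"),  # pattern quoted in the fix
    re.compile(r"AKIA[A-Z0-9]{16}"),     # pattern quoted in the fix
    # Assumed shape of a key=value rule; the real pattern is not shown.
    re.compile(r"(?i)\b(?:api[_-]?key|secret|token|password)\s*=\s*\S{8,}"),
]


def find_secrets(text: str) -> list[str]:
    """Return every substring that looks like an actual secret value."""
    hits: list[str] = []
    for pattern in SECRET_PATTERNS:
        hits.extend(match.group(0) for match in pattern.finditer(text))
    return hits
```

Text that only mentions the word "password" produces no hits, while a string shaped like a real key does.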
|
||||
|
||||
### P2: The ToolErrorMiddleware failure counter is global (✅ fixed 2026-04-24)

**File**: `crates/zclaw-runtime/src/middleware/tool_error.rs`, line 27

`consecutive_failures: AtomicU32` is a struct field shared by all sessions. Under high concurrency, 2 failures in session A plus 1 failure in session B trigger AbortLoop (threshold 3).

**Fix**: switch to `Mutex<HashMap<String, u32>>` keyed by session_id, so each session is tracked independently.
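
A per-session counter of this kind can be sketched as follows. The example is in Python for illustration and is not the Rust middleware: `FailureTracker` and `record` are hypothetical names, and resetting the count on success is an assumption implied by the word "consecutive". The threshold of 3 comes from the text.

```python
import threading

ABORT_THRESHOLD = 3  # from the text: AbortLoop triggers at 3 failures


class FailureTracker:
    """Hypothetical per-session consecutive-failure counter."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._failures: dict[str, int] = {}

    def record(self, session_id: str, ok: bool) -> bool:
        """Record one tool outcome; return True if this session should abort."""
        with self._lock:
            if ok:
                # A success breaks the streak for this session only.
                self._failures.pop(session_id, None)
                return False
            count = self._failures.get(session_id, 0) + 1
            self._failures[session_id] = count
            return count >= ABORT_THRESHOLD
```

With the count keyed by session, two failures in one session and one in another no longer add up to an abort.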
|
||||
|
||||
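### P3: Inconsistent semantics of the Gateway client's onTool callback (✅ fixed 2026-04-24)

**File**: `desktop/src/lib/gateway-client.ts`, lines 698-707

The `tool_call` and `tool_result` cases share the `onTool` callback but follow different argument conventions, so callers must tell start from end by checking whether `output` is empty.

**Fix**: make the output of `tool_call` always `''` (fixing a case where data.output could be passed through), and add clear comments documenting the start/end convention.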
|
||||
|
||||
---
|
||||
|
||||
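## 2. Root Cause Analysis

The most common failure modes for tool calls:

1. **The LLM returns malformed tool_call arguments** → the OpenAI driver silently replaces them with `{}` → the tool runs with empty arguments → the result is not what was expected
2. **The tool throws during execution** → the after_tool_call middleware is not called → the error is not formatted → the LLM receives the raw error and cannot recover
3. **The stream is interrupted and reconnects** → DanglingToolMiddleware repairs the dangling call → but if the repair logic itself has a bug (such as patching twice), the message history bloats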
|
||||
|
||||
## 3. Fix Recommendations

### Fix 1: Call after_tool_call in loop_runner

**Priority**: P0
**Affected file**: `loop_runner.rs`

In the non-streaming tool execution loop (around line 530), call the following after the tool runs:
```rust
let after_result = middleware_chain.run_after_tool_call(
    &name, &input_json, &output_str, &mut ctx
).await;
```

Make the same call after tool execution in streaming mode (around line 1020).

### Fix 2: Make the ToolErrorMiddleware counter per-session

**Priority**: P2
**Affected file**: `middleware/tool_error.rs`

Store the counts in a `HashMap<String, u32>` keyed by session_id.

### Fix 3: Switch ToolOutputGuard to precise matching

**Priority**: P2
**Affected file**: `middleware/tool_output_guard.rs`

Trigger only when a standalone secret value is detected (for example `sk-` followed by 48 characters), not on word-level matches.
|
||||
|
||||
---
|
||||
|
||||
## 4. Key Files

| File | Role |
|------|------|
| `crates/zclaw-runtime/src/loop_runner.rs` | Main loop, tool scheduling |
| `crates/zclaw-runtime/src/tool.rs` | ToolRegistry + Tool trait |
| `crates/zclaw-runtime/src/middleware/tool_error.rs` | Tool error handling |
| `crates/zclaw-runtime/src/middleware/tool_output_guard.rs` | Output safety checks |
| `crates/zclaw-runtime/src/middleware/dangling_tool.rs` | Dangling tool call repair |
| `crates/zclaw-runtime/src/driver/openai.rs` | OpenAI-compatible driver |
| `desktop/src/lib/gateway-client.ts` | Frontend communication client |
| `desktop/src/store/chat/streamStore.ts` | Frontend stream handling |
|
||||
35
wiki/chat.md
@@ -1,6 +1,6 @@
|
||||
---
|
||||
title: 聊天系统
|
||||
updated: 2026-04-22
|
||||
updated: 2026-04-23
|
||||
status: active
|
||||
tags: [module, chat, stream]
|
||||
---
|
||||
@@ -17,6 +17,7 @@ tags: [module, chat, stream]
|
||||
| 5 Store 拆分 | 原 908 行 ChatStore → stream/conversation/message/chat/artifact,单一职责 |
|
||||
| 5 分钟超时守护 | 防止流挂起: kernel-chat.ts:76,超时自动 cancelStream |
|
||||
| 统一回调接口 | 3 种实现共享 `{ onDelta, onThinkingDelta, onTool, onHand, onComplete, onError }` |
|
||||
| LLM 动态建议 | 替换硬编码关键词匹配,用 LLM 生成个性化建议(1深入追问+1实用行动+1管家关怀),4路并行预取智能上下文 |
|
||||
|
||||
### ChatStream 实现
|
||||
|
||||
@@ -33,11 +34,14 @@ tags: [module, chat, stream]
|
||||
|
||||
| 文件 | 职责 |
|
||||
|------|------|
|
||||
| `desktop/src/store/chat/streamStore.ts` | 流式消息编排、发送、取消 |
|
||||
| `desktop/src/store/chat/streamStore.ts` | 流式消息编排、发送、取消、LLM 动态建议生成 |
|
||||
| `desktop/src/store/chat/conversationStore.ts` | 会话管理、当前模型、sessionKey |
|
||||
| `desktop/src/store/chat/messageStore.ts` | 消息持久化 (IndexedDB) |
|
||||
| `desktop/src/lib/kernel-chat.ts` | KernelClient ChatStream (Tauri) |
|
||||
| `desktop/src/lib/suggestion-context.ts` | 4路并行智能上下文拉取 (用户画像/痛点/经验/技能匹配) |
|
||||
| `desktop/src/lib/cold-start-mapper.ts` | 冷启动配置映射 (行业检测/命名/个性/技能) |
|
||||
| `desktop/src/components/ChatArea.tsx` | 聊天区域 UI |
|
||||
| `desktop/src/components/ai/SuggestionChips.tsx` | 动态建议芯片展示 |
|
||||
| `crates/zclaw-runtime/src/loop_runner.rs` | Rust 主聊天循环 + 中间件链 |
|
||||
|
||||
### 发送消息流
|
||||
@@ -100,6 +104,20 @@ UI 选择模型 → conversationStore.currentModel = newModel
|
||||
- cancelStream 设置原子标志位,与 onDelta 回调无竞态
|
||||
- 3 种 ChatStream 共享同一套回调接口,上层代码无需感知实现差异
|
||||
- 消息持久化走 messageStore → IndexedDB,与流式渲染解耦
|
||||
- Dynamic suggestions prefetch 4 sources in parallel (userProfile/painPoints/experiences/skillMatch), with a 500ms timeout that degrades to an empty string
- Suggestion generation is decoupled from memory extraction: suggestions start without waiting for the memory LLM call to finish

### LLM Dynamic Suggestions

```
sendMessage → isStreaming=true + _activeSuggestionContextPrefetch = fetchSuggestionContext(...)
  → prefetch runs in the background during the streaming response
onComplete → createCompleteHandler
  → generateLLMSuggestions(prefetchedContext): starts immediately, without waiting for memory
  → prompt: 1 deeper follow-up + 1 practical action + 1 butler-style check-in
  → memory/reflection run independently in the background (Promise.all)
  → SuggestionChips render
```
|
||||
|
||||
### Tauri 命令
|
||||
|
||||
@@ -114,6 +132,8 @@ UI 选择模型 → conversationStore.currentModel = newModel
|
||||
|
||||
| 问题 | 状态 | 说明 |
|
||||
|------|------|------|
|
||||
| after_tool_call 中间件未调用 | ✅ 已修复 (04-24) | 流式+非流式均添加调用,ToolErrorMiddleware/ToolOutputGuard 现在生效 |
|
||||
| stream_errored 跳过所有工具 | ✅ 已修复 (04-24) | 完整工具照常执行,不完整工具发送取消事件 |
|
||||
| B-CHAT-07 混合域截断 | P2 Open | 跨域消息时可能截断上下文 |
|
||||
| SSE Token 统计为 0 | ✅ 已修复 | SseUsageCapture stream_done flag |
|
||||
| Tauri invoke 参数名 | ✅ 已修复 (f6c5dd2) | camelCase 格式 |
|
||||
@@ -122,14 +142,15 @@ UI selects model → conversationStore.currentModel = newModel
**Notes:**
- Auxiliary LLM calls (memory summarization/extraction, butler routing) reuse the model+base_url from `kernel_init`, the same path as chat
- Classroom chat is a separate Tauri command (`classroom_chat`) and does not go through `agent_chat_stream`
- Agent tab removed: cross-session identity is now handled by soul.md and is no longer managed through RightPanel

## 5. Changelog

| Date | Change |
|------|------|
| 04-24 | Tool-call P0 fixes: after_tool_call middleware wired in (streaming + non-streaming) + stream_errored tool rescue (complete tools execute, incomplete tools are cancelled) |
| 04-24 | Artifact system improvements: shared MarkdownRenderer extracted + ArtifactPanel rendering via react-markdown + file picker dropdown + data sources extended (file_write/str_replace paths) + artifactStore persisted to IndexedDB |
| 04-23 | Suggestion prefetch: context prefetch starts at sendMessage and is consumed as soon as the stream ends, without waiting for memory extraction |
| 04-23 | Suggestion prompt rewritten: 1 follow-up question + 1 practical action + 1 butler-care item; context window 6→20 messages |
| 04-23 | Identity signal: detectAgentNameSuggestion detects instantly on the frontend + RightPanel listens for a Tauri event to refresh the name |
| 04-22 | Wiki rewrite: 5-section template, with integration contracts and invariants added |
| 04-21 | Previous round of updates |
| 04-17 | ChatStore split into 5 stores (stream/conversation/message/chat/artifact) |
| 04-16 | Provider Key decryption fix (b69dc61) |
| 04-16 | Tauri invoke parameter name fix (f6c5dd2) |
| 04-23 | Agent tab removed: ~280 lines of dead code cleaned from RightPanel; identity handled by soul.md |

@@ -133,6 +133,18 @@ skills/ -> SkillRegistry load -> SkillIndexMiddleware@200 inject system prompt
- MCP qualified names `service_name.tool_name` avoid collisions with built-in tools
- Empty-shell Hands deleted (04-17): Whiteboard/Slideshow/Speech, net reduction of ~5400 lines

### ⚡ New tools/skills must declare a concurrency level

The `concurrency()` method of the `Tool` trait determines the parallel execution strategy (04-24 Hermes Phase 2A):

| Level | Meaning | Use cases |
|------|------|---------|
| `ReadOnly` (default) | Read-only, always parallelizable | file_read, web_search, calculator |
| `Exclusive` | Has side effects, must run serially | file_write, shell_exec, send_message, execute_skill, task |
| `Interactive` | Requires user interaction, never parallel | ask_clarification |

**When adding a tool**: override the `concurrency()` method in `impl Tool for YourTool`. The default is `ReadOnly`; a tool with writes or other side effects must return `ToolConcurrency::Exclusive`. An incorrect declaration leads to race conditions under parallel execution.
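
As a rough illustration of the rule above, the sketch below declares a concurrency level on two tools and partitions a batch of calls accordingly. The variant names come from the table; the simplified `Tool` trait, the `FileRead`/`FileWrite` structs, and the `partition` helper are assumptions made for this example and are not the project's actual definitions.

```rust
/// Concurrency level a tool declares (variant names taken from the table above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolConcurrency {
    ReadOnly,
    Exclusive,
    Interactive,
}

/// Simplified, hypothetical stand-in for the real `Tool` trait.
pub trait Tool {
    fn name(&self) -> &str;

    /// Defaults to `ReadOnly`, so a tool with side effects MUST override this.
    fn concurrency(&self) -> ToolConcurrency {
        ToolConcurrency::ReadOnly
    }
}

pub struct FileRead;
impl Tool for FileRead {
    fn name(&self) -> &str {
        "file_read"
    }
    // No override: reading is safe to run in parallel.
}

pub struct FileWrite;
impl Tool for FileWrite {
    fn name(&self) -> &str {
        "file_write"
    }
    // Writes have side effects, so the tool opts out of parallel execution.
    fn concurrency(&self) -> ToolConcurrency {
        ToolConcurrency::Exclusive
    }
}

/// Split a batch into (parallelizable, serial) tool names.
pub fn partition<'a>(tools: &'a [&'a dyn Tool]) -> (Vec<&'a str>, Vec<&'a str>) {
    let mut parallel = Vec::new();
    let mut serial = Vec::new();
    for &tool in tools {
        match tool.concurrency() {
            ToolConcurrency::ReadOnly => parallel.push(tool.name()),
            ToolConcurrency::Exclusive | ToolConcurrency::Interactive => serial.push(tool.name()),
        }
    }
    (parallel, serial)
}
```

The default-to-`ReadOnly` design keeps simple tools terse, at the cost that a forgotten override silently lands a writing tool in the parallel group, which is why the section flags the declaration as mandatory.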

## 4. Active issues + pitfalls

### Active
@@ -155,6 +167,7 @@ skills/ -> SkillRegistry load -> SkillIndexMiddleware@200 inject system prompt

| Date | Change | Related |
|------|------|------|
| 2026-04-24 | Hermes Phase 2A: ToolConcurrency enum + parallel execution + concurrency() declaration requirement | commit 9060935 |
| 2026-04-22 | Wiki 5-section restructure: 281->~195 lines, semantic routing details now reference [[butler]] | wiki/ |
| 2026-04-22 | Researcher search fix: schema flattening + empty-parameter fallback + formatting fix | commit 5816f56+81005c3 |
| 2026-04-17 | Empty-shell Hand cleanup: Whiteboard/Slideshow/Speech deleted, net reduction of ~5400 lines | Phase 5 cleanup |

@@ -1,6 +1,6 @@
---
title: ZCLAW Project Knowledge Base
updated: 2026-04-22
updated: 2026-04-24
status: active
---

@@ -8,29 +8,29 @@ status: active

> An AI Agent desktop client for Chinese-speaking users. Butler mode + multi-model + 7 autonomous capabilities + 75 skills.
> **How to use**: find the module you need to work on, read its page, and start working.
> **Data source**: verified by a full code scan on 2026-04-22, not inferred from documentation.
> **Data source**: verified by a full code scan on 2026-04-23, not inferred from documentation.

## Project profile

| Dimension | Value |
|------|-----|
| Positioning | AI Agent desktop client (Tauri 2.x) |
| Tech stack | Rust 10 crates + src-tauri (~102K lines, 357 .rs) + React 19 + TypeScript + PostgreSQL |
| Tech stack | Rust 10 crates + src-tauri (~148K lines, 384 .rs) + React 19 + TypeScript + PostgreSQL |
| Stage | Pre-release stabilization, feature freeze in effect |

## Key numbers (code-verified 2026-04-22)
## Key numbers (code-verified 2026-04-23)

| Metric | Value |
|------|-----|
| Rust Crates | 10 + src-tauri |
| Rust code | 101,967 lines (357 .rs files) |
| Rust tests | 987 defined / 797 passing |
| Tauri commands | 190 defined / 97 @reserved / 104 invoke |
| Rust code | 148,185 lines (384 .rs files) |
| Rust tests | 997 defined (619 #[test] + 378 #[tokio::test]) |
| Tauri commands | 193 defined / 104 invoke |
| SaaS API | 137 .route() / 16 modules / 38 SQL migrations / 42 tables |
| Middleware | 14 runtime layers + 10 SaaS HTTP layers |
| SKILL / HAND | 75 skill directories / 7 registered Hands (6 TOML + _reminder) |
| Pipeline | 18 YAML templates (8 directories) |
| Frontend | 25 stores / 102 components / 75 lib / 17 Admin pages |
| Frontend | 25 stores / 103 components / 78 lib / 17 Admin pages |
| Intelligence | 16 .rs files |
| Quality metrics | 0 cargo warnings / 2 TODO/FIXME / 0 dead_code |

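@@ -38,13 +38,13 @@ status: active

| Category | Feature | Entry point | Wiki |
|------|------|------|------|
| Chat | Send messages, streaming responses, multi-model switching | Chat panel | [[chat]] |
| Agent personas | Create/switch/configure Agents | Sidebar Agent list | [[chat]] |
| Chat | Send messages, streaming responses, multi-model switching, LLM dynamic suggestions | Chat panel | [[chat]] |
| Agent personas | Create/switch/configure Agents, cross-session identity memory (soul.md) | Sidebar Agent list | [[chat]] |
| Autonomy | Trigger Browser/Collector/Twitter, etc. | Automation panel | [[hands-skills]] |
| Memory | Search history, automatic context injection | Settings > Semantic memory | [[memory]] |
| Memory | Search history, automatic context injection, identity signal extraction | Settings > Semantic memory | [[memory]] |
| Configuration | Model/API/workspace/secure storage | Settings panel (19 pages) | [[development]] |
| SaaS | Login/registration, subscription billing, Admin management | SaaS platform / Admin console | [[saas]] |
| Butler | Pain-point accumulation, industry configuration, simple/professional mode | Chat panel (default mode) | [[butler]] |
| Butler | Pain-point accumulation, industry configuration, simple/professional mode, cross-session identity, dynamic suggestions | Chat panel (default mode) | [[butler]] |
| Pipeline | YAML template selection, configuration, DAG execution | Workflow panel | [[pipeline]] |
| Security | JWT authentication, TOTP 2FA, operation audit | Settings > Secure storage | [[security]] |
| Data | PostgreSQL (42 tables) + SQLite/FTS5 (local memory) | - | [[data-model]] |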
@@ -97,5 +97,7 @@ ZCLAW
| Agent creation fails | [[chat]] | [[saas]] | Permission or persistence problem |
| Pipeline execution hangs | [[pipeline]] | [[middleware]] | DAG cycle / missing dependency |
| Admin page returns 403 | [[saas]] | [[security]] | JWT expired / blocked by admin_guard |
| Agent name not remembered | [[butler]] | [[memory]] | soul.md write failed / identity signal not extracted |
| Suggestions not personalized | [[chat]] | [[butler]] | 4-way context timeout / ExperienceExtractor not initialized |

> Source of truth for numbers: `docs/TRUTH.md`. In case of conflict, the actual code prevails.

47
wiki/log.md
@@ -1,6 +1,6 @@
---
title: Changelog
updated: 2026-04-22
updated: 2026-04-24
status: active
tags: [log, history]
---
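@@ -9,10 +9,55 @@ tags: [log, history]

> Append-only operation log. Format: `## [date] type | description`

## [2026-04-24] fix(runtime+middleware) | Comprehensive P1/P2/P3 tool-call fixes
- **P1 streaming tool parallelism**: three-phase execution (middleware pre-check → parallel + serial partitioning → result ordering); ReadOnly tools run via JoinSet+Semaphore(3)
- **P2 OpenAI driver**: argument parse failures are no longer silently replaced with `{}`; the driver now returns `_parse_error`+`_raw_args` so the LLM can correct itself
- **P2 ToolOutputGuard**: switched from keyword matching to regex matching of actual secret values (sk-xxx/AKIA/PEM, etc.), eliminating false positives
- **P2 ToolErrorMiddleware**: failure counter changed from a global AtomicU32 to a per-session HashMap, eliminating cross-session false triggers
- **P3 Gateway client**: clarified the onTool callback convention for tool_call/tool_result (output='' means start, input='' means end)
- **Tests**: 91 tests PASS, tsc --noEmit PASS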
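
The ToolErrorMiddleware change in this entry, moving from one global counter to a per-session map, can be illustrated with a small sketch. This is not the project's code: `FailureCounter` and its methods are hypothetical names, and resetting the count on success is an assumption made for the example.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical per-session failure tracker. A single global counter would let
/// failures in one session trip the threshold for an unrelated session.
pub struct FailureCounter {
    per_session: Mutex<HashMap<String, u32>>,
}

impl FailureCounter {
    pub fn new() -> Self {
        Self {
            per_session: Mutex::new(HashMap::new()),
        }
    }

    /// Record one failure and return the updated count for that session only.
    pub fn record_failure(&self, session_id: &str) -> u32 {
        let mut map = self.per_session.lock().unwrap();
        let count = map.entry(session_id.to_string()).or_insert(0);
        *count += 1;
        *count
    }

    /// Assumed behavior: a successful tool call clears that session's streak.
    pub fn record_success(&self, session_id: &str) {
        self.per_session.lock().unwrap().remove(session_id);
    }

    /// Current failure count for a session (0 if it has never failed).
    pub fn failures(&self, session_id: &str) -> u32 {
        self.per_session
            .lock()
            .unwrap()
            .get(session_id)
            .copied()
            .unwrap_or(0)
    }
}
```

Keying by session ID means a burst of failures in session `a` leaves session `b` at zero, which is the cross-session false trigger the fix is described as removing.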

## [2026-04-24] fix(runtime) | Two P0 tool-call fixes
- **P0: after_tool_call middleware was never invoked**: added `middleware_chain.run_after_tool_call()` calls in both streaming and non-streaming modes; the after logic of ToolErrorMiddleware and ToolOutputGuardMiddleware now takes effect
- **P0: stream_errored skipped all tools**: in streaming mode, `stream_errored` no longer triggers `break 'outer`; it now distinguishes complete tools (ToolUseEnd received) from incomplete ones. Complete tools execute as usual; incomplete tools get a cancelling ToolEnd event
- **Affected file**: `loop_runner.rs`
- **Tests**: 91 tests PASS, 0 cargo warnings

## [2026-04-24] feat(artifact) | Artifact system improvements
- **MarkdownRenderer**: shared Markdown rendering component extracted from StreamingText (react-markdown + remark-gfm), reused by ArtifactPanel
- **ArtifactPanel**: replaced the hand-written 30-line MarkdownPreview with full GFM rendering (tables/code blocks/lists/blockquotes); added a file picker dropdown
- **Data source extension**: artifact creation expanded from the single file_write tool to file_write/str_replace/write_file/str_replace_editor, and from the single sendMessage path to two paths, sendMessage + initStreamListener
- **Persistence**: artifactStore now uses zustand persist + IndexedDB (reusing idb-storage), so artifacts survive a refresh
- **Verification**: tsc --noEmit PASS, 343 vitest PASS

## [2026-04-24] perf | Hermes high-value design implementation, Phase 1-4
- **Phase 1**: Anthropic prompt caching: cache_control ephemeral + cache token tracking (CompletionResponse + StreamChunk)
- **Phase 2A**: parallel tool execution: ToolConcurrency enum (ReadOnly/Exclusive/Interactive) + JoinSet + Semaphore(3) + AtomicU32
- **Phase 2B**: tool output pruning: prune_tool_outputs() (2000→500 chars) + integrated into CompactionMiddleware
- **Phase 3**: error classification + smart retry: LlmErrorKind + ClassifiedLlmError + RetryDriver (jittered backoff) + CONTEXT_OVERFLOW recovery
- **Phase 4**: async compaction + iterative summarization: 30s debounce + cached fallback + previous_summary iterative accumulation
- **New files**: error_classifier.rs, retry_driver.rs
- **Verification**: 997 workspace tests PASS
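
The log only says that Phase 2B prunes tool outputs "2000→500 chars". One plausible reading, shown here purely as an assumption, is that any output longer than 2000 characters is cut to its first 500 characters plus a marker. The function name `prune_tool_output`, its parameters, and the marker text are all hypothetical.

```rust
/// Hypothetical sketch of tool-output pruning. Outputs at or below `threshold`
/// characters pass through unchanged; longer ones keep only the first `keep`
/// characters followed by a note about how much was dropped.
pub fn prune_tool_output(output: &str, threshold: usize, keep: usize) -> String {
    // Count characters rather than bytes so multi-byte text is never split.
    let total = output.chars().count();
    if total <= threshold {
        return output.to_string();
    }
    let head: String = output.chars().take(keep).collect();
    let dropped = total.saturating_sub(keep);
    format!("{}\n[pruned {} chars]", head, dropped)
}
```

Calling `prune_tool_output(text, 2000, 500)` would match the numbers in the log entry under this reading; the actual `prune_tool_outputs()` may differ in what it keeps and how it marks the cut.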

## [2026-04-23] perf | Reply efficiency + parallelized suggestion generation (three parts)
- **perf(src-tauri)**: identity prompt cache (`LazyLock<RwLock<HashMap>>`) + parallelized `pre_conversation_hook` (`tokio::join!`)
- **perf(runtime)**: middleware `before_completion` runs in parallel waves: `parallel_safe()` trait method + wave detection + `tokio::spawn`; the 5 safe middleware layers can run in parallel
- **perf(desktop)**: suggestion context prefetch (started at sendMessage) + generateLLMSuggestions decoupled from memory extraction
- **feat(desktop)**: suggestion prompt rewritten (1 follow-up question + 1 practical action + 1 butler-care item) + context window 6→20 messages
- **Files**: intelligence_hooks.rs, middleware.rs, 5 middleware submodules, streamStore.ts, llm-service.ts
- **Verification**: cargo test --workspace --exclude zclaw-saas 0 fail, tsc --noEmit 0 error

## [2026-04-23] fix | Agent name detection refactor + cross-session memory fix + Agent tab removal
- **fix(desktop)**: `detectAgentNameSuggestion` changed from 6 fixed regexes to a two-step trigger+extract approach (10 triggers)
- **fix(desktop)**: name detection decoupled from memory extraction, so a 502 no longer blocks the panel refresh
- **fix(src-tauri)**: `agent_update` now also writes soul.md, fixing the broken config.name → system prompt link

## [2026-04-23] feat | Smarter dynamic suggestions
- **feat(src-tauri)**: new `experience_find_relevant` Tauri command + `ExperienceBrief` struct + OnceLock singleton
- **feat(desktop)**: new `suggestion-context.ts`: fetches smart context over 4 parallel paths (user profile/pain points/experiences/skill matching)
- **feat(desktop)**: `streamStore.ts` createCompleteHandler parallelized + generateLLMSuggestions enhanced
- **feat(desktop)**: suggestion prompt changed to a hybrid form (2 follow-ups + 1 butler-care item)
- **Files**: experience.rs, lib.rs, suggestion-context.ts, streamStore.ts, llm-service.ts
- **refactor(desktop)**: removed the Agent tab (simple mode/professional mode) and cleaned up dead code (~280 lines)
- **Verification**: cargo check 0 error, tsc --noEmit 0 error


@@ -1,6 +1,6 @@
---
title: Middleware Chain
updated: 2026-04-22
updated: 2026-04-23
status: active
tags: [module, middleware, runtime]
---
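@@ -17,6 +17,7 @@ tags: [module, middleware, runtime]
- **WHY registration order != execution order**: the code order of the 14 `chain.register()` calls in `kernel/mod.rs` has no bearing on runtime order; the chain sorts by `priority()` ascending and then executes.
- **WHY 6 categories, 14 layers**: evolution (70-79) -> routing (80-99) -> context (100-199) -> capability (200-399) -> safety (400-599) -> telemetry (600-799); the priority range is the execution stage.
- **WHY Stop/Block/AbortLoop**: fine-grained flow control. Stop interrupts the LLM loop, Block prevents a single tool call, AbortLoop terminates the whole Agent loop. Once one is hit, all subsequent middleware is skipped.
- **WHY wave parallelism (parallel_safe)**: in the `before_completion` phase, middleware that only modifies `system_prompt` can declare `parallel_safe() == true`. Consecutive parallel-safe middleware run in parallel via `tokio::spawn`, each holding its own `MiddlewareContext` clone, and their prompt contributions are merged on completion. This cuts serial latency by ~1-3s.

## 2. Key files + data flow
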
@@ -34,8 +35,10 @@ tags: [module, middleware, runtime]
```
User message -> AgentLoop
-> chain.run_before_completion(ctx)
-> [ascending priority] each layer's middleware.before_completion()
-> Continue: next layer | Stop(reason): interrupt loop
-> [wave parallelism] detect consecutive parallel_safe middleware
-> Wave parallel (2+ safe): tokio::spawn, each with ctx.clone() → merge prompt
-> Serial (unsafe / single safe): run one by one
-> Continue: next layer | Stop(reason): interrupt loop
-> LLM call
-> (on tool call) chain.run_before_tool_call()
-> Allow | Block(msg) | ReplaceInput | AbortLoop
@@ -57,22 +60,22 @@ tags: [module, middleware, runtime]

### 14-layer runtime middleware

| Priority | Middleware | File | Responsibility | Registration condition |
|--------|--------|------|------|----------|
| @78 | EvolutionMiddleware | `evolution.rs` | Pushes evolution candidates into the system prompt | Always |
| @80 | ButlerRouter | `butler_router.rs` | Semantic skill routing + system prompt enhancement + XML fencing | Always |
| @100 | Compaction | `compaction.rs` | Compacts conversation history when over threshold | `compaction_threshold > 0` |
| @150 | Memory | `memory.rs` | Extracts memories automatically after a conversation + injects retrieval results | Always |
| @180 | Title | `title.rs` | Generates the session title automatically | Always |
| @200 | SkillIndex | `skill_index.rs` | Injects the skill index into the system prompt | `!skill_index.is_empty()` |
| @300 | DanglingTool | `dangling_tool.rs` | Repairs missing tool-call results | Always |
| @350 | ToolError | `tool_error.rs` | Formats tool errors so the LLM can recover | Always |
| @360 | ToolOutputGuard | `tool_output_guard.rs` | Safety check on tool output | Always |
| @400 | Guardrail | `guardrail.rs` | Safety rules for shell_exec/file_write/web_fetch | Always |
| @500 | LoopGuard | `loop_guard.rs` | Prevents infinite tool-call loops | Always |
| @550 | SubagentLimit | `subagent_limit.rs` | Limits concurrent sub-agents | Always |
| @650 | TrajectoryRecorder | `trajectory_recorder.rs` | Trajectory recording + compression | Always |
| @700 | TokenCalibration | `token_calibration.rs` | Token usage calibration | Always |
| Priority | Middleware | File | Responsibility | parallel_safe | Registration condition |
|--------|--------|------|------|---------------|----------|
| @78 | EvolutionMiddleware | `evolution.rs` | Pushes evolution candidates into the system prompt | ✅ | Always |
| @80 | ButlerRouter | `butler_router.rs` | Semantic skill routing + system prompt enhancement + XML fencing | ✅ | Always |
| @100 | Compaction | `compaction.rs` | Compacts conversation history when over threshold | ❌ | `compaction_threshold > 0` |
| @150 | Memory | `memory.rs` | Extracts memories automatically after a conversation + injects retrieval results | ✅ | Always |
| @180 | Title | `title.rs` | Generates the session title automatically | ✅ | Always |
| @200 | SkillIndex | `skill_index.rs` | Injects the skill index into the system prompt | ✅ | `!skill_index.is_empty()` |
| @300 | DanglingTool | `dangling_tool.rs` | Repairs missing tool-call results | ❌ | Always |
| @350 | ToolError | `tool_error.rs` | Formats tool errors so the LLM can recover | ❌ | Always |
| @360 | ToolOutputGuard | `tool_output_guard.rs` | Safety check on tool output | ❌ | Always |
| @400 | Guardrail | `guardrail.rs` | Safety rules for shell_exec/file_write/web_fetch | ❌ | Always |
| @500 | LoopGuard | `loop_guard.rs` | Prevents infinite tool-call loops | ❌ | Always |
| @550 | SubagentLimit | `subagent_limit.rs` | Limits concurrent sub-agents | ❌ | Always |
| @650 | TrajectoryRecorder | `trajectory_recorder.rs` | Trajectory recording + compression | ❌ | Always |
| @700 | TokenCalibration | `token_calibration.rs` | Token usage calibration | ❌ | Always |

> Registration order (code) differs from execution order (priority). The chain sorts by priority ascending and then executes.

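@@ -96,6 +99,8 @@ tags: [module, middleware, runtime]
- Priority ascending: 0-999; lower values run first
- Registration order != execution order; the chain sorts by priority at runtime
- Stop/Block/AbortLoop interrupts immediately; subsequent middleware does not run
- parallel_safe middleware only modifies system_prompt; it does not modify messages and does not return Stop
- Wave merge: each middleware in a parallel wave clones the context; on completion, the increment beyond base_prompt_len is sliced off and merged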
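
The invariants above (sort by priority, group consecutive parallel-safe layers into a wave, merge only each layer's increment past the base prompt) can be sketched without an async runtime. The `Mw`, `Step`, `plan`, and `merge_wave` names below are hypothetical, and the sketch only plans and merges; it does not run anything with `tokio::spawn`.

```rust
/// Hypothetical descriptor for one registered middleware.
#[derive(Debug, Clone)]
pub struct Mw {
    pub name: &'static str,
    pub priority: i32,
    pub parallel_safe: bool,
}

/// One execution step: a single middleware, or a wave run in parallel.
#[derive(Debug, PartialEq, Eq)]
pub enum Step {
    Serial(&'static str),
    Wave(Vec<&'static str>),
}

/// Emit the pending run: nothing, a serial step (single safe layer), or a wave (2+).
fn flush(run: &mut Vec<&'static str>, steps: &mut Vec<Step>) {
    match run.len() {
        0 => {}
        1 => steps.push(Step::Serial(run.remove(0))),
        _ => steps.push(Step::Wave(std::mem::take(run))),
    }
}

/// Sort by ascending priority (registration order is irrelevant), then group
/// consecutive parallel-safe middleware into waves.
pub fn plan(mut chain: Vec<Mw>) -> Vec<Step> {
    chain.sort_by_key(|m| m.priority);
    let mut steps = Vec::new();
    let mut run = Vec::new();
    for m in &chain {
        if m.parallel_safe {
            run.push(m.name);
        } else {
            flush(&mut run, &mut steps);
            steps.push(Step::Serial(m.name));
        }
    }
    flush(&mut run, &mut steps);
    steps
}

/// Each wave member appended to its own clone of the prompt; keep the base once
/// and append only what each member added beyond `base_prompt_len`.
pub fn merge_wave(base_prompt: &str, results: &[String]) -> String {
    let base_prompt_len = base_prompt.len();
    let mut merged = base_prompt.to_string();
    for r in results {
        if r.starts_with(base_prompt) {
            merged.push_str(&r[base_prompt_len..]);
        }
    }
    merged
}
```

With the priorities and flags from the table, this plan yields a wave of EvolutionMiddleware and ButlerRouter, then Compaction alone, then a wave of Memory, Title, and SkillIndex, which accounts for the "5 safe layers" figure in the changelog. The merge step also shows why parallel-safe middleware must only append to `system_prompt`: anything else a clone changed would be lost.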

### Core interfaces

@@ -103,6 +108,7 @@ tags: [module, middleware, runtime]
trait AgentMiddleware: Send + Sync {
    fn name(&self) -> &str;
    fn priority(&self) -> i32 { 500 }
    fn parallel_safe(&self) -> bool { false }
    async fn before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision>;
    async fn before_tool_call(&self, ctx: &MiddlewareContext, tool_name: &str, tool_input: &Value) -> Result<ToolCallDecision>;
    async fn after_tool_call(&self, ctx: &mut MiddlewareContext, tool_name: &str, result: &Value) -> Result<()>;
@@ -129,8 +135,8 @@ trait AgentMiddleware: Send + Sync {

| Date | Change | Impact |
|------|------|------|
| 04-23 | Wave-parallel execution: parallel_safe() + wave detection + tokio::spawn | The 5 safe middleware layers can run in parallel in the before_completion phase, cutting latency by ~1-3s |
| 04-22 | DataMasking middleware removed | 14->14 layers (replaced with nothing), one fewer layer of no-benefit processing |
| 04-22 | Cross-session memory fix | Memory middleware deduplication + cross-session injection fix |
| 04-22 | Wiki consistency calibration | Numbers aligned with code verification |
| 04-21 | Embedding wired up | SkillIndex routing TF-IDF->Embedding+LLM fallback |
| 04-15 | Heartbeat unified health system | TrajectoryRecorder pain-point awareness enhanced |
