19 Commits

Author SHA1 Message Date
iven
7b0d452845 fix(tool): normalize Windows UNC paths so PathValidator compares paths consistently
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- with_workspace() now canonicalizes workspace_root so that it matches the
  canonical path format produced by resolve_and_validate
- Add normalize_windows_path() to strip the \\?\ prefix, fixing failed
  starts_with comparisons on Windows
- check_blocked/check_allowed now both compare normalized paths
2026-04-24 17:02:24 +08:00
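The prefix stripping described in this commit could look roughly like the sketch below. It is not the repository's implementation: the string-based signature and the separate handling of the `\\?\UNC\` form are assumptions made for illustration; only the function name and the `\\?\` prefix come from the commit message.

```rust
/// Hypothetical sketch of `normalize_windows_path`: remove the verbatim
/// prefix that path canonicalization produces on Windows, so that two
/// spellings of the same location compare equal in a prefix check.
fn normalize_windows_path(path: &str) -> String {
    if let Some(rest) = path.strip_prefix(r"\\?\UNC\") {
        // Assumption: `\\?\UNC\server\share` maps back to `\\server\share`.
        format!(r"\\{rest}")
    } else if let Some(rest) = path.strip_prefix(r"\\?\") {
        // `\\?\C:\dir` becomes `C:\dir`.
        rest.to_string()
    } else {
        // Already in the plain form; leave untouched.
        path.to_string()
    }
}

/// Prefix check performed on normalized strings (illustrative only; a real
/// validator would compare path components rather than raw strings).
fn is_inside(workspace_root: &str, candidate: &str) -> bool {
    normalize_windows_path(candidate).starts_with(&normalize_windows_path(workspace_root))
}
```

Normalizing both sides before the comparison is the point: a canonicalized candidate and a non-canonicalized workspace root would otherwise never share a prefix.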
iven
855c89e8fb fix(tool): relative-path file writes failed; PathValidator now resolves against the workspace first
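When file_write receives a relative path such as test_tool.txt, PathValidator's
resolve_and_validate tries to canonicalize an empty parent directory and fails.

Fix: resolve the relative path against workspace_root into an absolute path first, then run the security checks.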
2026-04-24 16:02:09 +08:00
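A minimal sketch of the fix described above, assuming a free function named `resolve_against_workspace` (a hypothetical name; the commit only states that relative paths are resolved against workspace_root before validation):

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: turn a requested path into an absolute one by
/// anchoring relative paths at the workspace root.
fn resolve_against_workspace(workspace_root: &Path, requested: &Path) -> PathBuf {
    if requested.is_absolute() {
        requested.to_path_buf()
    } else {
        // A bare file name such as `test_tool.txt` has an empty parent
        // (`Some("")`), which cannot be canonicalized; joining it onto the
        // workspace root first gives it a real parent directory.
        workspace_root.join(requested)
    }
}
```

Security checks would then run on the joined path, not on the raw input.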
iven
3eb098f020 fix(runtime): comprehensive P1/P2/P3 fixes for tool calls
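P1: Parallel tool execution in streaming mode
- Three-phase execution: Phase 1 middleware pre-check (serial) → Phase 2 parallel + serial partitions → Phase 3 result ordering
- ReadOnly tools run in parallel via JoinSet + Semaphore(3); Exclusive/Interactive tools run serially
- Same execution strategy as non-streaming mode

P2: Tool argument parsing in the OpenAI driver
- A parse failure is no longer silently replaced with {}; it now returns _parse_error + _raw_args
- Lets the LLM and the tool see the argument problem and self-correct

P2: Precise matching in ToolOutputGuard
- Switch from to_lowercase() keyword matching to regexes that match actual secret values
- Detects sk-xxx(20+), AKIA(16), PEM private keys, and key=value patterns
- Remove overly broad injection checks such as "system:" and "you are now"
- Eliminates false blocks on legitimate content that contains words like "password"

P2: Per-session counting in ToolErrorMiddleware
- Switch from a global AtomicU32 to Mutex<HashMap<session_id, u32>>
- Each session tracks its own consecutive failures, eliminating cross-session false AbortLoop triggers

P3: Gateway client onTool callback semantics
- Make explicit that a tool_call's output is always the empty string (start signal)
- Add comments documenting the start/end convention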
2026-04-24 12:56:07 +08:00
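The per-session counting change can be sketched as follows. The type name `FailureTracker`, its methods, and the use of `String` session IDs are assumptions for illustration; the commit only specifies the move from a global AtomicU32 to a `Mutex<HashMap<session_id, u32>>`.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical tracker of consecutive tool failures, keyed by session.
struct FailureTracker {
    max_consecutive: u32,
    counts: Mutex<HashMap<String, u32>>,
}

impl FailureTracker {
    fn new(max_consecutive: u32) -> Self {
        Self { max_consecutive, counts: Mutex::new(HashMap::new()) }
    }

    /// Record one failure; returns true once this session alone has reached
    /// the threshold (the point at which a loop abort would be considered).
    fn record_failure(&self, session_id: &str) -> bool {
        let mut counts = self.counts.lock().unwrap();
        let n = counts.entry(session_id.to_string()).or_insert(0);
        *n += 1;
        *n >= self.max_consecutive
    }

    /// A successful tool call resets the streak for that session only.
    fn record_success(&self, session_id: &str) {
        self.counts.lock().unwrap().remove(session_id);
    }
}
```

With a single global counter, failures in one session would push another session toward the threshold; keying by session removes that coupling at the cost of a mutex.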
iven
c12b64150b fix(runtime): P0 tool-call fixes: wire up after_tool_call + rescue tools on stream_errored
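P0-1: The after_tool_call middleware was never invoked
- Both streaming mode (run_streaming) and non-streaming mode (run) now call middleware_chain.run_after_tool_call()
- ToolErrorMiddleware's error-count recovery logic now takes effect
- ToolOutputGuardMiddleware's sensitive-data detection now takes effect

P0-2: stream_errored skipped all tool execution
- Add completed_tool_ids to track which tools have received a complete ToolUseEnd
- On a stream error, distinguish complete tools from incomplete ones
- Complete tools run as usual (artifact creation etc. is unaffected)
- Incomplete tools get a cancelling ToolEnd event (the frontend no longer stays stuck on "running")
- If stream_errored is set after tool execution, break outer to stop a pointless LLM loop

References:
- docs/references/zclaw-toolcall-issues.md (analysis of 10 issues)
- docs/references/deerflow-toolcall-reference.md (complete DeerFlow tool-call reference)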
2026-04-24 12:20:14 +08:00
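The complete/incomplete split from P0-2 might be expressed as below. The function name and the use of plain string IDs are assumptions; only `completed_tool_ids` is taken from the commit message.

```rust
use std::collections::HashSet;

/// Hypothetical helper: after a stream error, split pending tool calls into
/// those that received a complete ToolUseEnd (safe to execute) and those
/// that did not (to be cancelled with a ToolEnd event).
fn partition_on_stream_error<'a>(
    pending: &[&'a str],
    completed_tool_ids: &HashSet<&str>,
) -> (Vec<&'a str>, Vec<&'a str>) {
    pending
        .iter()
        .copied()
        .partition(|id| completed_tool_ids.contains(*id))
}
```

The first vector keeps the original call order, so results can still be emitted in the order the model requested them.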
iven
4c31471cd6 feat(artifact): artifact system improvements: shared rendering + more data sources + persistence
- MarkdownRenderer: shared react-markdown + remark-gfm component extracted from StreamingText
- ArtifactPanel: replace the hand-written MarkdownPreview with full GFM rendering; add a file-selector dropdown
- Data sources: both the file_write and str_replace tools + both the sendMessage and initStreamListener paths
- Persistence: artifactStore adds zustand persist + IndexedDB (reusing idb-storage)
2026-04-24 10:59:27 +08:00
iven
b60b96225d docs(wiki): sync wiki for Hermes Phase 1-4
- hands-skills: add an invariant requiring a concurrency() declaration
- log: append the Hermes Phase 1-4 change record
- index: update the date
2026-04-24 08:54:48 +08:00
iven
06e93a21af perf(compaction): Hermes Phase 4 — debounce + async cache + iterative summary
Step 4.1: Compaction debounce
- 30s cooldown between consecutive compactions
- Minimum 3 rounds (6 messages) since last compaction before re-triggering
- AtomicU64 lock-free state tracking

Step 4.2: Async compaction with cached fallback
- During cooldown, use cached result from previous compaction
- RwLock<Option<Vec<Message>>> for thread-safe cache access
- Cache updated after each successful compaction

Step 4.3: Iterative summary
- generate_summary/generate_llm_summary accept previous_summary parameter
- LLM prompt includes previous summary for cumulative context preservation
- Rule-based summary carries forward [上轮摘要保留] section
- previous_summary extracted from leading System messages in message history
2026-04-24 08:53:37 +08:00
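The debounce in Step 4.1 can be sketched with two atomics. This is an illustration under assumptions: the struct and method names are hypothetical, timestamps are passed in as seconds by the caller, and both conditions (cooldown elapsed and at least 6 new messages) are assumed to be required together.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

const COOLDOWN_SECS: u64 = 30;
const MIN_MESSAGES_SINCE_LAST: u64 = 6; // 3 rounds of user + assistant

/// Hypothetical lock-free debounce state for compaction.
struct CompactionDebounce {
    last_compaction_secs: AtomicU64, // 0 means "never compacted"
    last_message_count: AtomicU64,   // message count right after compaction
}

impl CompactionDebounce {
    fn new() -> Self {
        Self {
            last_compaction_secs: AtomicU64::new(0),
            last_message_count: AtomicU64::new(0),
        }
    }

    /// True when compaction may run again at `now_secs`.
    fn should_compact(&self, now_secs: u64, message_count: u64) -> bool {
        let last = self.last_compaction_secs.load(Ordering::Relaxed);
        if last == 0 {
            return true;
        }
        let cooled_down = now_secs.saturating_sub(last) >= COOLDOWN_SECS;
        let grown = message_count
            .saturating_sub(self.last_message_count.load(Ordering::Relaxed))
            >= MIN_MESSAGES_SINCE_LAST;
        cooled_down && grown
    }

    /// Record a finished compaction and the message count it left behind.
    fn mark_compacted(&self, now_secs: u64, message_count_after: u64) {
        self.last_compaction_secs.store(now_secs, Ordering::Relaxed);
        self.last_message_count.store(message_count_after, Ordering::Relaxed);
    }
}
```

When `should_compact` returns false, the caller would fall back to the cached result described in Step 4.2.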
iven
9060935401 perf(runtime): Hermes Phase 1-3 — prompt caching + parallel tools + smart retry
Phase 1: Anthropic prompt caching
- Add cache_control ephemeral on system prompt blocks
- Track cache_creation/cache_read tokens in CompletionResponse + StreamChunk

Phase 2A: Parallel tool execution
- Add ToolConcurrency enum (ReadOnly/Exclusive/Interactive)
- JoinSet + Semaphore(3) for bounded parallel tool calls
- 7 tools annotated with correct concurrency level
- AtomicU32 for lock-free failure tracking in ToolErrorMiddleware

Phase 2B: Tool output pruning
- prune_tool_outputs() trims old ToolResult > 2000 chars to 500 chars
- Integrated into CompactionMiddleware before token estimation

Phase 3: Error classification + smart retry
- LlmErrorKind + ClassifiedLlmError for structured error mapping
- RetryDriver decorator with jittered exponential backoff
- Kernel wraps all LLM calls with RetryDriver
- CONTEXT_OVERFLOW recovery triggers emergency compaction in loop_runner
2026-04-24 08:39:56 +08:00
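For Phase 3, a jittered exponential backoff delay could be computed as in this sketch. The function name, the millisecond parameters, and the "half to full" jitter range are assumptions; the commit does not state which jitter scheme RetryDriver uses. The random fraction is passed in by the caller so the function stays deterministic.

```rust
use std::time::Duration;

/// Hypothetical delay calculation: double the base delay per attempt, cap it,
/// then pick a point between half and all of the capped delay.
/// `rand_unit` is a caller-supplied random value in [0.0, 1.0].
fn backoff_delay(attempt: u32, base_ms: u64, max_ms: u64, rand_unit: f64) -> Duration {
    // 2^attempt growth; the shift is clamped so it cannot overflow.
    let exponential = base_ms.saturating_mul(1u64 << attempt.min(32));
    let capped = exponential.min(max_ms);
    let low = capped / 2;
    let spread = ((capped - low) as f64 * rand_unit.clamp(0.0, 1.0)) as u64;
    Duration::from_millis(low + spread)
}
```

Jitter spreads retries from concurrent callers apart so they do not hit the provider at the same instant after a shared failure.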
iven
6d6673bf5b fix(suggest): suggestions default to Chinese, with no English words mixed in
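Rule 7 changes from "use the same language as the user" to an explicit Chinese-first requirement;
English terms must be translated (e.g. workflow→工作流).
The examples are updated accordingly to use purely Chinese phrasing.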
2026-04-24 00:01:22 +08:00
iven
15f84bf8c1 fix(suggest): drop forms of address from suggestion chips to avoid role confusion when the user sends them
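The suggestion prompt gains a rule: suggestions are sent as-is when the user clicks them,
so they must not contain forms of address such as "领导/老板/老师" (leader/boss/teacher) and use subjectless phrasing instead.
The wording in the examples and the care templates is updated accordingly.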
2026-04-23 23:53:07 +08:00
iven
9a313e3c92 docs(wiki): sync wiki for the reply-efficiency + parallel-suggestion optimizations
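- middleware.md: wave-based parallel execution design decision + parallel_safe annotations + invariants + execution flow
- chat.md: suggestion prefetch + memory decoupling + prompt rewrite
- log.md: append the change record
- CLAUDE.md: §13 architecture snapshot + recent changes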
2026-04-23 23:45:28 +08:00
iven
ee5611a2f8 perf(middleware): wave-based parallel execution for before_completion
- MiddlewareContext derives Clone so the context can be cloned for parallel runs
- AgentMiddleware trait gains a parallel_safe() default method (false)
- MiddlewareChain::run_before_completion now executes in waves:
  runs of 2+ consecutive parallel_safe middlewares execute concurrently via tokio::spawn,
  each modifying system_prompt independently; their contributions are merged on completion
- 5 middlewares that only modify system_prompt are marked parallel_safe:
  evolution(P78), butler_router(P80), memory(P150),
  title(P180), skill_index(P200)
- Non-parallel_safe middlewares (compaction, dangling_tool, etc.) stay serial

Wave breakdown:
  Wave 1: evolution + butler_router → parallel (saves ~0.5-1s)
  Wave 2: compaction → serial (may modify messages)
  Wave 3: memory + title + skill_index → parallel (saves ~0.5-2s)
  Wave 4+: tool/security middlewares → serial
2026-04-23 23:37:57 +08:00
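The wave grouping rule above (runs of 2+ consecutive parallel_safe middlewares form one parallel wave; everything else runs alone) can be sketched as a pure planning function. The `Wave` enum and `plan_waves` are hypothetical names, and the actual concurrent execution via tokio::spawn is left out.

```rust
/// Hypothetical execution plan entry for the before_completion chain.
#[derive(Debug, PartialEq)]
enum Wave {
    Parallel(Vec<&'static str>),
    Serial(&'static str),
}

/// Group a priority-ordered chain of (name, parallel_safe) pairs into waves.
fn plan_waves(chain: &[(&'static str, bool)]) -> Vec<Wave> {
    let mut waves = Vec::new();
    let mut i = 0;
    while i < chain.len() {
        // Find the end of the run of parallel_safe middlewares starting at i.
        let mut j = i;
        while j < chain.len() && chain[j].1 {
            j += 1;
        }
        if j - i >= 2 {
            waves.push(Wave::Parallel(chain[i..j].iter().map(|m| m.0).collect()));
            i = j;
        } else {
            // Not parallel_safe, or a lone parallel_safe middleware: run it alone.
            waves.push(Wave::Serial(chain[i].0));
            i += 1;
        }
    }
    waves
}
```

Applied to the chain order from the commit, this yields the same Wave 1 to Wave 3 split listed in the wave breakdown.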
iven
5cf7adff69 perf(chat): reply efficiency + parallelized suggestion generation
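- identity prompt cache: LazyLock<RwLock<HashMap>> caches built identity prompts,
  invalidated automatically when soul.md is updated, saving a mutex + disk I/O on every request (~0.5-1s)
- pre-conversation hook parallelization: tokio::join! runs the identity build and
  the continuity context query in parallel instead of waiting serially (~1-2s)
- suggestion context prefetch: fetchSuggestionContext starts early, while the reply is streaming,
  so the context is ready when the reply ends (~0.5-1s)
- suggestion generation decoupled from memory extraction: generateLLMSuggestions no longer waits for
  the memory extraction LLM call to finish and starts independently (~3-8s)
- Path B (agent stream) context completion: the lifecycle:end path uses the prefetched context,
  fixing the zero-personalization issue
- context window expansion: slice(-6) → slice(-20), each entry truncated to 200 characters
- suggestion prompt rewrite: 1 deeper follow-up + 1 practical action + 1 butler care item,
  with an explicit role definition and a ban on vague suggestions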
2026-04-23 23:13:20 +08:00
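The identity prompt cache could be structured like the sketch below. Only the `LazyLock<RwLock<HashMap>>` shape comes from the commit; the function name, the `soul_version` stamp (for example a modification time of soul.md), and the closure-based builder are assumptions about how invalidation might work.

```rust
use std::collections::HashMap;
use std::sync::{LazyLock, RwLock};

/// agent id -> (version stamp of soul.md when built, built prompt)
static IDENTITY_CACHE: LazyLock<RwLock<HashMap<String, (u64, String)>>> =
    LazyLock::new(|| RwLock::new(HashMap::new()));

/// Hypothetical lookup: reuse the cached prompt while `soul_version` is
/// unchanged, otherwise rebuild it with `build` and store the new result.
fn cached_identity_prompt(
    agent_id: &str,
    soul_version: u64,
    build: impl FnOnce() -> String,
) -> String {
    // Read path: shared lock only, released before any rebuild happens.
    let hit = {
        let cache = IDENTITY_CACHE.read().unwrap();
        cache
            .get(agent_id)
            .filter(|(version, _)| *version == soul_version)
            .map(|(_, prompt)| prompt.clone())
    };
    if let Some(prompt) = hit {
        return prompt;
    }
    // Miss or stale entry: build outside the lock, then store.
    let prompt = build();
    IDENTITY_CACHE
        .write()
        .unwrap()
        .insert(agent_id.to_string(), (soul_version, prompt.clone()));
    prompt
}
```

Keeping the expensive build outside both locks means concurrent readers for other agents are never blocked by disk I/O.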
iven
10497362bb fix(chat): clarification card UX: remove dangling references + expand by default
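- The prompt gains an ask_clarification reference rule so the LLM avoids generating
  dangling reference phrases such as "以下信息" ("the following information") or "比如:" ("for example:") in its text
- Add stripDanglingClarificationRef as a frontend safety net: when a message contains an
  ask_clarification tool call, a trailing dangling reference is removed automatically
- The clarification card is expanded by default so users see the options without an extra click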
2026-04-23 19:21:10 +08:00
iven
d7dbdf8600 docs(wiki): changelog for smarter dynamic suggestions
2026-04-23 18:01:44 +08:00
iven
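8c25b20fe2 feat(suggest): update the suggestion prompt to a mixed format (2 follow-ups + 1 butler care item)
- llm-service.ts: HARDCODED_PROMPTS.suggestions.system switched to the mixed format
  - 2 conversation follow-ups + 1 butler care item (pain-point follow-up / experience reuse / skill recommendation)
- streamStore.ts: LLM_PROMPTS_SYSTEM now references the llm-service export
  - Single source of truth; takes effect automatically on OTA updates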
2026-04-23 17:58:58 +08:00
iven
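87110ffdff feat(suggest): parallelize createCompleteHandler + enhance generateLLMSuggestions
- createCompleteHandler: memory extraction + context fetch run in parallel via Promise.all
- generateLLMSuggestions: new SuggestionContext parameter, used to build an enriched user message
- llmSuggestViaSaaS: remove the artificial 2s delay (no longer needed after parallelization)
- Rename variable context→conversationContext to avoid clashing with SuggestionContext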
2026-04-23 17:57:17 +08:00
iven
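980a8135fa feat(suggest): add the fetchSuggestionContext aggregator + type definitions
- Fetches smart context over 4 parallel paths: user profile, pain points, experiences, skill matches
- 500ms timeout guard + silent degradation (a failure does not block suggestion generation)
- Returns an empty context immediately when Tauri is unavailable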
2026-04-23 17:54:57 +08:00
iven
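e9e7ffd609 feat(intelligence): add the experience_find_relevant Tauri command + ExperienceBrief
- Add the ExperienceBrief struct (pain-point pattern + solution summary + reuse count)
- OnceLock singleton + init_experience_extractor() startup initialization
- experience_find_relevant command: retrieve relevant experiences by agent_id + query
- Registered in invoke_handler + gracefully degrading initialization in the setup phase
- Add serialization tests (10 tests PASS)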
2026-04-23 17:52:33 +08:00
56 changed files with 2838 additions and 755 deletions

178
CLAUDE.md
View File

@@ -165,10 +165,25 @@ desktop/src-tauri (→ kernel, skills, hands, protocols)
2. **自动验证**`cargo check` / `cargo test` / `tsc --noEmit` / `vitest run` 必须通过 2. **自动验证**`cargo check` / `cargo test` / `tsc --noEmit` / `vitest run` 必须通过
3. **回归测试** — 跑受影响 crate 的全量测试,确认无回归 3. **回归测试** — 跑受影响 crate 的全量测试,确认无回归
#### 阶段 4: 提交 + 同步(立即,不积压) #### 阶段 4: Wiki 同步 + 提交(立即,不积压)
1. **提交推送** — 按 §11 规范提交,**立即 `git push`** **Wiki 同步评估(硬门槛,不可跳过)**
2. **文档同步** — 按 §8.3 检查并更新相关文档,提交并推送
代码改完后、提交前,逐条回答以下问题。任何一条为"是"→ 必须更新对应 wiki 页面:
| 评估问题 | 为"是"时更新 |
|----------|-------------|
| 这个改动修复或引入了 bug | 对应模块页"活跃问题+陷阱"节 + `wiki/known-issues.md` |
| 这个改动改变了某个模块的行为或设计理由? | 对应模块页"设计决策"节 |
| 这个改动增删了文件或改变了目录结构? | 对应模块页"关键文件"表 |
| 这个改动影响了跨模块接口(谁调谁、参数形状、触发时机)? | 涉及双方的"集成契约"表 |
| 这个改动涉及一个必须始终成立的约束? | 对应模块页"代码逻辑"节的 ⚡ 不变量 |
| 这个改动改变了功能链路(前端→后端的完整路径)? | `wiki/feature-map.md` 索引表 |
| 这个改动改变了关键数字(命令数/Store数/测试数等)? | `wiki/index.md` 关键数字表 + `docs/TRUTH.md` |
全部回答完后,无论是否有更新,都追加一条到 `wiki/log.md` + 更新模块页"变更记录"节(保持 5 条)。
**提交推送** — 按 §11 规范提交,**立即 `git push`**。详细文档同步规则见 §8.3。
**铁律:不允许"等一下再提交"或"最后一起推送"。每个独立工作单元完成后立即推送。** **铁律:不允许"等一下再提交"或"最后一起推送"。每个独立工作单元完成后立即推送。**
@@ -374,34 +389,44 @@ docs/
每次完成功能实现、架构变更、问题修复后,**必须立即执行以下收尾** 每次完成功能实现、架构变更、问题修复后,**必须立即执行以下收尾**
#### 步骤 A文档同步(代码提交前) #### 步骤 AWiki 同步(最高优先,代码提交前)
检查以下文档是否需要更新,有变更则立即修改: > **为什么 wiki 排第一**wiki 是新 AI 会话的启动燃料。如果 wiki 与代码不一致,后续所有会话都会基于错误上下文工作,错误会积累放大。
在 §3.3 阶段 4 的评估表基础上,执行具体更新:
| 触发事件 | 更新目标 | 更新内容 |
|----------|---------|---------|
| 修复 bug | 对应模块页"活跃问题+陷阱" | 修复→移除条目;新增→添加条目 |
| 架构/设计变更 | 对应模块页"设计决策" | WHY 变了 + 新的权衡取舍 |
| 文件增删/移动 | 对应模块页"关键文件"表 | 更新文件列表 |
| 跨模块接口变化 | **涉及双方**的"集成契约"表 | 方向/接口/触发时机 |
| 发现新的不变量 | 对应模块页"代码逻辑"节 | ⚡ 标记 + 一句话描述 |
| 功能链路变化 | `wiki/feature-map.md` | 更新索引表对应行 |
| 关键数字变化 | `wiki/index.md` + `docs/TRUTH.md` | 更新数字 + 验证命令 |
| **每次收尾** | `wiki/log.md` + 模块页"变更记录" | 追加日志条目 + 变更记录保持 5 条 |
**wiki 更新原则**
- 只记录代码不能告诉你的东西WHY、跨模块关系、不变量、历史教训
- 模块页控制在 100-200 行,超出则归档到 `wiki/archive/`
- 同一信息只出现在一个页面(单一真相源),其他页面只引用
#### 步骤 B其他文档同步
1. **CLAUDE.md** — 项目结构、技术栈、工作流程、命令变化时 1. **CLAUDE.md** — 项目结构、技术栈、工作流程、命令变化时
2. **CLAUDE.md §13 架构快照** — 涉及子系统变更时,更新 `<!-- ARCH-SNAPSHOT-START/END -->` 标记区域(可执行 `/sync-arch` 技能自动分析) 2. **CLAUDE.md §13 架构快照** — 涉及子系统变更时(可执行 `/sync-arch` 技能自动分析)
3. **docs/ARCHITECTURE_BRIEF.md** — 架构决策或关键组件变更时 3. **docs/ARCHITECTURE_BRIEF.md** — 架构决策或关键组件变更时
4. **docs/features/** — 功能状态变化时 4. **docs/features/** — 功能状态变化时
5. **docs/knowledge-base/** — 新的排查经验或配置说明 5. **docs/knowledge-base/** — 新的排查经验或配置说明
6. **wiki/** — 编译后知识库维护(按触发规则更新对应页面,每页统一 5 节: 设计决策 / 关键文件+集成契约 / 代码逻辑 / 活跃问题+陷阱 / 变更记录):
- 修复 bug → 更新对应模块页"活跃问题"节 + `wiki/known-issues.md` 索引
- 架构变更 → 更新对应模块页"设计决策"节
- 文件结构变化 → 更新对应模块页"关键文件"表
- 跨模块接口变化 → 更新对应模块页"集成契约"表
- 新增不变量发现 → 更新对应模块页"代码逻辑"节的 ⚡ 标记项
- 功能链路变化 → 更新 `wiki/feature-map.md` 索引表
- 数字变化 → 更新 `wiki/index.md` 关键数字表 + `docs/TRUTH.md`
- 每次更新 → 在 `wiki/log.md` 追加一条记录 + 模块页"变更记录"节更新最近 5 条
6. **docs/TRUTH.md** — 数字命令数、Store 数、crates 数等)变化时
#### 步骤 B:提交(按逻辑分组) #### 步骤 C:提交(按逻辑分组)
``` ```
代码变更 → 一个或多个逻辑提交 代码变更 → 一个或多个逻辑提交
文档变更 → 独立提交(如果和代码分开更清晰) 文档变更 → 独立提交(如果和代码分开更清晰)
``` ```
#### 步骤 C:推送(立即) #### 步骤 D:推送(立即)
``` ```
git push git push
@@ -559,7 +584,7 @@ refactor(store): 统一 Store 数据获取方式
*** ***
<!-- ARCH-SNAPSHOT-START --> <!-- ARCH-SNAPSHOT-START -->
<!-- 此区域由 auto-sync 自动更新,请勿手动编辑。更新时间: 2026-04-15 --> <!-- 此区域由 auto-sync 自动更新,请勿手动编辑。更新时间: 2026-04-23 -->
## 13. 当前架构快照 ## 13. 当前架构快照
@@ -567,51 +592,53 @@ refactor(store): 统一 Store 数据获取方式
| 子系统 | 状态 | 最新变更 | | 子系统 | 状态 | 最新变更 |
|--------|------|----------| |--------|------|----------|
| 管家模式 (Butler) | ✅ 活跃 | 04-12 行业配置4行业 + 跨会话连续性 + <butler-context> XML fencing | | 管家模式 (Butler) | ✅ 活跃 | 04-23 跨会话身份(soul.md) + 动态建议(4路并行LLM驱动) + Agent tab 移除 |
| Hermes 管线 | ✅ 活跃 | 04-12 触发信号持久化 + 经验行业维度 + 注入格式优化 | | Hermes 管线 | ✅ 活跃 | 04-23 experience_find_relevant Tauri 命令 + ExperienceBrief + OnceLock 单例 |
| Intelligence Heartbeat | ✅ 活跃 | 04-15 统一健康快照 (health_snapshot.rs) + HeartbeatManager 重构 + HealthPanel 前端 | | Intelligence Heartbeat | ✅ 活跃 | 04-15 统一健康快照 (health_snapshot.rs) + HeartbeatManager 重构 + HealthPanel 前端 |
| 聊天流 (ChatStream) | ✅ 稳定 | 04-02 ChatStore 拆分为 4 Store (stream/conversation/message/chat) | | 聊天流 (ChatStream) | ✅ 活跃 | 04-23 LLM 动态建议(替换硬编码) + 澄清卡片 UX 优化 |
| 记忆管道 (Memory) | ✅ 稳定 | 04-17 E2E 验证: 存储+FTS5+TF-IDF+注入闭环,去重+跨会话注入已修复 | | 记忆管道 (Memory) | ✅ 活跃 | 04-23 身份信号提取(agent_name/user_name) + ProfileSignals 增强 |
| SaaS 认证 (Auth) | ✅ 稳定 | Token池 RPM/TPM 轮换 + JWT password_version 失效机制 | | SaaS 认证 (Auth) | ✅ 稳定 | Token池 RPM/TPM 轮换 + JWT password_version 失效机制 |
| Pipeline DSL | ✅ 稳定 | 04-01 17 个 YAML 模板 + DAG 执行器 | | Pipeline DSL | ✅ 稳定 | 04-01 18 个 YAML 模板 + DAG 执行器 |
| Hands 系统 | ✅ 稳定 | 7 注册 (6 HAND.toml + _reminder)Whiteboard/Slideshow/Speech 开发中 | | Hands 系统 | ✅ 稳定 | 7 注册 (6 HAND.toml + _reminder)Whiteboard/Slideshow/Speech 已删除 |
| 技能系统 (Skills) | ✅ 稳定 | 75 个 SKILL.md + 语义路由 | | 技能系统 (Skills) | ✅ 稳定 | 75 个 SKILL.md + 语义路由 |
| 中间件链 | ✅ 稳定 | 13(ButlerRouter@80, Compaction@100, Memory@150, Title@180, SkillIndex@200, DanglingTool@300, ToolError@350, ToolOutputGuard@360, Guardrail@400, LoopGuard@500, SubagentLimit@550, TrajectoryRecorder@650, TokenCalibration@700) | | 中间件链 | ✅ 稳定 | 14+ 分波并行 (Evolution@78✅, ButlerRouter@80, Compaction@100, Memory@150, Title@180, SkillIndex@200, DanglingTool@300, ToolError@350, ToolOutputGuard@360, Guardrail@400, LoopGuard@500, SubagentLimit@550, TrajectoryRecorder@650, TokenCalibration@700) — ✅=parallel_safe |
### 关键架构模式 ### 关键架构模式
- **Hermes 管线**: 4模块闭环 — ExperienceStore(FTS5经验存取) + UserProfiler(结构化用户画像) + NlScheduleParser(中文时间→cron) + TrajectoryRecorder+Compressor(轨迹记录压缩)。通过中间件链+intelligence hooks调用 - **Hermes 管线**: 4模块闭环 — ExperienceStore(FTS5经验存取) + UserProfiler(结构化用户画像) + NlScheduleParser(中文时间→cron) + TrajectoryRecorder+Compressor(轨迹记录压缩)。通过中间件链+intelligence hooks调用
- **管家模式**: 双模式UI (默认简洁/解锁专业) + ButlerRouter 动态行业关键词(4内置+自定义) + <butler-context> XML fencing注入 + 跨会话连续性(痛点回访+经验检索) + 触发信号持久化(VikingStorage) + 冷启动4阶段hook - **管家模式**: 双模式UI (默认简洁/解锁专业) + ButlerRouter 动态行业关键词(4内置+自定义) + <butler-context> XML fencing注入 + 跨会话连续性(痛点回访+经验检索) + 触发信号持久化(VikingStorage) + 冷启动4阶段hook + 跨会话身份(soul.md) + 动态建议(4路并行LLM驱动2续问+1关怀)
- **聊天流**: 3种实现 → GatewayClient(WebSocket) / KernelClient(Tauri Event) / SaaSRelay(SSE) + 5min超时守护。详见 [ARCHITECTURE_BRIEF.md](docs/ARCHITECTURE_BRIEF.md) - **聊天流**: 3种实现 → GatewayClient(WebSocket) / KernelClient(Tauri Event) / SaaSRelay(SSE) + 5min超时守护。动态建议: prefetch context + generateLLMSuggestions(1追问+1行动+1关怀) 与 memory extraction 解耦。详见 [ARCHITECTURE_BRIEF.md](docs/ARCHITECTURE_BRIEF.md)
- **客户端路由**: `getClient()` 4分支决策树 → Admin路由 / SaaS Relay(可降级到本地) / Local Kernel / External Gateway - **客户端路由**: `getClient()` 4分支决策树 → Admin路由 / SaaS Relay(可降级到本地) / Local Kernel / External Gateway
- **SaaS 认证**: JWT→OS keyring 存储 + HttpOnly cookie + Token池 RPM/TPM 限流轮换 + SaaS unreachable 自动降级 - **SaaS 认证**: JWT→OS keyring 存储 + HttpOnly cookie + Token池 RPM/TPM 限流轮换 + SaaS unreachable 自动降级
- **记忆闭环**: 对话→extraction_adapter→FTS5全文+TF-IDF权重→检索→注入系统提示E2E 04-17 验证通过,去重+跨会话注入已修复) - **记忆闭环**: 对话→extraction_adapter→FTS5全文+TF-IDF权重→检索→注入系统提示 + 身份信号提取(agent_name/user_name)→VikingStorage→soul.md→跨会话名字记忆
- **LLM 驱动**: 4 Rust Driver (Anthropic/OpenAI/Gemini/Local) + 国内兼容 (DeepSeek/Qwen/Moonshot 通过 base_url) - **LLM 驱动**: 4 Rust Driver (Anthropic/OpenAI/Gemini/Local) + 国内兼容 (DeepSeek/Qwen/Moonshot 通过 base_url)
### 最近变更 ### 最近变更
1. [04-21] Embedding 接通 + 自学习自动化 A线+B线: 记忆检索Embedding(GrowthIntegration→MemoryRetriever→SemanticScorer) + Skill路由Embedding+LLM Fallback(替换new_tf_idf_only) + evolution_bridge(SkillCandidate→SkillManifest) + generate_and_register_skill()全链路 + EvolutionMiddleware双模式(auto/suggest) + QualityGate加固(长度/标题/置信度上限)。验证: 934 tests PASS 1. [04-23] 回复效率+建议生成并行化: identity prompt 缓存 + pre-hook 并行(tokio::join!) + middleware 分波并行(parallel_safe, 5层✅) + suggestion context 预取 + 建议与 memory 解耦 + prompt 重写(1追问+1行动+1关怀)
2. [04-21] Phase 0+1 突破之路 8 项基础链路修复: 经验积累覆盖修复(reuse_count累积) + Skill工具调用桥接(complete_with_tools) + Hand字段映射(runId) + Heartbeat痛点感知 + Browser委托消息 + 跨会话检索增强(IdentityRecall 26→43模式+弱身份fallback) + Twitter凭据持久化。验证: 912 tests PASS 2. [04-23] 动态建议智能化: fetchSuggestionContext 4路并行(用户画像/痛点/经验/技能匹配) + generateLLMSuggestions 混合型 prompt (2续问+1管家关怀) + experience_find_relevant Tauri 命令 + ExperienceBrief
2. [04-17] 全系统 E2E 测试 129 链路: 82 PASS / 20 PARTIAL / 1 FAIL / 26 SKIP有效通过率 79.1%。7 项 Bug 修复 (Dashboard 404/记忆去重/记忆注入/invoice_id/Prompt版本/agent隔离/行业字段) 3. [04-23] 跨会话身份: detectAgentNameSuggestion trigger+extract 两步法(10 trigger) + ProfileSignals agent_name/user_name + soul.md 写回 + Agent tab 移除 (~280 行 dead code 清理)
2. [04-16] 3 项 P0 修复 + 5 项 E2E Bug 修复 + Agent 面板刷新 + TRUTH.md 数字校准 4. [04-22] Wiki 全面重构: 5节模板+集成契约+症状导航+归档压缩,净减 ~1,200 行
3. [04-15] Heartbeat 统一健康系统: health_snapshot.rs 统一收集器(LLM连接/记忆/会话/系统资源) + heartbeat.rs HeartbeatManager 重构 + HealthPanel.tsx 前端面板 + Tauri 命令 182→183 + intelligence 模块 15→16 文件 + 删除 intelligence-client/ 9 废弃文件 4. [04-22] 跨会话记忆断裂修复 + DataMasking 中间件移除 + 搜索功能修复(多引擎+质量过滤+SSE行缓冲)
4. [04-12] 行业配置+管家主动性 全栈 5 Phase: 行业数据模型+4内置配置+ButlerRouter动态关键词+触发信号+Tauri加载+Admin管理页面+跨会话连续性+XML fencing注入格式 5. [04-21] Embedding 接通 + 自学习自动化 A线+B线 + Phase 0+1 突破之路 8 项链路修复。验证: 934 tests PASS
5. [04-09] Hermes Intelligence Pipeline 4 Chunk: ExperienceStore+Extractor, UserProfileStore+Profiler, NlScheduleParser, TrajectoryRecorder+Compressor (684 tests, 0 failed) 6. [04-20] 50 轮功能链路审计 7 项断链修复 (42/50 = 84% 通过率)
6. [04-09] 管家模式6交付物完成: ButlerRouter + 冷启动 + 简洁模式UI + 桥测试 + 发布文档 7. [04-17] 全系统 E2E 测试 129 链路: 82 PASS / 20 PARTIAL / 1 FAIL / 26 SKIP有效通过率 79.1%
<!-- ARCH-SNAPSHOT-END -->
<!-- ARCH-SNAPSHOT-END --> <!-- ARCH-SNAPSHOT-END -->
<!-- ANTI-PATTERN-START --> <!-- ANTI-PATTERN-START -->
<!-- 此区域由 auto-sync 自动更新,请勿手动编辑。更新时间: 2026-04-09 --> <!-- 此区域由 auto-sync 自动更新,请勿手动编辑。更新时间: 2026-04-23 -->
## 14. AI 协作注意事项 ## 14. AI 协作注意事项
### 反模式警告 ### 反模式警告
- ❌ **不要**建议新增 SaaS API 端点 — 已有 140 个,稳定化约束禁止新增 - ❌ **不要**建议新增 SaaS API 端点 — 已有 137 个,稳定化约束禁止新增
- ❌ **不要**忽略管家模式 — 已上线且为默认模式,所有聊天经过 ButlerRouter - ❌ **不要**忽略管家模式 — 已上线且为默认模式,所有聊天经过 ButlerRouter
- ❌ **不要**假设 Tauri 直连 LLM — 实际通过 SaaS Token 池中转SaaS unreachable 时降级到本地 Kernel - ❌ **不要**假设 Tauri 直连 LLM — 实际通过 SaaS Token 池中转SaaS unreachable 时降级到本地 Kernel
- ❌ **不要**建议从零实现已有能力 — 先查 Hand(9个)/Skill(75个)/Pipeline(17模板) 现有库 - ❌ **不要**建议从零实现已有能力 — 先查 Hand(7注册)/Skill(75个)/Pipeline(18模板) 现有库
- ❌ **不要**在 CLAUDE.md 以外创建项目级配置或规则文件 — 单一入口原则 - ❌ **不要**在 CLAUDE.md 以外创建项目级配置或规则文件 — 单一入口原则
### 场景化指令 ### 场景化指令
@@ -620,6 +647,75 @@ refactor(store): 统一 Store 数据获取方式
- 当遇到**认证相关** → 记住 Tauri 模式用 OS keyring 存 JWTSaaS 模式用 HttpOnly cookie - 当遇到**认证相关** → 记住 Tauri 模式用 OS keyring 存 JWTSaaS 模式用 HttpOnly cookie
- 当遇到**新功能建议** → 先查 [TRUTH.md](docs/TRUTH.md) 确认可用能力清单,避免重复建设 - 当遇到**新功能建议** → 先查 [TRUTH.md](docs/TRUTH.md) 确认可用能力清单,避免重复建设
- 当遇到**记忆/上下文相关** → 记住闭环已接通: FTS5+TF-IDF+embedding不是空壳 - 当遇到**记忆/上下文相关** → 记住闭环已接通: FTS5+TF-IDF+embedding不是空壳
- 当遇到**管家/Butler** → 管家模式是默认模式ButlerRouter 在中间件链中做关键词分类+system prompt 增强 - 当遇到**管家/Butler** → 管家模式是默认模式ButlerRouter 在中间件链中做关键词分类+system prompt 增强。跨会话身份走 soul.md动态建议走 4 路并行上下文+LLM
<!-- ANTI-PATTERN-END --> <!-- ANTI-PATTERN-END -->
***
## 15. Karpathy 编码原则
> 源自 Andrej Karpathy 对 LLM 编码问题的观察。偏向谨慎而非速度,简单任务可灵活判断。
### 15.1 Think Before Coding
**Don't assume. Don't hide confusion. Surface tradeoffs.**
- State assumptions explicitly. If uncertain, ask.
- If multiple interpretations exist, present them — don't pick silently.
- If a simpler approach exists, say so. Push back when warranted.
- If something is unclear, stop. Name what's confusing. Ask.
### 15.2 Simplicity First
**Minimum code that solves the problem. Nothing speculative.**
- No features beyond what was asked.
- No abstractions for single-use code.
- No "flexibility" or "configurability" that wasn't requested.
- No error handling for impossible scenarios.
- If you write 200 lines and it could be 50, rewrite it.
Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
### 15.3 Surgical Changes
**Touch only what you must. Clean up only your own mess.**
When editing existing code:
- Don't "improve" adjacent code, comments, or formatting.
- Don't refactor things that aren't broken.
- Match existing style, even if you'd do it differently.
- If you notice unrelated dead code, mention it — don't delete it.
When your changes create orphans:
- Remove imports/variables/functions that YOUR changes made unused.
- Don't remove pre-existing dead code unless asked.
The test: Every changed line should trace directly to the user's request.
### 15.4 Goal-Driven Execution
**Define success criteria. Loop until verified.**
Transform tasks into verifiable goals:
- "Add validation" → "Write tests for invalid inputs, then make them pass"
- "Fix the bug" → "Write a test that reproduces it, then make it pass"
- "Refactor X" → "Ensure tests pass before and after"
For multi-step tasks, state a brief plan:
```
1. [Step] → verify: [check]
2. [Step] → verify: [check]
3. [Step] → verify: [check]
```
Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
---
**These guidelines are working if:** fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.

View File

@@ -117,7 +117,9 @@ impl Kernel {
} }
} }
use zclaw_runtime::{AgentLoop, tool::builtin::PathValidator}; use std::sync::Arc;
use zclaw_runtime::{AgentLoop, LlmDriver, tool::builtin::PathValidator};
use zclaw_runtime::driver::{RetryDriver, RetryConfig};
use super::Kernel; use super::Kernel;
use super::super::MessageResponse; use super::super::MessageResponse;
@@ -161,9 +163,12 @@ impl Kernel {
let subagent_enabled = chat_mode.as_ref().and_then(|m| m.subagent_enabled).unwrap_or(false); let subagent_enabled = chat_mode.as_ref().and_then(|m| m.subagent_enabled).unwrap_or(false);
let tools = self.create_tool_registry(subagent_enabled); let tools = self.create_tool_registry(subagent_enabled);
self.skill_executor.set_tool_registry(tools.clone()); self.skill_executor.set_tool_registry(tools.clone());
let driver: Arc<dyn LlmDriver> = Arc::new(
RetryDriver::new(self.driver.clone(), RetryConfig::default())
);
let mut loop_runner = AgentLoop::new( let mut loop_runner = AgentLoop::new(
*agent_id, *agent_id,
self.driver.clone(), driver,
tools, tools,
self.memory.clone(), self.memory.clone(),
) )
@@ -275,9 +280,12 @@ impl Kernel {
let subagent_enabled = chat_mode.as_ref().and_then(|m| m.subagent_enabled).unwrap_or(false); let subagent_enabled = chat_mode.as_ref().and_then(|m| m.subagent_enabled).unwrap_or(false);
let tools = self.create_tool_registry(subagent_enabled); let tools = self.create_tool_registry(subagent_enabled);
self.skill_executor.set_tool_registry(tools.clone()); self.skill_executor.set_tool_registry(tools.clone());
let driver: Arc<dyn LlmDriver> = Arc::new(
RetryDriver::new(self.driver.clone(), RetryConfig::default())
);
let mut loop_runner = AgentLoop::new( let mut loop_runner = AgentLoop::new(
*agent_id, *agent_id,
self.driver.clone(), driver,
tools, tools,
self.memory.clone(), self.memory.clone(),
) )
@@ -426,6 +434,7 @@ impl Kernel {
prompt.push_str("- Provide clear options when possible\n"); prompt.push_str("- Provide clear options when possible\n");
prompt.push_str("- Include brief context about why you're asking\n"); prompt.push_str("- Include brief context about why you're asking\n");
prompt.push_str("- After receiving clarification, proceed immediately\n"); prompt.push_str("- After receiving clarification, proceed immediately\n");
prompt.push_str("- CRITICAL: When calling ask_clarification, do NOT repeat the options in your text response. The options will be shown in a dedicated card above your reply. Simply greet the user and briefly explain why you need clarification — avoid phrases like \"以下信息\" or \"the following options\" that imply a list follows in your text\n");
prompt prompt
} }

View File

@@ -31,6 +31,8 @@ async fn seam_hand_tool_routing() {
input_tokens: 10, input_tokens: 10,
output_tokens: 20, output_tokens: 20,
stop_reason: "tool_use".to_string(), stop_reason: "tool_use".to_string(),
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}, },
]) ])
// Second stream: final text after tool executes // Second stream: final text after tool executes
@@ -40,6 +42,8 @@ async fn seam_hand_tool_routing() {
input_tokens: 10, input_tokens: 10,
output_tokens: 5, output_tokens: 5,
stop_reason: "end_turn".to_string(), stop_reason: "end_turn".to_string(),
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}, },
]); ]);
@@ -105,6 +109,8 @@ async fn seam_hand_execution_callback() {
input_tokens: 10, input_tokens: 10,
output_tokens: 5, output_tokens: 5,
stop_reason: "tool_use".to_string(), stop_reason: "tool_use".to_string(),
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}, },
]) ])
.with_stream_chunks(vec![ .with_stream_chunks(vec![
@@ -113,6 +119,8 @@ async fn seam_hand_execution_callback() {
input_tokens: 5, input_tokens: 5,
output_tokens: 1, output_tokens: 1,
stop_reason: "end_turn".to_string(), stop_reason: "end_turn".to_string(),
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}, },
]); ]);
@@ -173,6 +181,8 @@ async fn seam_generic_tool_routing() {
input_tokens: 10, input_tokens: 10,
output_tokens: 5, output_tokens: 5,
stop_reason: "tool_use".to_string(), stop_reason: "tool_use".to_string(),
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}, },
]) ])
.with_stream_chunks(vec![ .with_stream_chunks(vec![
@@ -181,6 +191,8 @@ async fn seam_generic_tool_routing() {
input_tokens: 5, input_tokens: 5,
output_tokens: 3, output_tokens: 3,
stop_reason: "end_turn".to_string(), stop_reason: "end_turn".to_string(),
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}, },
]); ]);

View File

@@ -27,6 +27,8 @@ async fn smoke_hands_full_lifecycle() {
input_tokens: 15, input_tokens: 15,
output_tokens: 10, output_tokens: 10,
stop_reason: "tool_use".to_string(), stop_reason: "tool_use".to_string(),
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}, },
]) ])
// After hand_quiz returns, LLM generates final response // After hand_quiz returns, LLM generates final response
@@ -36,6 +38,8 @@ async fn smoke_hands_full_lifecycle() {
input_tokens: 20, input_tokens: 20,
output_tokens: 5, output_tokens: 5,
stop_reason: "end_turn".to_string(), stop_reason: "end_turn".to_string(),
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}, },
]); ]);

View File

@@ -14,6 +14,7 @@
use std::sync::Arc; use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering}; use std::sync::atomic::{AtomicU64, Ordering};
use serde_json::Value;
use zclaw_types::{AgentId, Message, SessionId}; use zclaw_types::{AgentId, Message, SessionId};
use crate::driver::{CompletionRequest, ContentBlock, LlmDriver}; use crate::driver::{CompletionRequest, ContentBlock, LlmDriver};
@@ -136,7 +137,7 @@ pub fn update_calibration(estimated: usize, actual: u32) {
} }
/// Estimate total tokens for messages with calibration applied. /// Estimate total tokens for messages with calibration applied.
fn estimate_messages_tokens_calibrated(messages: &[Message]) -> usize { pub fn estimate_messages_tokens_calibrated(messages: &[Message]) -> usize {
let raw = estimate_messages_tokens(messages); let raw = estimate_messages_tokens(messages);
let factor = get_calibration_factor(); let factor = get_calibration_factor();
if (factor - 1.0).abs() < f64::EPSILON { if (factor - 1.0).abs() < f64::EPSILON {
@@ -178,7 +179,7 @@ pub fn compact_messages(messages: Vec<Message>, keep_recent: usize) -> (Vec<Mess
let old_messages = &messages[..split_index]; let old_messages = &messages[..split_index];
let recent_messages = &messages[split_index..]; let recent_messages = &messages[split_index..];
let summary = generate_summary(old_messages); let summary = generate_summary(old_messages, None);
let removed_count = old_messages.len(); let removed_count = old_messages.len();
let mut compacted = Vec::with_capacity(1 + recent_messages.len()); let mut compacted = Vec::with_capacity(1 + recent_messages.len());
@@ -188,6 +189,38 @@ pub fn compact_messages(messages: Vec<Message>, keep_recent: usize) -> (Vec<Mess
(compacted, removed_count) (compacted, removed_count)
} }
/// Prune old tool outputs to reduce token consumption. Runs before compaction.
/// Only prunes ToolResult messages older than PRUNE_AGE_THRESHOLD messages.
const PRUNE_AGE_THRESHOLD: usize = 8;
const PRUNE_MAX_CHARS: usize = 2000;
const PRUNE_KEEP_HEAD_CHARS: usize = 500;
pub fn prune_tool_outputs(messages: &mut [Message]) -> usize {
let total = messages.len();
let mut pruned_count = 0;
for i in 0..total.saturating_sub(PRUNE_AGE_THRESHOLD) {
if let Message::ToolResult { output, is_error, .. } = &mut messages[i] {
if *is_error { continue; }
let text = match output {
Value::String(ref s) => s.clone(),
ref other => other.to_string(),
};
if text.len() <= PRUNE_MAX_CHARS { continue; }
let end = text.floor_char_boundary(PRUNE_KEEP_HEAD_CHARS.min(text.len()));
*output = serde_json::json!({
"_pruned": true,
"_original_chars": text.len(),
"head": &text[..end],
});
pruned_count += 1;
}
}
pruned_count
}
/// Check if compaction should be triggered and perform it if needed. /// Check if compaction should be triggered and perform it if needed.
/// ///
/// Returns the (possibly compacted) message list. /// Returns the (possibly compacted) message list.
@@ -315,6 +348,18 @@ pub async fn maybe_compact_with_config(
.iter() .iter()
.take_while(|m| matches!(m, Message::System { .. })) .take_while(|m| matches!(m, Message::System { .. }))
.count(); .count();
// Extract previous summary from leading system messages for iterative summarization
let previous_summary = messages.iter()
.take(leading_system_count)
.filter_map(|m| match m {
Message::System { content } if content.starts_with("[以下是之前对话的摘要]") => {
Some(content.clone())
}
_ => None,
})
.next();
let keep_from_end = DEFAULT_KEEP_RECENT let keep_from_end = DEFAULT_KEEP_RECENT
.min(messages.len().saturating_sub(leading_system_count)); .min(messages.len().saturating_sub(leading_system_count));
let split_index = messages.len().saturating_sub(keep_from_end); let split_index = messages.len().saturating_sub(keep_from_end);
@@ -333,14 +378,16 @@ pub async fn maybe_compact_with_config(
let recent_messages = &messages[split_index..]; let recent_messages = &messages[split_index..];
let removed_count = old_messages.len(); let removed_count = old_messages.len();
// Step 3: Generate summary (LLM or rule-based) // Step 3: Generate summary (LLM or rule-based), with iterative context
let prev_ref = previous_summary.as_deref();
let summary = if config.use_llm { let summary = if config.use_llm {
if let Some(driver) = driver { if let Some(driver) = driver {
match generate_llm_summary(driver, old_messages, config.summary_max_tokens).await { match generate_llm_summary(driver, old_messages, prev_ref, config.summary_max_tokens).await {
Ok(llm_summary) => { Ok(llm_summary) => {
tracing::info!( tracing::info!(
"[Compaction] Generated LLM summary ({} chars)", "[Compaction] Generated LLM summary ({} chars, iterative={})",
llm_summary.len() llm_summary.len(),
previous_summary.is_some()
); );
llm_summary llm_summary
} }
@@ -350,7 +397,7 @@ pub async fn maybe_compact_with_config(
"[Compaction] LLM summary failed: {}, falling back to rules", "[Compaction] LLM summary failed: {}, falling back to rules",
e e
); );
generate_summary(old_messages) generate_summary(old_messages, prev_ref)
} else { } else {
tracing::warn!( tracing::warn!(
"[Compaction] LLM summary failed: {}, returning original messages", "[Compaction] LLM summary failed: {}, returning original messages",
@@ -369,10 +416,10 @@ pub async fn maybe_compact_with_config(
tracing::warn!( tracing::warn!(
"[Compaction] LLM compaction requested but no driver available, using rules" "[Compaction] LLM compaction requested but no driver available, using rules"
); );
generate_summary(old_messages) generate_summary(old_messages, prev_ref)
} }
} else { } else {
generate_summary(old_messages) generate_summary(old_messages, prev_ref)
}; };
let used_llm = config.use_llm && driver.is_some(); let used_llm = config.use_llm && driver.is_some();
@@ -398,9 +445,11 @@ pub async fn maybe_compact_with_config(
} }
/// Generate a summary using an LLM driver. /// Generate a summary using an LLM driver.
/// If `previous_summary` is provided, builds on it iteratively.
async fn generate_llm_summary( async fn generate_llm_summary(
driver: &Arc<dyn LlmDriver>, driver: &Arc<dyn LlmDriver>,
messages: &[Message], messages: &[Message],
previous_summary: Option<&str>,
max_tokens: u32, max_tokens: u32,
) -> Result<String, String> { ) -> Result<String, String> {
let mut conversation_text = String::new(); let mut conversation_text = String::new();
@@ -437,11 +486,21 @@ async fn generate_llm_summary(
conversation_text.push_str("\n...(对话已截断)"); conversation_text.push_str("\n...(对话已截断)");
} }
let prompt = format!( let prompt = match previous_summary {
Some(prev) => format!(
"你是一个对话摘要助手。\n\n\
## 上一轮摘要\n{}\n\n\
## 新增对话内容\n{}\n\n\
请在上一轮摘要的基础上更新,保留所有关键决策、用户偏好和文件操作。\
输出200字以内的中文摘要。",
prev, conversation_text
),
None => format!(
"请用简洁的中文总结以下对话的关键信息。保留重要的讨论主题、决策、结论和待办事项。\ "请用简洁的中文总结以下对话的关键信息。保留重要的讨论主题、决策、结论和待办事项。\
输出格式为段落式摘要不超过200字。\n\n{}", 输出格式为段落式摘要不超过200字。\n\n{}",
conversation_text conversation_text
); ),
};
let request = CompletionRequest { let request = CompletionRequest {
model: String::new(), model: String::new(),
@@ -484,13 +543,22 @@ async fn generate_llm_summary(
} }
/// Generate a rule-based summary of old messages. /// Generate a rule-based summary of old messages.
fn generate_summary(messages: &[Message]) -> String { /// If `previous_summary` is provided, carries forward key info.
fn generate_summary(messages: &[Message], previous_summary: Option<&str>) -> String {
if messages.is_empty() { if messages.is_empty() {
return "[对话开始]".to_string(); return "[对话开始]".to_string();
} }
let mut sections: Vec<String> = vec!["[以下是之前对话的摘要]".to_string()]; let mut sections: Vec<String> = vec!["[以下是之前对话的摘要]".to_string()];
// Carry forward previous summary if available
if let Some(prev) = previous_summary {
// Strip the header line from previous summary for cleaner nesting
let prev_body = prev.strip_prefix("[以下是之前对话的摘要]\n")
.unwrap_or(prev);
sections.push(format!("[上轮摘要保留]: {}", truncate(prev_body, 200)));
}
let mut user_count = 0; let mut user_count = 0;
let mut assistant_count = 0; let mut assistant_count = 0;
let mut topics: Vec<String> = Vec::new(); let mut topics: Vec<String> = Vec::new();
@@ -696,8 +764,21 @@ mod tests {
Message::user("How does ownership work?"), Message::user("How does ownership work?"),
Message::assistant("Ownership is Rust's memory management system"), Message::assistant("Ownership is Rust's memory management system"),
]; ];
let summary = generate_summary(&messages); let summary = generate_summary(&messages, None);
assert!(summary.contains("摘要")); assert!(summary.contains("摘要"));
assert!(summary.contains("2")); assert!(summary.contains("2"));
} }
#[test]
fn test_generate_summary_iterative() {
let messages = vec![
Message::user("What is async/await?"),
Message::assistant("Async/await is a concurrency model"),
];
let prev = "[以下是之前对话的摘要]\n讨论主题: Rust; 所有权\n(已压缩 4 条消息)";
let summary = generate_summary(&messages, Some(prev));
assert!(summary.contains("摘要"));
assert!(summary.contains("上轮摘要保留"));
assert!(summary.contains("所有权"));
}
} }
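The pruning step added in this file (`prune_tool_outputs`) can be illustrated with a minimal, std-only sketch. The `ToolOutput` struct, the `head_boundary` helper and the constant names below are hypothetical stand-ins, not the project's types; the real code operates on `zclaw_types::Message` and `serde_json::Value`. The sketch assumes, as the diff's `text.len()` does, that the limits are measured in bytes.

```rust
/// Hypothetical stand-in for a tool-result message (not the real `Message` type).
#[derive(Debug, Clone, PartialEq)]
struct ToolOutput {
    text: String,
    is_error: bool,
    pruned: bool,
}

// Illustrative limits mirroring the constants in the diff (measured in bytes).
const AGE_THRESHOLD: usize = 8;
const MAX_BYTES: usize = 2000;
const KEEP_HEAD_BYTES: usize = 500;

/// Largest index <= `max` that lies on a UTF-8 char boundary of `s`.
fn head_boundary(s: &str, max: usize) -> usize {
    let mut end = max.min(s.len());
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    end
}

/// Truncate long, non-error outputs that are older than the last
/// `AGE_THRESHOLD` entries. Returns how many entries were pruned.
fn prune_old_outputs(outputs: &mut [ToolOutput]) -> usize {
    let cutoff = outputs.len().saturating_sub(AGE_THRESHOLD);
    let mut pruned = 0;
    for out in outputs[..cutoff].iter_mut() {
        if out.is_error || out.text.len() <= MAX_BYTES {
            continue;
        }
        let end = head_boundary(&out.text, KEEP_HEAD_BYTES);
        out.text.truncate(end);
        out.pruned = true;
        pruned += 1;
    }
    pruned
}
```

Walking back to a char boundary matters for multi-byte text: cutting a string of 3-byte characters at byte 500 would otherwise land inside a character and panic.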


@@ -121,6 +121,8 @@ impl LlmDriver for AnthropicDriver {
let mut byte_stream = response.bytes_stream(); let mut byte_stream = response.bytes_stream();
let mut current_tool_id: Option<String> = None; let mut current_tool_id: Option<String> = None;
let mut tool_input_buffer = String::new(); let mut tool_input_buffer = String::new();
let mut cache_creation_input_tokens: Option<u32> = None;
let mut cache_read_input_tokens: Option<u32> = None;
while let Some(chunk_result) = byte_stream.next().await { while let Some(chunk_result) = byte_stream.next().await {
let chunk = match chunk_result { let chunk = match chunk_result {
@@ -141,6 +143,15 @@ impl LlmDriver for AnthropicDriver {
match serde_json::from_str::<AnthropicStreamEvent>(data) { match serde_json::from_str::<AnthropicStreamEvent>(data) {
Ok(event) => { Ok(event) => {
match event.event_type.as_str() { match event.event_type.as_str() {
"message_start" => {
// Capture cache token info from message_start event
if let Some(msg) = event.message {
if let Some(usage) = msg.usage {
cache_creation_input_tokens = usage.cache_creation_input_tokens;
cache_read_input_tokens = usage.cache_read_input_tokens;
}
}
}
"content_block_delta" => { "content_block_delta" => {
if let Some(delta) = event.delta { if let Some(delta) = event.delta {
if let Some(text) = delta.text { if let Some(text) = delta.text {
@@ -186,6 +197,8 @@ impl LlmDriver for AnthropicDriver {
input_tokens: msg.usage.as_ref().map(|u| u.input_tokens).unwrap_or(0), input_tokens: msg.usage.as_ref().map(|u| u.input_tokens).unwrap_or(0),
output_tokens: msg.usage.as_ref().map(|u| u.output_tokens).unwrap_or(0), output_tokens: msg.usage.as_ref().map(|u| u.output_tokens).unwrap_or(0),
stop_reason: msg.stop_reason.unwrap_or_else(|| "end_turn".to_string()), stop_reason: msg.stop_reason.unwrap_or_else(|| "end_turn".to_string()),
cache_creation_input_tokens,
cache_read_input_tokens,
}); });
} }
} }
@@ -298,7 +311,15 @@ impl AnthropicDriver {
AnthropicRequest { AnthropicRequest {
model: request.model.clone(), model: request.model.clone(),
max_tokens: effective_max, max_tokens: effective_max,
system: request.system.clone(), system: request.system.as_ref().map(|s| {
vec![SystemContentBlock {
r#type: "text".to_string(),
text: s.clone(),
cache_control: Some(CacheControl {
r#type: "ephemeral".to_string(),
}),
}]
}),
messages, messages,
tools: if tools.is_empty() { None } else { Some(tools) }, tools: if tools.is_empty() { None } else { Some(tools) },
temperature: request.temperature, temperature: request.temperature,
@@ -337,18 +358,35 @@ impl AnthropicDriver {
input_tokens: api_response.usage.input_tokens, input_tokens: api_response.usage.input_tokens,
output_tokens: api_response.usage.output_tokens, output_tokens: api_response.usage.output_tokens,
stop_reason, stop_reason,
cache_creation_input_tokens: api_response.usage.cache_creation_input_tokens,
cache_read_input_tokens: api_response.usage.cache_read_input_tokens,
} }
} }
} }
// Anthropic API types // Anthropic API types
/// Anthropic cache_control marker
#[derive(Serialize, Clone)]
struct CacheControl {
r#type: String, // "ephemeral"
}
/// Anthropic system prompt content block (supports cache_control)
#[derive(Serialize, Clone)]
struct SystemContentBlock {
r#type: String, // "text"
text: String,
#[serde(skip_serializing_if = "Option::is_none")]
cache_control: Option<CacheControl>,
}
#[derive(Serialize)] #[derive(Serialize)]
struct AnthropicRequest { struct AnthropicRequest {
model: String, model: String,
max_tokens: u32, max_tokens: u32,
#[serde(skip_serializing_if = "Option::is_none")] #[serde(skip_serializing_if = "Option::is_none")]
system: Option<String>, system: Option<Vec<SystemContentBlock>>,
messages: Vec<AnthropicMessage>, messages: Vec<AnthropicMessage>,
#[serde(skip_serializing_if = "Option::is_none")] #[serde(skip_serializing_if = "Option::is_none")]
tools: Option<Vec<AnthropicTool>>, tools: Option<Vec<AnthropicTool>>,
@@ -404,6 +442,10 @@ struct AnthropicContentBlock {
struct AnthropicUsage { struct AnthropicUsage {
input_tokens: u32, input_tokens: u32,
output_tokens: u32, output_tokens: u32,
#[serde(default)]
cache_creation_input_tokens: Option<u32>,
#[serde(default)]
cache_read_input_tokens: Option<u32>,
} }
// Streaming types // Streaming types
@@ -458,4 +500,8 @@ struct AnthropicStreamUsage {
input_tokens: u32, input_tokens: u32,
#[serde(default)] #[serde(default)]
output_tokens: u32, output_tokens: u32,
#[serde(default)]
cache_creation_input_tokens: Option<u32>,
#[serde(default)]
cache_read_input_tokens: Option<u32>,
} }


@@ -0,0 +1,139 @@
//! LLM error classifier. Maps an HTTP status code plus error body to an LlmErrorKind.
use std::time::Duration;
use zclaw_types::{LlmErrorKind, ClassifiedLlmError};
/// Classify an LLM error.
pub fn classify_llm_error(
provider: &str,
status: u16,
body: &str,
is_timeout: bool,
) -> ClassifiedLlmError {
let _ = provider; // reserved for per-provider overrides
if is_timeout {
return ClassifiedLlmError {
kind: LlmErrorKind::Timeout,
retryable: true,
should_compress: false,
should_rotate_credential: false,
retry_after: None,
message: "请求超时".to_string(),
};
}
match status {
401 | 403 => ClassifiedLlmError {
kind: LlmErrorKind::Auth,
retryable: false,
should_compress: false,
should_rotate_credential: true,
retry_after: None,
message: "认证失败,请检查 API Key".to_string(),
},
402 => {
let is_quota_transient = body.contains("retry")
|| body.contains("limit")
|| body.contains("usage");
ClassifiedLlmError {
kind: if is_quota_transient { LlmErrorKind::RateLimited } else { LlmErrorKind::BillingExhausted },
retryable: is_quota_transient,
should_compress: false,
should_rotate_credential: !is_quota_transient,
retry_after: if is_quota_transient { Some(Duration::from_secs(30)) } else { None },
message: if is_quota_transient { "使用限制,稍后重试".to_string() } else { "计费额度已耗尽".to_string() },
}
}
429 => ClassifiedLlmError {
kind: LlmErrorKind::RateLimited,
retryable: true,
should_compress: false,
should_rotate_credential: true,
retry_after: parse_retry_after(body),
message: "速率限制".to_string(),
},
529 => ClassifiedLlmError {
kind: LlmErrorKind::Overloaded,
retryable: true,
should_compress: false,
should_rotate_credential: false,
retry_after: Some(Duration::from_secs(5)),
message: "提供商过载".to_string(),
},
500 | 502 => ClassifiedLlmError {
kind: LlmErrorKind::ServerError,
retryable: true,
should_compress: false,
should_rotate_credential: false,
retry_after: None,
message: "服务端错误".to_string(),
},
503 => ClassifiedLlmError {
kind: LlmErrorKind::Overloaded,
retryable: true,
should_compress: false,
should_rotate_credential: false,
retry_after: Some(Duration::from_secs(3)),
message: "服务暂时不可用".to_string(),
},
400 => {
let is_context_overflow = body.contains("context_length")
|| body.contains("max_tokens")
|| body.contains("too many tokens")
|| body.contains("prompt is too long");
ClassifiedLlmError {
kind: if is_context_overflow { LlmErrorKind::ContextOverflow } else { LlmErrorKind::Unknown },
retryable: false,
should_compress: is_context_overflow,
should_rotate_credential: false,
retry_after: None,
message: if is_context_overflow {
"上下文过长,需要压缩".to_string()
} else {
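// Truncate by chars, not bytes: slicing at a fixed byte offset can panic inside a multi-byte character.
format!("请求错误: {}", body.chars().take(200).collect::<String>())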
},
}
}
404 => ClassifiedLlmError {
kind: LlmErrorKind::ModelNotFound,
retryable: false,
should_compress: false,
should_rotate_credential: false,
retry_after: None,
message: "模型不存在".to_string(),
},
_ => ClassifiedLlmError {
kind: LlmErrorKind::Unknown,
retryable: true,
should_compress: false,
should_rotate_credential: false,
retry_after: None,
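// Truncate by chars, not bytes, so a multi-byte character at the cut point cannot cause a panic.
message: format!("未知错误 ({}) {}", status, body.chars().take(200).collect::<String>()),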
},
}
}
fn parse_retry_after(body: &str) -> Option<Duration> {
// Anthropic: "Please retry after X seconds"
// OpenAI: "Please retry after Xms"
if let Some(secs) = extract_retry_seconds(body) {
return Some(Duration::from_secs(secs));
}
if let Some(ms) = extract_retry_millis(body) {
return Some(Duration::from_millis(ms));
}
Some(Duration::from_secs(2))
}
fn extract_retry_seconds(body: &str) -> Option<u64> {
let re = regex::Regex::new(r"retry\s+(?:after\s+)?(\d+)\s*(?:s|sec|seconds?)").ok()?;
let caps = re.captures(body)?;
caps[1].parse().ok()
}
fn extract_retry_millis(body: &str) -> Option<u64> {
let re = regex::Regex::new(r"retry\s+(?:after\s+)?(\d+)\s*ms").ok()?;
let caps = re.captures(body)?;
caps[1].parse().ok()
}
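As a rough illustration of what `parse_retry_after` extracts, here is a std-only sketch that avoids the `regex` dependency. The function names `parse_retry_hint` and `parse_after_keyword` are hypothetical, and the matching is deliberately looser than the regexes in the diff: it lowercases the body first and does not require whitespace after "retry", both of which are assumptions made for this example. Unlike the diff's function, it returns `None` when no hint is found rather than falling back to 2 seconds.

```rust
use std::time::Duration;

/// Scan `body` for a "retry [after] <N><unit>" hint, where unit is "ms" or
/// something starting with "s" (s, sec, seconds).
fn parse_retry_hint(body: &str) -> Option<Duration> {
    // ASCII lowercasing keeps byte offsets identical to the original string.
    let lower = body.to_ascii_lowercase();
    let mut search_from = 0;
    while let Some(pos) = lower[search_from..].find("retry") {
        let start = search_from + pos + "retry".len();
        if let Some(delay) = parse_after_keyword(&lower[start..]) {
            return Some(delay);
        }
        // This occurrence had no usable number; look for a later one.
        search_from = start;
    }
    None
}

/// Parse the text immediately following a "retry" keyword.
fn parse_after_keyword(rest: &str) -> Option<Duration> {
    let rest = rest.trim_start();
    let rest = rest.strip_prefix("after").map(str::trim_start).unwrap_or(rest);
    let digits_len = rest.bytes().take_while(u8::is_ascii_digit).count();
    if digits_len == 0 {
        return None;
    }
    let value: u64 = rest[..digits_len].parse().ok()?;
    let unit = rest[digits_len..].trim_start();
    // Check "ms" before "s" so that milliseconds are not read as seconds.
    if unit.starts_with("ms") {
        Some(Duration::from_millis(value))
    } else if unit.starts_with('s') {
        Some(Duration::from_secs(value))
    } else {
        None
    }
}
```

A caller that wants the diff's behaviour could wrap the result with `.unwrap_or(Duration::from_secs(2))`.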


@@ -238,6 +238,8 @@ impl LlmDriver for GeminiDriver {
input_tokens, input_tokens,
output_tokens, output_tokens,
stop_reason: stop_reason.to_string(), stop_reason: stop_reason.to_string(),
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}); });
} }
} }
@@ -500,6 +502,8 @@ impl GeminiDriver {
input_tokens, input_tokens,
output_tokens, output_tokens,
stop_reason, stop_reason,
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
} }
} }
} }


@@ -238,6 +238,8 @@ impl LocalDriver {
input_tokens, input_tokens,
output_tokens, output_tokens,
stop_reason, stop_reason,
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
} }
} }
@@ -396,6 +398,8 @@ impl LlmDriver for LocalDriver {
input_tokens: 0, input_tokens: 0,
output_tokens: 0, output_tokens: 0,
stop_reason: "end_turn".to_string(), stop_reason: "end_turn".to_string(),
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}); });
continue; continue;
} }


@@ -15,11 +15,14 @@ mod anthropic;
mod openai; mod openai;
mod gemini; mod gemini;
mod local; mod local;
mod error_classifier;
mod retry_driver;
pub use anthropic::AnthropicDriver; pub use anthropic::AnthropicDriver;
pub use openai::OpenAiDriver; pub use openai::OpenAiDriver;
pub use gemini::GeminiDriver; pub use gemini::GeminiDriver;
pub use local::LocalDriver; pub use local::LocalDriver;
pub use retry_driver::{RetryDriver, RetryConfig};
/// LLM Driver trait - unified interface for all providers /// LLM Driver trait - unified interface for all providers
#[async_trait] #[async_trait]
@@ -106,6 +109,12 @@ pub struct CompletionResponse {
pub output_tokens: u32, pub output_tokens: u32,
/// Stop reason /// Stop reason
pub stop_reason: StopReason, pub stop_reason: StopReason,
/// Cache creation input tokens (Anthropic prompt caching)
#[serde(default)]
pub cache_creation_input_tokens: Option<u32>,
/// Cache read input tokens (Anthropic prompt caching)
#[serde(default)]
pub cache_read_input_tokens: Option<u32>,
} }
/// LLM driver response content block (subset of canonical zclaw_types::ContentBlock). /// LLM driver response content block (subset of canonical zclaw_types::ContentBlock).


@@ -222,10 +222,13 @@ impl LlmDriver for OpenAiDriver {
let parsed_args: serde_json::Value = if args.is_empty() { let parsed_args: serde_json::Value = if args.is_empty() {
serde_json::json!({}) serde_json::json!({})
} else { } else {
serde_json::from_str(args).unwrap_or_else(|e| { match serde_json::from_str(args) {
tracing::warn!("[OpenAI] Failed to parse tool args '{}': {}, using empty object", args, e); Ok(v) => v,
serde_json::json!({}) Err(e) => {
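}) tracing::error!("[OpenAI] Failed to parse tool call '{}' args: {}. Raw: {}", name, e, args.chars().take(200).collect::<String>());
// Truncate by chars rather than bytes: tool arguments may contain multi-byte text, and a byte slice could panic mid-character.
serde_json::json!({ "_parse_error": e.to_string(), "_raw_args": args.chars().take(500).collect::<String>() })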
}
}
}; };
yield Ok(StreamChunk::ToolUseEnd { yield Ok(StreamChunk::ToolUseEnd {
id: id.clone(), id: id.clone(),
@@ -237,6 +240,8 @@ impl LlmDriver for OpenAiDriver {
input_tokens: 0, input_tokens: 0,
output_tokens: 0, output_tokens: 0,
stop_reason: "end_turn".to_string(), stop_reason: "end_turn".to_string(),
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}); });
continue; continue;
} }
@@ -638,6 +643,8 @@ impl OpenAiDriver {
input_tokens, input_tokens,
output_tokens, output_tokens,
stop_reason, stop_reason,
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
} }
} }
@@ -761,6 +768,8 @@ impl OpenAiDriver {
StopReason::StopSequence => "stop", StopReason::StopSequence => "stop",
StopReason::Error => "error", StopReason::Error => "error",
}.to_string(), }.to_string(),
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}); });
}) })
} }


@@ -0,0 +1,123 @@
//! RetryDriver: a retry decorator for LlmDriver.
//! Used only on the local Kernel path; the SaaS Relay already has its own retry logic.
use std::sync::Arc;
use std::time::Duration;
use async_trait::async_trait;
use futures::Stream;
use rand::Rng;
use zclaw_types::{Result, ZclawError};
use super::{LlmDriver, CompletionRequest, CompletionResponse, StreamChunk};
use super::error_classifier::classify_llm_error;
/// Retry configuration
#[derive(Debug, Clone)]
pub struct RetryConfig {
pub max_attempts: u32,
pub base_delay_secs: f64,
pub max_delay_secs: f64,
pub jitter_ratio: f64,
}
impl Default for RetryConfig {
fn default() -> Self {
Self {
max_attempts: 3,
base_delay_secs: 1.0,
max_delay_secs: 8.0,
jitter_ratio: 0.5,
}
}
}
/// Retry decorator
pub struct RetryDriver {
inner: Arc<dyn LlmDriver>,
config: RetryConfig,
}
impl RetryDriver {
pub fn new(inner: Arc<dyn LlmDriver>, config: RetryConfig) -> Self {
Self { inner, config }
}
fn jittered_backoff(&self, attempt: u32) -> Duration {
let base = self.config.base_delay_secs * 2_f64.powi(attempt as i32);
let capped = base.min(self.config.max_delay_secs);
let mut rng = rand::thread_rng();
let jitter = capped * self.config.jitter_ratio * rng.gen::<f64>();
Duration::from_secs_f64(capped + jitter)
}
}
#[async_trait]
impl LlmDriver for RetryDriver {
fn provider(&self) -> &str {
self.inner.provider()
}
async fn complete(&self, request: CompletionRequest) -> Result<CompletionResponse> {
let mut last_error: Option<ZclawError> = None;
for attempt in 0..self.config.max_attempts {
match self.inner.complete(request.clone()).await {
Ok(response) => return Ok(response),
Err(e) => {
let message = e.to_string();
let status = extract_status_from_error(&message);
let classified = classify_llm_error(
self.inner.provider(),
status,
&message,
message.contains("timeout") || message.contains("Timeout"),
);
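// Check should_compress before retryable: the classifier marks context overflow
// as non-retryable, so testing retryable first would return early and the
// [CONTEXT_OVERFLOW] marker that AgentLoop looks for would never be attached.
if classified.should_compress {
return Err(ZclawError::LlmError(
format!("[CONTEXT_OVERFLOW] {}", message)
));
}
if !classified.retryable {
return Err(e);
}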
last_error = Some(e);
if attempt + 1 < self.config.max_attempts {
let delay = classified.retry_after
.unwrap_or_else(|| self.jittered_backoff(attempt));
tracing::warn!(
"[RetryDriver] Attempt {}/{} failed ({}), retrying in {:.1}s",
attempt + 1, self.config.max_attempts, classified.message,
delay.as_secs_f64()
);
tokio::time::sleep(delay).await;
}
}
}
}
Err(last_error.unwrap_or_else(|| ZclawError::LlmError("重试耗尽".to_string())))
}
fn stream(
&self,
request: CompletionRequest,
) -> std::pin::Pin<Box<dyn Stream<Item = Result<StreamChunk>> + Send + '_>> {
// The streaming path does not retry: some deltas have already been sent, so a retry would duplicate output in the UI
self.inner.stream(request)
}
fn is_configured(&self) -> bool {
self.inner.is_configured()
}
}
fn extract_status_from_error(message: &str) -> u16 {
let re = regex::Regex::new(r"(?:error|status)[:\s]+(\d{3})").ok();
re.and_then(|re| re.captures(message))
.and_then(|caps| caps[1].parse().ok())
.unwrap_or(0)
}
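The delay computation in `jittered_backoff` can be checked in isolation with a small sketch. `BackoffConfig` and `backoff_delay` are hypothetical names for this example; the random fraction is passed in as a parameter (instead of being drawn from `rand`) so that the result is deterministic and testable.

```rust
use std::time::Duration;

/// Hypothetical mirror of the delay-related fields of `RetryConfig`.
#[derive(Debug, Clone)]
struct BackoffConfig {
    base_delay_secs: f64,
    max_delay_secs: f64,
    jitter_ratio: f64,
}

/// Exponential backoff capped at `max_delay_secs`, with jitter added on top
/// of the capped value. `unit_random` is expected to be in [0, 1]; it is
/// supplied by the caller so the function stays deterministic.
fn backoff_delay(cfg: &BackoffConfig, attempt: u32, unit_random: f64) -> Duration {
    let exponential = cfg.base_delay_secs * 2_f64.powi(attempt as i32);
    let capped = exponential.min(cfg.max_delay_secs);
    let jitter = capped * cfg.jitter_ratio * unit_random.clamp(0.0, 1.0);
    Duration::from_secs_f64(capped + jitter)
}
```

Because the jitter is added after the cap, the longest possible delay is `max_delay_secs * (1 + jitter_ratio)`, which is 12 seconds with the defaults in the diff. With `max_attempts: 3` only two sleeps ever happen, falling in the ranges 1 to 1.5 seconds and 2 to 3 seconds.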


@@ -4,10 +4,11 @@ use std::sync::Arc;
use futures::StreamExt; use futures::StreamExt;
use tokio::sync::mpsc; use tokio::sync::mpsc;
use zclaw_types::{AgentId, SessionId, Message, Result}; use zclaw_types::{AgentId, SessionId, Message, Result};
use serde_json::Value;
use crate::driver::{LlmDriver, CompletionRequest, ContentBlock}; use crate::driver::{LlmDriver, CompletionRequest, ContentBlock};
use crate::stream::StreamChunk; use crate::stream::StreamChunk;
use crate::tool::{ToolRegistry, ToolContext, SkillExecutor, HandExecutor}; use crate::tool::{ToolRegistry, ToolContext, SkillExecutor, HandExecutor, ToolConcurrency};
use crate::tool::builtin::PathValidator; use crate::tool::builtin::PathValidator;
use crate::growth::GrowthIntegration; use crate::growth::GrowthIntegration;
use crate::compaction::{self, CompactionConfig}; use crate::compaction::{self, CompactionConfig};
@@ -303,8 +304,28 @@ impl AgentLoop {
plan_mode: self.plan_mode, plan_mode: self.plan_mode,
}; };
// Call LLM // Call LLM with context-overflow recovery
let response = self.driver.complete(request).await?; let response = match self.driver.complete(request).await {
Ok(r) => r,
Err(e) => {
let err_str = e.to_string();
if err_str.contains("[CONTEXT_OVERFLOW]") && self.compaction_threshold > 0 {
tracing::warn!("[AgentLoop] Context overflow detected, triggering emergency compaction");
let pruned = compaction::prune_tool_outputs(&mut messages);
if pruned > 0 {
tracing::info!("[AgentLoop] Emergency pruning removed {} tool outputs", pruned);
}
let keep_recent = messages.len().saturating_sub(messages.len() / 3);
let (compacted, removed) = compaction::compact_messages(messages, keep_recent.max(4));
if removed > 0 {
tracing::info!("[AgentLoop] Emergency compaction removed {} messages", removed);
messages = compacted;
continue; // retry the iteration with compacted messages
}
}
return Err(e);
}
};
total_input_tokens += response.input_tokens; total_input_tokens += response.input_tokens;
total_output_tokens += response.output_tokens; total_output_tokens += response.output_tokens;
@@ -375,21 +396,22 @@ impl AgentLoop {
let tool_context = self.create_tool_context(session_id.clone()); let tool_context = self.create_tool_context(session_id.clone());
let mut abort_result: Option<AgentLoopResult> = None; let mut abort_result: Option<AgentLoopResult> = None;
let mut clarification_result: Option<AgentLoopResult> = None; let mut clarification_result: Option<AgentLoopResult> = None;
for (id, name, input) in tool_calls {
// Check if loop was already aborted // Phase 1: Pre-process inputs + middleware checks (serial)
if abort_result.is_some() { struct ToolPlan {
break; idx: usize,
id: String,
name: String,
input: Value,
} }
let mut plans: Vec<ToolPlan> = Vec::new();
for (idx, (id, name, input)) in tool_calls.into_iter().enumerate() {
if abort_result.is_some() { break; }
// GLM and other models sometimes send tool calls with empty arguments `{}` // GLM and other models sometimes send tool calls with empty arguments `{}`
// Inject the last user message as a fallback query so the tool can infer intent.
let input = if input.as_object().map_or(false, |obj| obj.is_empty()) { let input = if input.as_object().map_or(false, |obj| obj.is_empty()) {
if let Some(last_user_msg) = messages.iter().rev().find_map(|m| { if let Some(last_user_msg) = messages.iter().rev().find_map(|m| {
if let Message::User { content } = m { if let Message::User { content } = m { Some(content.clone()) } else { None }
Some(content.clone())
} else {
None
}
}) { }) {
tracing::info!("[AgentLoop] Tool '{}' received empty input, injecting user message as fallback query", name); tracing::info!("[AgentLoop] Tool '{}' received empty input, injecting user message as fallback query", name);
serde_json::json!({ "_fallback_query": last_user_msg }) serde_json::json!({ "_fallback_query": last_user_msg })
@@ -400,9 +422,7 @@ impl AgentLoop {
input input
}; };
// Check tool call safety — via middleware chain let mw_ctx = middleware::MiddlewareContext {
{
let mw_ctx_ref = middleware::MiddlewareContext {
agent_id: self.agent_id.clone(), agent_id: self.agent_id.clone(),
session_id: session_id.clone(), session_id: session_id.clone(),
user_input: input.to_string(), user_input: input.to_string(),
@@ -412,29 +432,16 @@ impl AgentLoop {
input_tokens: total_input_tokens, input_tokens: total_input_tokens,
output_tokens: total_output_tokens, output_tokens: total_output_tokens,
}; };
match self.middleware_chain.run_before_tool_call(&mw_ctx_ref, &name, &input).await? { match self.middleware_chain.run_before_tool_call(&mw_ctx, &name, &input).await? {
middleware::ToolCallDecision::Allow => {} middleware::ToolCallDecision::Allow => {
plans.push(ToolPlan { idx, id, name, input });
}
middleware::ToolCallDecision::Block(msg) => { middleware::ToolCallDecision::Block(msg) => {
tracing::warn!("[AgentLoop] Tool '{}' blocked by middleware: {}", name, msg); tracing::warn!("[AgentLoop] Tool '{}' blocked by middleware: {}", name, msg);
let error_output = serde_json::json!({ "error": msg }); messages.push(Message::tool_result(&id, zclaw_types::ToolId::new(&name), serde_json::json!({ "error": msg }), true));
messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), error_output, true));
continue;
} }
middleware::ToolCallDecision::ReplaceInput(new_input) => { middleware::ToolCallDecision::ReplaceInput(new_input) => {
// Execute with replaced input (with timeout) plans.push(ToolPlan { idx, id, name, input: new_input });
let tool_result = match tokio::time::timeout(
std::time::Duration::from_secs(30),
self.execute_tool(&name, new_input, &tool_context),
).await {
Ok(Ok(result)) => result,
Ok(Err(e)) => serde_json::json!({ "error": e.to_string() }),
Err(_) => {
tracing::warn!("[AgentLoop] Tool '{}' (replaced input) timed out after 30s", name);
serde_json::json!({ "error": format!("工具 '{}' 执行超时30秒请重试", name) })
}
};
messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), tool_result, false));
continue;
} }
middleware::ToolCallDecision::AbortLoop(reason) => { middleware::ToolCallDecision::AbortLoop(reason) => {
tracing::warn!("[AgentLoop] Loop aborted by middleware: {}", reason); tracing::warn!("[AgentLoop] Loop aborted by middleware: {}", reason);
@@ -450,21 +457,76 @@ impl AgentLoop {
} }
} }
// Phase 2: Execute tools (parallel for ReadOnly, serial for others)
if abort_result.is_none() && !plans.is_empty() {
let (parallel_plans, sequential_plans): (Vec<_>, Vec<_>) = plans.iter()
.partition(|p| {
self.tools.get(&p.name)
.map(|t| t.concurrency())
.unwrap_or(ToolConcurrency::Exclusive) == ToolConcurrency::ReadOnly
});
let mut results: std::collections::HashMap<usize, (String, String, serde_json::Value)> = std::collections::HashMap::new();
// Execute parallel (ReadOnly) tools with JoinSet (max 3 concurrent)
if !parallel_plans.is_empty() {
let semaphore = Arc::new(tokio::sync::Semaphore::new(3));
let mut join_set = tokio::task::JoinSet::new();
for plan in &parallel_plans {
let tool = self.tools.get(&plan.name).unwrap();
let ctx = tool_context.clone();
let input = plan.input.clone();
let idx = plan.idx;
let id = plan.id.clone();
let name = plan.name.clone();
let permit = semaphore.clone().acquire_owned().await.unwrap();
join_set.spawn(async move {
let result = tokio::time::timeout(
std::time::Duration::from_secs(30),
tool.execute(input, &ctx)
).await;
drop(permit);
(idx, id, name, result)
});
}
while let Some(res) = join_set.join_next().await {
match res {
Ok((idx, id, name, Ok(Ok(value)))) => {
results.insert(idx, (id, name, value));
}
Ok((idx, id, name, Ok(Err(e)))) => {
results.insert(idx, (id, name, serde_json::json!({ "error": e.to_string() })));
}
Ok((idx, id, name, Err(_))) => {
tracing::warn!("[AgentLoop] Tool '{}' timed out after 30s (parallel)", name);
results.insert(idx, (id, name.clone(), serde_json::json!({ "error": format!("工具 '{}' 执行超时30秒请重试", name) })));
}
Err(e) => {
tracing::warn!("[AgentLoop] JoinError in parallel tool execution: {}", e);
}
}
}
}
// Execute sequential (Exclusive/Interactive) tools
for plan in &sequential_plans {
let tool_result = match tokio::time::timeout( let tool_result = match tokio::time::timeout(
std::time::Duration::from_secs(30), std::time::Duration::from_secs(30),
self.execute_tool(&name, input, &tool_context), self.execute_tool(&plan.name, plan.input.clone(), &tool_context),
).await { ).await {
Ok(Ok(result)) => result, Ok(Ok(result)) => result,
Ok(Err(e)) => serde_json::json!({ "error": e.to_string() }), Ok(Err(e)) => serde_json::json!({ "error": e.to_string() }),
Err(_) => { Err(_) => {
tracing::warn!("[AgentLoop] Tool '{}' timed out after 30s", name); tracing::warn!("[AgentLoop] Tool '{}' timed out after 30s", plan.name);
serde_json::json!({ "error": format!("工具 '{}' 执行超时30秒请重试", name) }) serde_json::json!({ "error": format!("工具 '{}' 执行超时30秒请重试", plan.name) })
} }
}; };
// Check if this is a clarification response — terminate loop immediately // Check if this is a clarification response
// so the LLM waits for user input instead of continuing to generate. if plan.name == "ask_clarification"
if name == "ask_clarification"
&& tool_result.get("status").and_then(|v| v.as_str()) == Some("clarification_needed") && tool_result.get("status").and_then(|v| v.as_str()) == Some("clarification_needed")
{ {
tracing::info!("[AgentLoop] Clarification requested, terminating loop"); tracing::info!("[AgentLoop] Clarification requested, terminating loop");
@@ -472,12 +534,7 @@ impl AgentLoop {
.and_then(|v| v.as_str()) .and_then(|v| v.as_str())
.unwrap_or("需要更多信息") .unwrap_or("需要更多信息")
.to_string(); .to_string();
messages.push(Message::tool_result( results.insert(plan.idx, (plan.id.clone(), plan.name.clone(), tool_result));
id,
zclaw_types::ToolId::new(&name),
tool_result,
false,
));
self.memory.append_message(&session_id, &Message::assistant(&question)).await?; self.memory.append_message(&session_id, &Message::assistant(&question)).await?;
clarification_result = Some(AgentLoopResult { clarification_result = Some(AgentLoopResult {
response: question, response: question,
@@ -487,14 +544,30 @@ impl AgentLoop {
}); });
break; break;
} }
results.insert(plan.idx, (plan.id.clone(), plan.name.clone(), tool_result));
}
// Add tool result to messages // Push results in original tool_call order
messages.push(Message::tool_result( let mut sorted_indices: Vec<usize> = results.keys().copied().collect();
id, sorted_indices.sort();
zclaw_types::ToolId::new(&name), for idx in sorted_indices {
tool_result, let (id, name, result) = results.remove(&idx).unwrap();
false, // is_error - we include errors in the result itself // Run after_tool_call middleware (error counting, output guard, etc.)
)); let mut mw_ctx = middleware::MiddlewareContext {
agent_id: self.agent_id.clone(),
session_id: session_id.clone(),
user_input: String::new(),
system_prompt: enhanced_prompt.clone(),
messages: messages.clone(),
response_content: Vec::new(),
input_tokens: total_input_tokens,
output_tokens: total_output_tokens,
};
if let Err(e) = self.middleware_chain.run_after_tool_call(&mut mw_ctx, &name, &result).await {
tracing::warn!("[AgentLoop] after_tool_call middleware failed for '{}': {}", name, e);
}
messages.push(Message::tool_result(&id, zclaw_types::ToolId::new(&name), result, false));
}
} }
// Continue the loop - LLM will process tool results and generate final response // Continue the loop - LLM will process tool results and generate final response
@@ -647,6 +720,7 @@ impl AgentLoop {
let mut stream = driver.stream(request); let mut stream = driver.stream(request);
let mut pending_tool_calls: Vec<(String, String, serde_json::Value)> = Vec::new(); let mut pending_tool_calls: Vec<(String, String, serde_json::Value)> = Vec::new();
let mut completed_tool_ids: std::collections::HashSet<String> = std::collections::HashSet::new();
let mut iteration_text = String::new(); let mut iteration_text = String::new();
let mut reasoning_text = String::new(); // Track reasoning separately for API requirement let mut reasoning_text = String::new(); // Track reasoning separately for API requirement
@@ -703,6 +777,7 @@ impl AgentLoop {
// Update with final parsed input and emit ToolStart event // Update with final parsed input and emit ToolStart event
if let Some(tool) = pending_tool_calls.iter_mut().find(|(tid, _, _)| tid == id) { if let Some(tool) = pending_tool_calls.iter_mut().find(|(tid, _, _)| tid == id) {
tool.2 = input.clone(); tool.2 = input.clone();
completed_tool_ids.insert(id.clone());
if let Err(e) = tx.send(LoopEvent::ToolStart { name: tool.1.clone(), input: input.clone() }).await { if let Err(e) = tx.send(LoopEvent::ToolStart { name: tool.1.clone(), input: input.clone() }).await {
tracing::warn!("[AgentLoop] Failed to send ToolStart event: {}", e); tracing::warn!("[AgentLoop] Failed to send ToolStart event: {}", e);
} }
@@ -810,11 +885,27 @@ impl AgentLoop {
break 'outer; break 'outer;
} }
// Skip tool processing if stream errored or timed out // Handle stream errors — execute complete tool calls, cancel incomplete ones
if stream_errored { if stream_errored {
tracing::debug!("[AgentLoop] Stream errored, skipping tool processing and breaking"); // Cancel incomplete tools (ToolStart sent but ToolUseEnd not received)
let incomplete: Vec<_> = pending_tool_calls.iter()
.filter(|(id, _, _)| !completed_tool_ids.contains(id))
.collect();
for (_, name, _) in &incomplete {
tracing::warn!("[AgentLoop] Cancelling incomplete tool '{}' due to stream error", name);
let error_output = serde_json::json!({ "error": "Streaming response was interrupted; the tool call did not complete" });
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: error_output }).await {
tracing::warn!("[AgentLoop] Failed to send cancellation ToolEnd event: {}", e);
}
}
// Retain only complete tools for execution
pending_tool_calls.retain(|(id, _, _)| completed_tool_ids.contains(id));
if pending_tool_calls.is_empty() {
tracing::debug!("[AgentLoop] Stream errored with no complete tool calls, breaking");
break 'outer; break 'outer;
} }
tracing::info!("[AgentLoop] Stream errored but executing {} complete tool calls", pending_tool_calls.len());
}
tracing::debug!("[AgentLoop] Processing {} tool calls (reasoning: {} chars)", pending_tool_calls.len(), reasoning_text.len()); tracing::debug!("[AgentLoop] Processing {} tool calls (reasoning: {} chars)", pending_tool_calls.len(), reasoning_text.len());
@@ -830,12 +921,12 @@ impl AgentLoop {
messages.push(Message::tool_use(id, zclaw_types::ToolId::new(name), input.clone())); messages.push(Message::tool_use(id, zclaw_types::ToolId::new(name), input.clone()));
} }
// Execute tools // Execute tools — Phase 1: Pre-process through middleware (serial)
for (id, name, input) in pending_tool_calls { struct StreamToolPlan { idx: usize, id: String, name: String, input: Value }
tracing::debug!("[AgentLoop] Executing tool: name={}, input={:?}", name, input); let mut plans: Vec<StreamToolPlan> = Vec::new();
let mut abort_loop = false;
// Check tool call safety — via middleware chain for (idx, (id, name, input)) in pending_tool_calls.into_iter().enumerate() {
{ if abort_loop { break; }
let mw_ctx = middleware::MiddlewareContext { let mw_ctx = middleware::MiddlewareContext {
agent_id: agent_id.clone(), agent_id: agent_id.clone(),
session_id: session_id_clone.clone(), session_id: session_id_clone.clone(),
@@ -847,7 +938,9 @@ impl AgentLoop {
output_tokens: total_output_tokens, output_tokens: total_output_tokens,
}; };
match middleware_chain.run_before_tool_call(&mw_ctx, &name, &input).await { match middleware_chain.run_before_tool_call(&mw_ctx, &name, &input).await {
Ok(middleware::ToolCallDecision::Allow) => {} Ok(middleware::ToolCallDecision::Allow) => {
plans.push(StreamToolPlan { idx, id, name, input });
}
Ok(middleware::ToolCallDecision::Block(msg)) => { Ok(middleware::ToolCallDecision::Block(msg)) => {
tracing::warn!("[AgentLoop] Tool '{}' blocked by middleware: {}", name, msg); tracing::warn!("[AgentLoop] Tool '{}' blocked by middleware: {}", name, msg);
let error_output = serde_json::json!({ "error": msg }); let error_output = serde_json::json!({ "error": msg });
@@ -855,59 +948,16 @@ impl AgentLoop {
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e); tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
} }
messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), error_output, true)); messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), error_output, true));
continue; }
Ok(middleware::ToolCallDecision::ReplaceInput(new_input)) => {
plans.push(StreamToolPlan { idx, id, name, input: new_input });
} }
Ok(middleware::ToolCallDecision::AbortLoop(reason)) => { Ok(middleware::ToolCallDecision::AbortLoop(reason)) => {
tracing::warn!("[AgentLoop] Loop aborted by middleware: {}", reason); tracing::warn!("[AgentLoop] Loop aborted by middleware: {}", reason);
if let Err(e) = tx.send(LoopEvent::Error(reason)).await { if let Err(e) = tx.send(LoopEvent::Error(reason)).await {
tracing::warn!("[AgentLoop] Failed to send Error event: {}", e); tracing::warn!("[AgentLoop] Failed to send Error event: {}", e);
} }
break 'outer; abort_loop = true;
}
Ok(middleware::ToolCallDecision::ReplaceInput(new_input)) => {
// Execute with replaced input (same path_validator logic below)
let pv = path_validator.clone().unwrap_or_else(|| {
let home = std::env::var("USERPROFILE")
.or_else(|_| std::env::var("HOME"))
.unwrap_or_else(|_| ".".to_string());
PathValidator::new().with_workspace(std::path::PathBuf::from(&home))
});
let working_dir = pv.workspace_root()
.map(|p| p.to_string_lossy().to_string());
let tool_context = ToolContext {
agent_id: agent_id.clone(),
working_directory: working_dir,
session_id: Some(session_id_clone.to_string()),
skill_executor: skill_executor.clone(),
hand_executor: hand_executor.clone(),
path_validator: Some(pv),
event_sender: Some(tx.clone()),
};
let (result, is_error) = if let Some(tool) = tools.get(&name) {
match tool.execute(new_input, &tool_context).await {
Ok(output) => {
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: output.clone() }).await {
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
}
(output, false)
}
Err(e) => {
let error_output = serde_json::json!({ "error": e.to_string() });
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: error_output.clone() }).await {
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
}
(error_output, true)
}
}
} else {
let error_output = serde_json::json!({ "error": format!("Unknown tool: {}", name) });
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: error_output.clone() }).await {
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
}
(error_output, true)
};
messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), result, is_error));
continue;
} }
Err(e) => { Err(e) => {
tracing::error!("[AgentLoop] Middleware error for tool '{}': {}", name, e); tracing::error!("[AgentLoop] Middleware error for tool '{}': {}", name, e);
@@ -916,19 +966,23 @@ impl AgentLoop {
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e); tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
} }
messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), error_output, true)); messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), error_output, true));
continue;
} }
} }
} }
// Use pre-resolved path_validator (already has default fallback from create_tool_context logic) if abort_loop { break 'outer; }
if plans.is_empty() {
tracing::debug!("[AgentLoop] No tools to execute after middleware filtering");
break 'outer;
}
// Build shared tool context
let pv = path_validator.clone().unwrap_or_else(|| { let pv = path_validator.clone().unwrap_or_else(|| {
let home = std::env::var("USERPROFILE") let home = std::env::var("USERPROFILE")
.or_else(|_| std::env::var("HOME")) .or_else(|_| std::env::var("HOME"))
.unwrap_or_else(|_| ".".to_string()); .unwrap_or_else(|_| ".".to_string());
PathValidator::new().with_workspace(std::path::PathBuf::from(&home)) PathValidator::new().with_workspace(std::path::PathBuf::from(&home))
}); });
let working_dir = pv.workspace_root() let working_dir = pv.workspace_root().map(|p| p.to_string_lossy().to_string());
.map(|p| p.to_string_lossy().to_string());
let tool_context = ToolContext { let tool_context = ToolContext {
agent_id: agent_id.clone(), agent_id: agent_id.clone(),
working_directory: working_dir, working_directory: working_dir,
@@ -939,78 +993,120 @@ impl AgentLoop {
event_sender: Some(tx.clone()), event_sender: Some(tx.clone()),
}; };
let (result, is_error) = if let Some(tool) = tools.get(&name) { // Phase 2: Execute tools (parallel for ReadOnly, serial for others)
tracing::debug!("[AgentLoop] Tool '{}' found, executing...", name); let (parallel_plans, sequential_plans): (Vec<_>, Vec<_>) = plans.iter()
match tool.execute(input.clone(), &tool_context).await { .partition(|p| {
Ok(output) => { tools.get(&p.name)
tracing::debug!("[AgentLoop] Tool '{}' executed successfully: {:?}", name, output); .map(|t| t.concurrency())
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: output.clone() }).await { .unwrap_or(ToolConcurrency::Exclusive) == ToolConcurrency::ReadOnly
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e); });
let mut results: std::collections::HashMap<usize, (String, String, serde_json::Value, bool)> = std::collections::HashMap::new();
// Execute parallel (ReadOnly) tools with JoinSet (max 3 concurrent)
if !parallel_plans.is_empty() {
let sem = Arc::new(tokio::sync::Semaphore::new(3));
let mut join_set = tokio::task::JoinSet::new();
for plan in &parallel_plans {
let tool_ctx = tool_context.clone();
let input = plan.input.clone();
let idx = plan.idx;
let id = plan.id.clone();
let name = plan.name.clone();
let tools_ref = tools.clone();
let permit = sem.clone().acquire_owned().await.unwrap();
join_set.spawn(async move {
let result = if let Some(tool) = tools_ref.get(&name) {
tokio::time::timeout(std::time::Duration::from_secs(30), tool.execute(input, &tool_ctx)).await
} else {
Ok(Err(zclaw_types::ZclawError::Internal(format!("Unknown tool: {}", name))))
};
drop(permit);
(idx, id, name, result)
});
} }
(output, false) while let Some(res) = join_set.join_next().await {
match res {
Ok((idx, id, name, Ok(Ok(value)))) => {
results.insert(idx, (id, name, value, false));
}
Ok((idx, id, name, Ok(Err(e)))) => {
results.insert(idx, (id, name, serde_json::json!({ "error": e.to_string() }), true));
}
Ok((idx, id, name, Err(_))) => {
tracing::warn!("[AgentLoop] Tool '{}' timed out (parallel, 30s)", name);
results.insert(idx, (id, name.clone(), serde_json::json!({ "error": format!("Tool '{}' timed out", name) }), true));
} }
Err(e) => { Err(e) => {
tracing::error!("[AgentLoop] Tool '{}' execution failed: {}", name, e); tracing::warn!("[AgentLoop] JoinError in parallel tool execution: {}", e);
let error_output = serde_json::json!({ "error": e.to_string() });
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: error_output.clone() }).await {
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
} }
(error_output, true)
} }
} }
}
// Execute sequential (Exclusive/Interactive) tools
for plan in &sequential_plans {
let (result, is_error) = if let Some(tool) = tools.get(&plan.name) {
match tool.execute(plan.input.clone(), &tool_context).await {
Ok(output) => (output, false),
Err(e) => (serde_json::json!({ "error": e.to_string() }), true),
}
} else { } else {
tracing::error!("[AgentLoop] Tool '{}' not found in registry", name); (serde_json::json!({ "error": format!("Unknown tool: {}", plan.name) }), true)
let error_output = serde_json::json!({ "error": format!("Unknown tool: {}", name) });
if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: error_output.clone() }).await {
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
}
(error_output, true)
}; };
// Check if this is a clarification response — break outer loop // Check clarification (only from sequential tools — ask_clarification is Interactive)
if name == "ask_clarification" if plan.name == "ask_clarification"
&& result.get("status").and_then(|v| v.as_str()) == Some("clarification_needed") && result.get("status").and_then(|v| v.as_str()) == Some("clarification_needed")
{ {
tracing::info!("[AgentLoop] Streaming: Clarification requested, terminating loop"); tracing::info!("[AgentLoop] Streaming: Clarification requested, terminating loop");
let question = result.get("question") let question = result.get("question").and_then(|v| v.as_str()).unwrap_or("More information is needed").to_string();
.and_then(|v| v.as_str()) messages.push(Message::tool_result(plan.id.clone(), zclaw_types::ToolId::new(&plan.name), result, is_error));
.unwrap_or("More information is needed") if let Err(e) = tx.send(LoopEvent::Delta(question.clone())).await { tracing::warn!("{}", e); }
.to_string(); if let Err(e) = tx.send(LoopEvent::Complete(AgentLoopResult { response: question.clone(), input_tokens: total_input_tokens, output_tokens: total_output_tokens, iterations: iteration })).await { tracing::warn!("{}", e); }
messages.push(Message::tool_result( if let Err(e) = memory.append_message(&session_id_clone, &Message::assistant(&question)).await { tracing::warn!("{}", e); }
id,
zclaw_types::ToolId::new(&name),
result,
is_error,
));
// Send the question as final delta so the user sees it
if let Err(e) = tx.send(LoopEvent::Delta(question.clone())).await {
tracing::warn!("[AgentLoop] Failed to send Delta event: {}", e);
}
if let Err(e) = tx.send(LoopEvent::Complete(AgentLoopResult {
response: question.clone(),
input_tokens: total_input_tokens,
output_tokens: total_output_tokens,
iterations: iteration,
})).await {
tracing::warn!("[AgentLoop] Failed to send Complete event: {}", e);
}
if let Err(e) = memory.append_message(&session_id_clone, &Message::assistant(&question)).await {
tracing::warn!("[AgentLoop] Failed to save clarification message: {}", e);
}
break 'outer; break 'outer;
} }
results.insert(plan.idx, (plan.id.clone(), plan.name.clone(), result, is_error));
}
// Add tool result to message history // Phase 3: after_tool_call middleware + push results in original order
tracing::debug!("[AgentLoop] Adding tool_result to history: id={}, name={}, is_error={}", id, name, is_error); let mut sorted_indices: Vec<usize> = results.keys().copied().collect();
messages.push(Message::tool_result( sorted_indices.sort();
id, for idx in sorted_indices {
zclaw_types::ToolId::new(&name), let (id, name, result, is_error) = results.remove(&idx).unwrap();
result,
is_error, // Emit ToolEnd event
)); if let Err(e) = tx.send(LoopEvent::ToolEnd { name: name.clone(), output: result.clone() }).await {
tracing::warn!("[AgentLoop] Failed to send ToolEnd event: {}", e);
}
// Run after_tool_call middleware
{
let mut mw_ctx = middleware::MiddlewareContext {
agent_id: agent_id.clone(),
session_id: session_id_clone.clone(),
user_input: String::new(),
system_prompt: enhanced_prompt.clone(),
messages: messages.clone(),
response_content: Vec::new(),
input_tokens: total_input_tokens,
output_tokens: total_output_tokens,
};
if let Err(e) = middleware_chain.run_after_tool_call(&mut mw_ctx, &name, &result).await {
tracing::warn!("[AgentLoop] after_tool_call middleware failed for '{}': {}", name, e);
}
}
messages.push(Message::tool_result(id, zclaw_types::ToolId::new(&name), result, is_error));
} }
tracing::debug!("[AgentLoop] Continuing to next iteration for LLM to process tool results"); tracing::debug!("[AgentLoop] Continuing to next iteration for LLM to process tool results");
// If stream errored, we executed complete tools but cannot continue the LLM loop
if stream_errored {
tracing::info!("[AgentLoop] Stream was errored — executed salvageable tools, now breaking");
break 'outer;
}
// Continue loop - next iteration will call LLM with tool results // Continue loop - next iteration will call LLM with tool results
} }
}); });
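
The three-phase flow in the streaming path above (partition by concurrency class, run, then push results in original tool_call order) can be sketched with the standard library alone. This is a simplified illustration, not the code from the commit: `Plan`, `Concurrency`, `run_tool`, and `execute_in_order` are hypothetical names, scoped threads stand in for the `JoinSet` + `Semaphore(3)` used in the diff, there is no concurrency cap or timeout, and a `BTreeMap` replaces the diff's `HashMap` plus sorted key list.

```rust
use std::collections::BTreeMap;
use std::thread;

/// Hypothetical stand-in for the diff's `ToolConcurrency`.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Concurrency {
    ReadOnly,
    Exclusive,
}

/// Hypothetical stand-in for `StreamToolPlan`; `idx` is the position of the
/// call in the original tool_call list.
#[derive(Clone, Debug)]
pub struct Plan {
    pub idx: usize,
    pub name: String,
    pub concurrency: Concurrency,
}

/// Toy tool body (an assumption for illustration): echoes the tool name.
fn run_tool(plan: &Plan) -> String {
    format!("ran {}", plan.name)
}

/// Runs ReadOnly plans concurrently and all other plans serially, then
/// returns the outputs in original call order, whatever the completion order.
pub fn execute_in_order(plans: &[Plan]) -> Vec<(usize, String)> {
    // Split the plans by concurrency class.
    let (parallel, sequential): (Vec<&Plan>, Vec<&Plan>) = plans
        .iter()
        .partition(|p| p.concurrency == Concurrency::ReadOnly);

    // Results keyed by original index; a BTreeMap iterates in key order.
    let mut results: BTreeMap<usize, String> = BTreeMap::new();

    // Run the ReadOnly group on scoped threads and collect every result.
    thread::scope(|scope| {
        let handles: Vec<_> = parallel
            .iter()
            .copied()
            .map(|p| scope.spawn(move || (p.idx, run_tool(p))))
            .collect();
        for handle in handles {
            let (idx, out) = handle.join().expect("tool thread panicked");
            results.insert(idx, out);
        }
    });

    // Run the remaining plans one at a time.
    for p in sequential {
        results.insert(p.idx, run_tool(p));
    }

    // Emit in original tool_call order.
    results.into_iter().collect()
}
```

Keying results by `idx` is what lets the parallel group finish in any order without disturbing the order in which tool results are appended to the message history.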


@@ -12,6 +12,13 @@
//! | 200-399 | Capability | SkillIndex, Guardrail | //! | 200-399 | Capability | SkillIndex, Guardrail |
//! | 400-599 | Safety | LoopGuard, Guardrail | //! | 400-599 | Safety | LoopGuard, Guardrail |
//! | 600-799 | Telemetry | TokenCalibration, Tracking | //! | 600-799 | Telemetry | TokenCalibration, Tracking |
//!
//! # Wave parallelization
//!
//! `before_completion` middlewares that only modify `system_prompt` (not `messages`)
//! can declare `parallel_safe() == true`. The chain runs consecutive parallel-safe
//! middlewares concurrently, merging their prompt contributions. This reduces
//! sequential latency for the context-injection phase.
use std::sync::Arc; use std::sync::Arc;
use async_trait::async_trait; use async_trait::async_trait;
@@ -50,6 +57,7 @@ pub enum ToolCallDecision {
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
/// Carries the mutable state that middleware may inspect or modify. /// Carries the mutable state that middleware may inspect or modify.
#[derive(Clone)]
pub struct MiddlewareContext { pub struct MiddlewareContext {
/// The agent that owns this loop. /// The agent that owns this loop.
pub agent_id: AgentId, pub agent_id: AgentId,
@@ -101,6 +109,15 @@ pub trait AgentMiddleware: Send + Sync {
500 500
} }
/// Whether `before_completion` is safe to run concurrently with other
/// parallel-safe middlewares. Only return `true` if the middleware:
/// - Only modifies `ctx.system_prompt` (never `ctx.messages`)
/// - Does not depend on prompt modifications from other middlewares
/// - Does not return `MiddlewareDecision::Stop`
fn parallel_safe(&self) -> bool {
false
}
/// Hook executed **before** the LLM completion request is sent. /// Hook executed **before** the LLM completion request is sent.
/// ///
/// Use this to inject context (memory, skill index, etc.) or to /// Use this to inject context (memory, skill index, etc.) or to
@@ -163,9 +180,66 @@ impl MiddlewareChain {
self.middlewares.insert(pos, mw); self.middlewares.insert(pos, mw);
} }
/// Run all `before_completion` hooks in order. /// Run all `before_completion` hooks with wave-based parallelization.
///
/// Consecutive `parallel_safe` middlewares run concurrently — each gets
/// its own cloned context and appends to `system_prompt` independently.
/// Their contributions are merged after all complete. Non-parallel-safe
/// middlewares (and non-consecutive ones) run sequentially as before.
pub async fn run_before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision> { pub async fn run_before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision> {
for mw in &self.middlewares { let mut idx = 0;
while idx < self.middlewares.len() {
// Find the extent of consecutive parallel-safe middlewares
let wave_start = idx;
let mut wave_end = idx;
while wave_end < self.middlewares.len()
&& self.middlewares[wave_end].parallel_safe()
{
wave_end += 1;
}
if wave_end - wave_start >= 2 {
// Run parallel wave (2+ consecutive parallel-safe middlewares)
let base_prompt_len = ctx.system_prompt.len();
let wave = &self.middlewares[wave_start..wave_end];
// Spawn concurrent tasks — each owns its cloned context + Arc ref to middleware
let mut join_handles = Vec::with_capacity(wave.len());
for mw in wave.iter() {
let mut ctx_clone = ctx.clone();
let mw_arc = Arc::clone(mw);
join_handles.push(tokio::spawn(async move {
let result = mw_arc.before_completion(&mut ctx_clone).await;
(result, ctx_clone.system_prompt)
}));
}
// Await all and merge prompt contributions
for (i, handle) in join_handles.into_iter().enumerate() {
let (result, modified_prompt): (Result<MiddlewareDecision>, String) = handle.await
.map_err(|e| zclaw_types::ZclawError::Internal(format!("Parallel middleware panicked: {}", e)))?;
match result? {
MiddlewareDecision::Continue => {}
MiddlewareDecision::Stop(reason) => {
tracing::info!(
"[MiddlewareChain] '{}' requested stop: {}",
self.middlewares[wave_start + i].name(),
reason
);
return Ok(MiddlewareDecision::Stop(reason));
}
}
// Merge system_prompt contribution from this clone
if modified_prompt.len() > base_prompt_len {
let contribution = &modified_prompt[base_prompt_len..];
ctx.system_prompt.push_str(contribution);
}
}
idx = wave_end;
} else {
// Run single middleware sequentially
let mw = &self.middlewares[idx];
match mw.before_completion(ctx).await? { match mw.before_completion(ctx).await? {
MiddlewareDecision::Continue => {} MiddlewareDecision::Continue => {}
MiddlewareDecision::Stop(reason) => { MiddlewareDecision::Stop(reason) => {
@@ -173,6 +247,8 @@ impl MiddlewareChain {
return Ok(MiddlewareDecision::Stop(reason)); return Ok(MiddlewareDecision::Stop(reason));
} }
} }
idx += 1;
}
} }
Ok(MiddlewareDecision::Continue) Ok(MiddlewareDecision::Continue)
} }
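
The wave logic in `run_before_completion` has two separable parts: deciding which consecutive middlewares form a wave, and merging the prompt text each clone appended. A minimal std-only sketch of both follows. `plan_waves` and `merge_prompts` are hypothetical helper names, not functions from the commit, and the merge uses `strip_prefix` rather than the diff's length-based slice so that the append-only assumption is explicit.

```rust
use std::ops::Range;

/// Splits a priority-ordered chain into execution steps. A run of two or
/// more consecutive parallel-safe entries becomes one wave; every other
/// entry becomes a single-element step that runs sequentially.
pub fn plan_waves(parallel_safe: &[bool]) -> Vec<Range<usize>> {
    let mut steps = Vec::new();
    let mut idx = 0;
    while idx < parallel_safe.len() {
        // Find the extent of consecutive parallel-safe entries.
        let mut end = idx;
        while end < parallel_safe.len() && parallel_safe[end] {
            end += 1;
        }
        if end - idx >= 2 {
            steps.push(idx..end);
            idx = end;
        } else {
            // A lone parallel-safe entry, or a non-safe one, runs alone.
            steps.push(idx..idx + 1);
            idx += 1;
        }
    }
    steps
}

/// Merges prompts produced by clones that each started from `base`.
/// Only the text appended after `base` is kept, in chain order. A clone
/// that rewrote the prefix instead of appending contributes nothing here.
pub fn merge_prompts(base: &str, modified: &[String]) -> String {
    let mut merged = base.to_string();
    for prompt in modified {
        if let Some(suffix) = prompt.strip_prefix(base) {
            merged.push_str(suffix);
        }
    }
    merged
}
```

Because each clone starts from the same base prompt, a middleware inside a wave never sees what its neighbours appended, which is why the trait documentation requires parallel-safe middlewares not to depend on each other's prompt modifications.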


@@ -290,6 +290,8 @@ impl AgentMiddleware for ButlerRouterMiddleware {
80 80
} }
fn parallel_safe(&self) -> bool { true }
async fn before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision> { async fn before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision> {
// Only route on the first user message in a turn (not tool results) // Only route on the first user message in a turn (not tool results)
let user_input = &ctx.user_input; let user_input = &ctx.user_input;


@@ -1,21 +1,49 @@
//! Compaction middleware — wraps the existing compaction module. //! Compaction middleware — wraps the existing compaction module.
//!
//! Supports debounce (cooldown + min-round checks), async LLM compression
//! with cached fallback, and iterative summaries that carry forward key info.
use async_trait::async_trait; use async_trait::async_trait;
use zclaw_types::Result; use std::sync::atomic::{AtomicU64, Ordering};
use crate::middleware::{AgentMiddleware, MiddlewareContext, MiddlewareDecision};
use crate::compaction::{self, CompactionConfig};
use crate::growth::GrowthIntegration;
use crate::driver::LlmDriver;
use std::sync::Arc; use std::sync::Arc;
use tokio::sync::RwLock;
use zclaw_types::{Message, Result};
use crate::compaction::{self, CompactionConfig};
use crate::driver::LlmDriver;
use crate::growth::GrowthIntegration;
use crate::middleware::{AgentMiddleware, MiddlewareContext, MiddlewareDecision};
/// Minimum seconds between consecutive compactions.
const COMPACTION_COOLDOWN_SECS: u64 = 30;
/// Minimum message pairs (user+assistant) since last compaction before triggering again.
const COMPACTION_MIN_ROUNDS: u64 = 3;
fn now_millis() -> u64 {
std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.unwrap_or_default()
.as_millis() as u64
}
/// Shared compaction debounce state (lock-free).
struct CompactionState {
last_compaction_ms: AtomicU64,
last_compaction_msg_count: AtomicU64,
}
/// Cached result from a previous async LLM compaction.
struct AsyncCompactionCache {
last_result: RwLock<Option<Vec<Message>>>,
}
/// Middleware that compresses conversation history when it exceeds a token threshold. /// Middleware that compresses conversation history when it exceeds a token threshold.
pub struct CompactionMiddleware { pub struct CompactionMiddleware {
threshold: usize, threshold: usize,
config: CompactionConfig, config: CompactionConfig,
/// Optional LLM driver for async compaction (LLM summarisation, memory flush).
driver: Option<Arc<dyn LlmDriver>>, driver: Option<Arc<dyn LlmDriver>>,
/// Optional growth integration for memory flushing during compaction.
growth: Option<GrowthIntegration>, growth: Option<GrowthIntegration>,
state: Arc<CompactionState>,
cache: Arc<AsyncCompactionCache>,
} }
impl CompactionMiddleware { impl CompactionMiddleware {
@@ -25,7 +53,39 @@ impl CompactionMiddleware {
driver: Option<Arc<dyn LlmDriver>>, driver: Option<Arc<dyn LlmDriver>>,
growth: Option<GrowthIntegration>, growth: Option<GrowthIntegration>,
) -> Self { ) -> Self {
Self { threshold, config, driver, growth } Self {
threshold,
config,
driver,
growth,
state: Arc::new(CompactionState {
last_compaction_ms: AtomicU64::new(0),
last_compaction_msg_count: AtomicU64::new(0),
}),
cache: Arc::new(AsyncCompactionCache {
last_result: RwLock::new(None),
}),
}
}
fn should_compact(&self, msg_count: u64) -> bool {
let last_ms = self.state.last_compaction_ms.load(Ordering::Relaxed);
let last_count = self.state.last_compaction_msg_count.load(Ordering::Relaxed);
if now_millis().saturating_sub(last_ms) < COMPACTION_COOLDOWN_SECS * 1000 {
return false;
}
if msg_count.saturating_sub(last_count) < COMPACTION_MIN_ROUNDS * 2 {
return false;
}
true
}
fn record_compaction(&self, msg_count: u64) {
self.state.last_compaction_ms.store(now_millis(), Ordering::Relaxed);
self.state.last_compaction_msg_count.store(msg_count, Ordering::Relaxed);
} }
} }
@@ -39,6 +99,29 @@ impl AgentMiddleware for CompactionMiddleware {
return Ok(MiddlewareDecision::Continue); return Ok(MiddlewareDecision::Continue);
} }
// Step 1: Prune old tool outputs (cheap, no LLM needed)
let pruned = compaction::prune_tool_outputs(&mut ctx.messages);
if pruned > 0 {
tracing::info!("[CompactionMiddleware] Pruned {} old tool outputs", pruned);
}
// Step 2: Re-estimate tokens after pruning
let tokens = compaction::estimate_messages_tokens_calibrated(&ctx.messages);
if tokens < self.threshold {
return Ok(MiddlewareDecision::Continue);
}
// Step 3: Debounce check
if !self.should_compact(ctx.messages.len() as u64) {
// Still over threshold but within cooldown — use cached result if available
if let Some(cached) = self.cache.last_result.read().await.clone() {
tracing::debug!("[CompactionMiddleware] Cooldown active, using cached compaction result");
ctx.messages = cached;
}
return Ok(MiddlewareDecision::Continue);
}
// Step 4: Execute compaction
let needs_async = self.config.use_llm || self.config.memory_flush_enabled; let needs_async = self.config.use_llm || self.config.memory_flush_enabled;
if needs_async { if needs_async {
let outcome = compaction::maybe_compact_with_config( let outcome = compaction::maybe_compact_with_config(
@@ -56,6 +139,14 @@ impl AgentMiddleware for CompactionMiddleware {
ctx.messages = compaction::maybe_compact(ctx.messages.clone(), self.threshold); ctx.messages = compaction::maybe_compact(ctx.messages.clone(), self.threshold);
} }
self.record_compaction(ctx.messages.len() as u64);
// Cache result for cooldown fallback
{
let mut cache = self.cache.last_result.write().await;
*cache = Some(ctx.messages.clone());
}
Ok(MiddlewareDecision::Continue) Ok(MiddlewareDecision::Continue)
} }
} }
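
The debounce rule used by `should_compact` and `record_compaction` (a cooldown plus a minimum number of new messages, tracked in two atomics) can be exercised in isolation. The sketch below is an illustration under assumptions: `Debounce`, `should_run`, and `record` are hypothetical names, and the clock is passed in as `now_ms` rather than read from `SystemTime`, so the behaviour is deterministic.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// 30 seconds, mirroring COMPACTION_COOLDOWN_SECS in the diff.
const COOLDOWN_MS: u64 = 30_000;
/// 3 rounds of 2 messages each, mirroring COMPACTION_MIN_ROUNDS * 2.
const MIN_NEW_MESSAGES: u64 = 6;

/// Lock-free debounce state: when the last run happened and how many
/// messages existed at that point. Both start at zero.
#[derive(Default)]
pub struct Debounce {
    last_ms: AtomicU64,
    last_msg_count: AtomicU64,
}

impl Debounce {
    /// True only when the cooldown has elapsed and enough new messages
    /// have arrived since the last recorded run.
    pub fn should_run(&self, now_ms: u64, msg_count: u64) -> bool {
        let last_ms = self.last_ms.load(Ordering::Relaxed);
        let last_count = self.last_msg_count.load(Ordering::Relaxed);
        now_ms.saturating_sub(last_ms) >= COOLDOWN_MS
            && msg_count.saturating_sub(last_count) >= MIN_NEW_MESSAGES
    }

    /// Records a completed run so that later checks are measured against it.
    pub fn record(&self, now_ms: u64, msg_count: u64) {
        self.last_ms.store(now_ms, Ordering::Relaxed);
        self.last_msg_count.store(msg_count, Ordering::Relaxed);
    }
}
```

The two atomics are loaded and stored independently, so a concurrent reader could observe a timestamp from one run and a count from another. For a best-effort debounce that is usually acceptable; a mutex around both fields would be the stricter alternative.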


@@ -88,6 +88,8 @@ impl AgentMiddleware for EvolutionMiddleware {
78 // before ButlerRouter(80) 78 // before ButlerRouter(80)
} }
fn parallel_safe(&self) -> bool { true }
async fn before_completion( async fn before_completion(
&self, &self,
ctx: &mut MiddlewareContext, ctx: &mut MiddlewareContext,


@@ -111,6 +111,7 @@ impl MemoryMiddleware {
impl AgentMiddleware for MemoryMiddleware { impl AgentMiddleware for MemoryMiddleware {
fn name(&self) -> &str { "memory" } fn name(&self) -> &str { "memory" }
fn priority(&self) -> i32 { 150 } fn priority(&self) -> i32 { 150 }
fn parallel_safe(&self) -> bool { true }
async fn before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision> { async fn before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision> {
tracing::debug!( tracing::debug!(


@@ -40,6 +40,7 @@ impl SkillIndexMiddleware {
impl AgentMiddleware for SkillIndexMiddleware { impl AgentMiddleware for SkillIndexMiddleware {
fn name(&self) -> &str { "skill_index" } fn name(&self) -> &str { "skill_index" }
fn priority(&self) -> i32 { 200 } fn priority(&self) -> i32 { 200 }
fn parallel_safe(&self) -> bool { true }
async fn before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision> { async fn before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision> {
if self.entries.is_empty() { if self.entries.is_empty() {


@@ -41,6 +41,7 @@ impl Default for TitleMiddleware {
impl AgentMiddleware for TitleMiddleware { impl AgentMiddleware for TitleMiddleware {
fn name(&self) -> &str { "title" } fn name(&self) -> &str { "title" }
fn priority(&self) -> i32 { 180 } fn priority(&self) -> i32 { 180 }
fn parallel_safe(&self) -> bool { true }
// All hooks default to Continue — placeholder until LLM driver is wired in. // All hooks default to Continue — placeholder until LLM driver is wired in.
async fn before_completion(&self, _ctx: &mut crate::middleware::MiddlewareContext) -> zclaw_types::Result<MiddlewareDecision> { async fn before_completion(&self, _ctx: &mut crate::middleware::MiddlewareContext) -> zclaw_types::Result<MiddlewareDecision> {


@@ -13,6 +13,7 @@ use serde_json::Value;
use zclaw_types::Result; use zclaw_types::Result;
use crate::driver::ContentBlock; use crate::driver::ContentBlock;
use crate::middleware::{AgentMiddleware, MiddlewareContext, ToolCallDecision}; use crate::middleware::{AgentMiddleware, MiddlewareContext, ToolCallDecision};
use std::collections::HashMap;
use std::sync::Mutex; use std::sync::Mutex;
/// Middleware that intercepts tool call errors and formats recovery messages. /// Middleware that intercepts tool call errors and formats recovery messages.
@@ -23,8 +24,8 @@ pub struct ToolErrorMiddleware {
max_error_length: usize, max_error_length: usize,
/// Maximum consecutive failures before aborting the loop. /// Maximum consecutive failures before aborting the loop.
max_consecutive_failures: u32, max_consecutive_failures: u32,
/// Tracks consecutive tool failures. /// Tracks consecutive tool failures per session.
consecutive_failures: Mutex<u32>, session_failures: Mutex<HashMap<String, u32>>,
} }
impl ToolErrorMiddleware { impl ToolErrorMiddleware {
@@ -32,7 +33,7 @@ impl ToolErrorMiddleware {
Self { Self {
max_error_length: 500, max_error_length: 500,
max_consecutive_failures: 3, max_consecutive_failures: 3,
consecutive_failures: Mutex::new(0), session_failures: Mutex::new(HashMap::new()),
} }
} }
@@ -66,7 +67,7 @@ impl AgentMiddleware for ToolErrorMiddleware {
async fn before_tool_call( async fn before_tool_call(
&self, &self,
_ctx: &MiddlewareContext, ctx: &MiddlewareContext,
tool_name: &str, tool_name: &str,
tool_input: &Value, tool_input: &Value,
) -> Result<ToolCallDecision> { ) -> Result<ToolCallDecision> {
@@ -79,15 +80,17 @@ impl AgentMiddleware for ToolErrorMiddleware {
return Ok(ToolCallDecision::ReplaceInput(serde_json::json!({}))); return Ok(ToolCallDecision::ReplaceInput(serde_json::json!({})));
} }
// Check consecutive failure count — abort if too many failures // Check consecutive failure count — abort if too many failures (per session)
let failures = self.consecutive_failures.lock().unwrap_or_else(|e| e.into_inner()); let failures = self.session_failures.lock()
if *failures >= self.max_consecutive_failures { .map(|m| m.get(&ctx.session_id.to_string()).copied().unwrap_or(0))
.unwrap_or(0);
if failures >= self.max_consecutive_failures {
tracing::warn!( tracing::warn!(
"[ToolErrorMiddleware] Aborting loop: {} consecutive tool failures", "[ToolErrorMiddleware] Aborting loop: {} consecutive tool failures",
*failures failures
); );
return Ok(ToolCallDecision::AbortLoop( return Ok(ToolCallDecision::AbortLoop(
format!("连续 {} 次工具调用失败,已自动终止以避免无限重试", *failures) format!("连续 {} 次工具调用失败,已自动终止以避免无限重试", failures)
)); ));
} }
@@ -100,11 +103,16 @@ impl AgentMiddleware for ToolErrorMiddleware {
tool_name: &str, tool_name: &str,
result: &Value, result: &Value,
) -> Result<()> { ) -> Result<()> {
let mut failures = self.consecutive_failures.lock().unwrap_or_else(|e| e.into_inner());
// Check if the tool result indicates an error. // Check if the tool result indicates an error.
if let Some(error) = result.get("error") { if let Some(error) = result.get("error") {
*failures += 1; let session_key = ctx.session_id.to_string();
let failures = self.session_failures.lock()
.map(|mut m| {
let count = m.entry(session_key.clone()).or_insert(0);
*count += 1;
*count
})
.unwrap_or(1);
let error_msg = match error { let error_msg = match error {
Value::String(s) => s.clone(), Value::String(s) => s.clone(),
other => other.to_string(), other => other.to_string(),
@@ -118,7 +126,7 @@ impl AgentMiddleware for ToolErrorMiddleware {
tracing::warn!( tracing::warn!(
"[ToolErrorMiddleware] Tool '{}' failed ({}/{} consecutive): {}", "[ToolErrorMiddleware] Tool '{}' failed ({}/{} consecutive): {}",
tool_name, *failures, self.max_consecutive_failures, truncated tool_name, failures, self.max_consecutive_failures, truncated
); );
let guided_message = self.format_tool_error(tool_name, &truncated); let guided_message = self.format_tool_error(tool_name, &truncated);
@@ -126,8 +134,11 @@ impl AgentMiddleware for ToolErrorMiddleware {
text: guided_message, text: guided_message,
}); });
} else { } else {
// Success — reset consecutive failure counter // Success — reset consecutive failure counter for this session
*failures = 0; let session_key = ctx.session_id.to_string();
if let Ok(mut m) = self.session_failures.lock() {
m.insert(session_key, 0);
}
} }
Ok(()) Ok(())
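
The per-session counting introduced in this hunk can be sketched in isolation as follows. This is a minimal illustration, not the middleware itself: `FailureTracker` and its method names are hypothetical, and session ids are assumed to be plain strings. Unlike the diff, which treats a poisoned lock as "zero failures", the sketch recovers the map with `into_inner()` so the abort threshold still applies after a panic elsewhere.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical helper: tracks consecutive tool failures per session.
pub struct FailureTracker {
    max_consecutive: u32,
    counts: Mutex<HashMap<String, u32>>,
}

impl FailureTracker {
    pub fn new(max_consecutive: u32) -> Self {
        Self { max_consecutive, counts: Mutex::new(HashMap::new()) }
    }

    /// Record a failure and return the new consecutive count for the session.
    pub fn record_failure(&self, session_id: &str) -> u32 {
        // Recover the map even if another thread panicked while holding the lock.
        let mut counts = self.counts.lock().unwrap_or_else(|e| e.into_inner());
        let count = counts.entry(session_id.to_string()).or_insert(0);
        *count += 1;
        *count
    }

    /// A success ends the streak; removing the entry keeps the map from growing.
    pub fn record_success(&self, session_id: &str) {
        let mut counts = self.counts.lock().unwrap_or_else(|e| e.into_inner());
        counts.remove(session_id);
    }

    /// True once the session has reached the configured failure limit.
    pub fn should_abort(&self, session_id: &str) -> bool {
        let counts = self.counts.lock().unwrap_or_else(|e| e.into_inner());
        counts.get(session_id).copied().unwrap_or(0) >= self.max_consecutive
    }
}
```

Removing the entry on success (rather than inserting 0, as the diff does) is a design choice in this sketch: it bounds the map to sessions that are currently failing.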


@@ -21,35 +21,27 @@ use crate::middleware::{AgentMiddleware, MiddlewareContext, ToolCallDecision};
/// Maximum safe output length in characters. /// Maximum safe output length in characters.
const MAX_OUTPUT_LENGTH: usize = 50_000; const MAX_OUTPUT_LENGTH: usize = 50_000;
/// Patterns that indicate sensitive information in tool output. /// Regex patterns that match actual secret values (not just keywords).
const SENSITIVE_PATTERNS: &[&str] = &[ /// These detect the *value format* of secrets, avoiding false positives
"api_key", /// from legitimate content that merely mentions "password" or "api_key".
"apikey", const SECRET_VALUE_PATTERNS: &[&str] = &[
"api-key", r#"sk-[a-zA-Z0-9]{20,}"#, // OpenAI API keys (sk-xxx, 20+ chars)
"secret_key", r#"sk_live_[a-zA-Z0-9]{20,}"#, // Stripe live keys
"secretkey", r#"sk_test_[a-zA-Z0-9]{20,}"#, // Stripe test keys
"access_token", r#"AKIA[A-Z0-9]{16}"#, // AWS access keys (exact 20 chars)
"auth_token", r#"-----BEGIN (RSA |EC )?PRIVATE KEY-----"#, // PEM private keys
"password", r#"(?:api_?key|secret_?key|access_?token|auth_?token|password)\s*[:=]\s*["'][^"']{8,}["']"#, // key=value with actual secret
"private_key",
"-----BEGIN RSA",
"-----BEGIN PRIVATE",
"sk-", // OpenAI API keys
"sk_live_", // Stripe keys
"AKIA", // AWS access keys
]; ];
/// Patterns that may indicate prompt injection in tool output. /// Keyword patterns that indicate prompt injection in tool output.
/// These are specific enough to avoid false positives from normal content.
const INJECTION_PATTERNS: &[&str] = &[ const INJECTION_PATTERNS: &[&str] = &[
"ignore previous instructions", "ignore previous instructions",
"ignore all previous", "ignore all previous",
"disregard your instructions", "disregard your instructions",
"you are now",
"new instructions:", "new instructions:",
"system:",
"[INST]", "[INST]",
"</scratchpad>", "</scratchpad>",
"think step by step about",
]; ];
/// Tool output sanitization middleware. /// Tool output sanitization middleware.
@@ -105,22 +97,24 @@ impl AgentMiddleware for ToolOutputGuardMiddleware {
); );
} }
// Rule 2: Sensitive information detection — block output containing secrets (P2-22) // Rule 2: Sensitive information detection — match actual secret values, not keywords
let output_lower = output_str.to_lowercase(); for pattern in SECRET_VALUE_PATTERNS {
for pattern in SENSITIVE_PATTERNS { if let Ok(re) = regex::Regex::new(pattern) {
if output_lower.contains(pattern) { if re.is_match(&output_str) {
tracing::error!( tracing::error!(
"[ToolOutputGuard] BLOCKED tool '{}' output: sensitive pattern '{}'", "[ToolOutputGuard] BLOCKED tool '{}' output: secret value matched pattern '{}'",
tool_name, pattern tool_name, pattern
); );
return Err(zclaw_types::ZclawError::Internal(format!( return Err(zclaw_types::ZclawError::Internal(format!(
"[ToolOutputGuard] Tool '{}' output blocked: sensitive information detected ('{}')", "[ToolOutputGuard] Tool '{}' output blocked: sensitive information detected",
tool_name, pattern tool_name
))); )));
} }
} }
}
// Rule 3: Injection marker detection — BLOCK the output (P2-22 fix) // Rule 3: Injection marker detection — specific phrase matching
let output_lower = output_str.to_lowercase();
for pattern in INJECTION_PATTERNS { for pattern in INJECTION_PATTERNS {
if output_lower.contains(pattern) { if output_lower.contains(pattern) {
tracing::error!( tracing::error!(
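
The idea behind Rule 2 above (match the value format of a secret rather than a keyword) can be illustrated without the `regex` crate. The sketch below is a dependency-free approximation covering only three of the patterns in `SECRET_VALUE_PATTERNS`; `has_prefixed_token` and `looks_like_secret` are hypothetical names, and the real middleware uses regular expressions instead.

```rust
/// Returns true if `text` contains `prefix` immediately followed by at least
/// `min_len` characters accepted by `allowed`.
fn has_prefixed_token(text: &str, prefix: &str, min_len: usize, allowed: fn(char) -> bool) -> bool {
    text.match_indices(prefix).any(|(start, _)| {
        text[start + prefix.len()..]
            .chars()
            .take_while(|c| allowed(*c))
            .count()
            >= min_len
    })
}

fn is_ascii_alnum(c: char) -> bool {
    c.is_ascii_alphanumeric()
}

fn is_upper_or_digit(c: char) -> bool {
    c.is_ascii_uppercase() || c.is_ascii_digit()
}

/// Approximates three of the value-format checks:
/// `sk-` + 20 or more alphanumerics, `AKIA` + 16 uppercase/digit chars, PEM headers.
pub fn looks_like_secret(text: &str) -> bool {
    has_prefixed_token(text, "sk-", 20, is_ascii_alnum)
        || has_prefixed_token(text, "AKIA", 16, is_upper_or_digit)
        || text.contains("-----BEGIN PRIVATE KEY-----")
        || text.contains("-----BEGIN RSA PRIVATE KEY-----")
        || text.contains("-----BEGIN EC PRIVATE KEY-----")
}
```

Text that merely mentions "password" or "api_key" passes this check, while a string shaped like an actual key does not. If the regex-based version is kept, compiling each pattern once (for example behind a `std::sync::LazyLock`) instead of on every call is worth considering.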


@@ -24,6 +24,10 @@ pub enum StreamChunk {
input_tokens: u32, input_tokens: u32,
output_tokens: u32, output_tokens: u32,
stop_reason: String, stop_reason: String,
#[serde(default)]
cache_creation_input_tokens: Option<u32>,
#[serde(default)]
cache_read_input_tokens: Option<u32>,
}, },
/// Error occurred /// Error occurred
Error { message: String }, Error { message: String },


@@ -55,6 +55,8 @@ impl MockLlmDriver {
input_tokens: 10, input_tokens: 10,
output_tokens: text.len() as u32 / 4, output_tokens: text.len() as u32 / 4,
stop_reason: StopReason::EndTurn, stop_reason: StopReason::EndTurn,
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}); });
self self
} }
@@ -74,6 +76,8 @@ impl MockLlmDriver {
input_tokens: 10, input_tokens: 10,
output_tokens: 20, output_tokens: 20,
stop_reason: StopReason::ToolUse, stop_reason: StopReason::ToolUse,
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}); });
self self
} }
@@ -86,6 +90,8 @@ impl MockLlmDriver {
input_tokens: 0, input_tokens: 0,
output_tokens: 0, output_tokens: 0,
stop_reason: StopReason::Error, stop_reason: StopReason::Error,
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}); });
self self
} }
@@ -142,6 +148,8 @@ impl MockLlmDriver {
input_tokens: 0, input_tokens: 0,
output_tokens: 0, output_tokens: 0,
stop_reason: StopReason::EndTurn, stop_reason: StopReason::EndTurn,
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}) })
} }
} }
@@ -190,6 +198,8 @@ impl LlmDriver for MockLlmDriver {
input_tokens: 10, input_tokens: 10,
output_tokens: 2, output_tokens: 2,
stop_reason: "end_turn".to_string(), stop_reason: "end_turn".to_string(),
cache_creation_input_tokens: None,
cache_read_input_tokens: None,
}, },
] ]
}) })


@@ -11,6 +11,17 @@ use crate::driver::ToolDefinition;
use crate::loop_runner::LoopEvent; use crate::loop_runner::LoopEvent;
use crate::tool::builtin::PathValidator; use crate::tool::builtin::PathValidator;
/// Tool concurrency safety level
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolConcurrency {
/// Read-only operations, always safe to parallelize (file_read, web_fetch, etc.)
ReadOnly,
/// Exclusive operations, must be serial (file_write, shell_exec, etc.)
Exclusive,
/// Interactive operations, never parallelize (ask_clarification, etc.)
Interactive,
}
/// Tool trait for implementing agent tools /// Tool trait for implementing agent tools
#[async_trait] #[async_trait]
pub trait Tool: Send + Sync { pub trait Tool: Send + Sync {
@@ -25,6 +36,11 @@ pub trait Tool: Send + Sync {
/// Execute the tool /// Execute the tool
async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value>; async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value>;
/// Tool concurrency safety level. Default: ReadOnly.
fn concurrency(&self) -> ToolConcurrency {
ToolConcurrency::ReadOnly
}
} }
/// Skill executor trait for runtime skill execution /// Skill executor trait for runtime skill execution
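
How a scheduler might consume `ToolConcurrency` is sketched below: split the pending calls into a parallel batch (ReadOnly) and a serial batch (everything else), then restore the original order afterwards. `PendingCall`, `partition_calls`, and `merge_in_order` are hypothetical names for illustration; the actual loop runner, which the commit message describes as using `JoinSet` and a semaphore, is not shown in this diff.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolConcurrency {
    ReadOnly,
    Exclusive,
    Interactive,
}

/// Hypothetical pending call: original position plus the tool's declared level.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PendingCall {
    pub index: usize,
    pub tool_name: String,
    pub concurrency: ToolConcurrency,
}

/// Split calls into (parallel, serial) batches; only ReadOnly calls may run concurrently.
pub fn partition_calls(calls: Vec<PendingCall>) -> (Vec<PendingCall>, Vec<PendingCall>) {
    calls
        .into_iter()
        .partition(|c| c.concurrency == ToolConcurrency::ReadOnly)
}

/// Restore the original call order once both batches have finished.
pub fn merge_in_order<T>(mut results: Vec<(usize, T)>) -> Vec<T> {
    results.sort_by_key(|(index, _)| *index);
    results.into_iter().map(|(_, value)| value).collect()
}
```

One point to weigh: because the trait default is `ReadOnly`, a tool that does not override `concurrency()` is treated as parallel-safe.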


@@ -9,7 +9,7 @@ use async_trait::async_trait;
use serde_json::{json, Value}; use serde_json::{json, Value};
use zclaw_types::{Result, ZclawError}; use zclaw_types::{Result, ZclawError};
use crate::tool::{Tool, ToolContext}; use crate::tool::{Tool, ToolContext, ToolConcurrency};
/// Clarification type — categorizes the reason for asking. /// Clarification type — categorizes the reason for asking.
#[derive(Debug, Clone, PartialEq)] #[derive(Debug, Clone, PartialEq)]
@@ -96,6 +96,10 @@ impl Tool for AskClarificationTool {
}) })
} }
fn concurrency(&self) -> ToolConcurrency {
ToolConcurrency::Interactive
}
async fn execute(&self, input: Value, _context: &ToolContext) -> Result<Value> { async fn execute(&self, input: Value, _context: &ToolContext) -> Result<Value> {
let question = input["question"].as_str() let question = input["question"].as_str()
.ok_or_else(|| ZclawError::InvalidInput("Missing 'question' parameter".into()))?; .ok_or_else(|| ZclawError::InvalidInput("Missing 'question' parameter".into()))?;


@@ -4,7 +4,7 @@ use async_trait::async_trait;
use serde_json::{json, Value}; use serde_json::{json, Value};
use zclaw_types::{Result, ZclawError}; use zclaw_types::{Result, ZclawError};
use crate::tool::{Tool, ToolContext}; use crate::tool::{Tool, ToolContext, ToolConcurrency};
pub struct ExecuteSkillTool; pub struct ExecuteSkillTool;
@@ -42,6 +42,10 @@ impl Tool for ExecuteSkillTool {
}) })
} }
fn concurrency(&self) -> ToolConcurrency {
ToolConcurrency::Exclusive
}
async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value> { async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value> {
let skill_id = input["skill_id"].as_str() let skill_id = input["skill_id"].as_str()
.ok_or_else(|| ZclawError::InvalidInput("Missing 'skill_id' parameter".into()))?; .ok_or_else(|| ZclawError::InvalidInput("Missing 'skill_id' parameter".into()))?;


@@ -6,7 +6,7 @@ use zclaw_types::{Result, ZclawError};
use std::fs; use std::fs;
use std::io::Write; use std::io::Write;
use crate::tool::{Tool, ToolContext}; use crate::tool::{Tool, ToolContext, ToolConcurrency};
use super::path_validator::PathValidator; use super::path_validator::PathValidator;
pub struct FileWriteTool; pub struct FileWriteTool;
@@ -55,6 +55,10 @@ impl Tool for FileWriteTool {
}) })
} }
fn concurrency(&self) -> ToolConcurrency {
ToolConcurrency::Exclusive
}
async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value> { async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value> {
let path = input["path"].as_str() let path = input["path"].as_str()
.ok_or_else(|| ZclawError::InvalidInput("Missing 'path' parameter".into()))?; .ok_or_else(|| ZclawError::InvalidInput("Missing 'path' parameter".into()))?;


@@ -8,7 +8,7 @@ use serde_json::Value;
use std::sync::Arc; use std::sync::Arc;
use zclaw_types::Result; use zclaw_types::Result;
use crate::tool::{Tool, ToolContext}; use crate::tool::{Tool, ToolContext, ToolConcurrency};
/// Wraps an MCP tool adapter into the `Tool` trait. /// Wraps an MCP tool adapter into the `Tool` trait.
/// ///
@@ -42,6 +42,10 @@ impl Tool for McpToolWrapper {
self.adapter.input_schema().clone() self.adapter.input_schema().clone()
} }
fn concurrency(&self) -> ToolConcurrency {
ToolConcurrency::Exclusive
}
async fn execute(&self, input: Value, _context: &ToolContext) -> Result<Value> { async fn execute(&self, input: Value, _context: &ToolContext) -> Result<Value> {
self.adapter.execute(input).await self.adapter.execute(input).await
} }


@@ -97,6 +97,17 @@ fn default_blocked_paths() -> Vec<PathBuf> {
] ]
} }
/// Strip the Windows extended-length (verbatim) path prefix `\\?\` for consistent comparison.
/// `\\?\C:\Users\...` → `C:\Users\...`
fn normalize_windows_path(path: &Path) -> std::borrow::Cow<'_, Path> {
let s = path.to_string_lossy();
if s.starts_with(r"\\?\") {
std::borrow::Cow::Owned(PathBuf::from(&s[4..]))
} else {
std::borrow::Cow::Borrowed(path)
}
}
/// Expand tilde in path to home directory /// Expand tilde in path to home directory
fn expand_tilde(path: &str) -> PathBuf { fn expand_tilde(path: &str) -> PathBuf {
if path.starts_with('~') { if path.starts_with('~') {
@@ -154,9 +165,16 @@ impl PathValidator {
} }
} }
/// Set the workspace root directory /// Set the workspace root directory.
/// Canonicalizes the path to ensure consistent comparison on Windows
/// (where canonicalize() returns extended-length `\\?\C:\...` paths).
pub fn with_workspace(mut self, workspace: PathBuf) -> Self { pub fn with_workspace(mut self, workspace: PathBuf) -> Self {
self.workspace_root = Some(workspace); let canonical = if workspace.exists() {
workspace.canonicalize().unwrap_or(workspace)
} else {
workspace
};
self.workspace_root = Some(canonical);
self self
} }
@@ -230,7 +248,14 @@ impl PathValidator {
fn resolve_and_validate(&self, path: &str) -> Result<PathBuf> { fn resolve_and_validate(&self, path: &str) -> Result<PathBuf> {
// Expand tilde // Expand tilde
let expanded = expand_tilde(path); let expanded = expand_tilde(path);
let path_buf = PathBuf::from(&expanded); let mut path_buf = PathBuf::from(&expanded);
// If relative path and workspace is configured, resolve against workspace
if path_buf.is_relative() {
if let Some(ref workspace) = self.workspace_root {
path_buf = workspace.join(&path_buf);
}
}
// Check for path traversal // Check for path traversal
self.check_path_traversal(&path_buf)?; self.check_path_traversal(&path_buf)?;
@@ -280,10 +305,14 @@ impl PathValidator {
Ok(()) Ok(())
} }
/// Check if path is in blocked list /// Check if path is in blocked list.
/// Strips the Windows extended-length prefix (`\\?\`) for consistent comparison.
fn check_blocked(&self, path: &Path) -> Result<()> { fn check_blocked(&self, path: &Path) -> Result<()> {
// Strip the Windows extended-length prefix for consistent matching
let normalized = normalize_windows_path(path);
for blocked in &self.config.blocked_paths { for blocked in &self.config.blocked_paths {
if path.starts_with(blocked) || path == blocked { let blocked_norm = normalize_windows_path(blocked);
if normalized.starts_with(&*blocked_norm) || normalized == blocked_norm {
return Err(ZclawError::InvalidInput(format!( return Err(ZclawError::InvalidInput(format!(
"Access to this path is blocked: {}", "Access to this path is blocked: {}",
path.display() path.display()
@@ -303,11 +332,15 @@ impl PathValidator {
/// - This prevents accidental exposure of the entire filesystem /// - This prevents accidental exposure of the entire filesystem
/// when the validator is misconfigured or used without setup /// when the validator is misconfigured or used without setup
fn check_allowed(&self, path: &Path) -> Result<()> { fn check_allowed(&self, path: &Path) -> Result<()> {
let path_norm = normalize_windows_path(path);
// If no allowed paths specified, check workspace // If no allowed paths specified, check workspace
if self.config.allowed_paths.is_empty() { if self.config.allowed_paths.is_empty() {
if let Some(ref workspace) = self.workspace_root { if let Some(ref workspace) = self.workspace_root {
// Workspace is configured - validate path is within it // Workspace is configured - validate path is within it
if !path.starts_with(workspace) { // Both sides are canonicalized (workspace via with_workspace, path via resolve_and_validate)
let ws_norm = normalize_windows_path(workspace);
if !path_norm.starts_with(&*ws_norm) {
return Err(ZclawError::InvalidInput(format!( return Err(ZclawError::InvalidInput(format!(
"Path outside workspace: {} (workspace: {})", "Path outside workspace: {} (workspace: {})",
path.display(), path.display(),
@@ -329,7 +362,8 @@ impl PathValidator {
// Check against allowed paths // Check against allowed paths
for allowed in &self.config.allowed_paths { for allowed in &self.config.allowed_paths {
if path.starts_with(allowed) { let allowed_norm = normalize_windows_path(allowed);
if path_norm.starts_with(&*allowed_norm) {
return Ok(()); return Ok(());
} }
} }
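
The prefix stripping used throughout this file can be sketched on plain strings, which keeps the example testable on any platform. `strip_verbatim_prefix` is a hypothetical name. The sketch also handles the `\\?\UNC\server\share` form, which `normalize_windows_path` in the diff would turn into `UNC\server\share`; whether that case can occur in this codebase is an assumption.

```rust
/// Strip the Windows extended-length prefix from a path string.
/// `\\?\C:\Users\x`       -> `C:\Users\x`
/// `\\?\UNC\server\share` -> `\\server\share`
pub fn strip_verbatim_prefix(path: &str) -> String {
    if let Some(rest) = path.strip_prefix(r"\\?\UNC\") {
        // Verbatim UNC paths map back to the ordinary `\\server\share` form.
        format!(r"\\{}", rest)
    } else if let Some(rest) = path.strip_prefix(r"\\?\") {
        rest.to_string()
    } else {
        // Already a plain path (or a non-Windows path): leave it untouched.
        path.to_string()
    }
}
```

With both sides of a `starts_with` comparison passed through the same function, a canonicalized path and a configured path compare consistently regardless of which one carried the prefix.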


@@ -8,7 +8,7 @@ use std::process::{Command, Stdio};
use std::time::{Duration, Instant}; use std::time::{Duration, Instant};
use zclaw_types::{Result, ZclawError}; use zclaw_types::{Result, ZclawError};
use crate::tool::{Tool, ToolContext}; use crate::tool::{Tool, ToolContext, ToolConcurrency};
/// Parse a command string into program and arguments using proper shell quoting /// Parse a command string into program and arguments using proper shell quoting
fn parse_command(command: &str) -> Result<(String, Vec<String>)> { fn parse_command(command: &str) -> Result<(String, Vec<String>)> {
@@ -175,6 +175,10 @@ impl Tool for ShellExecTool {
}) })
} }
fn concurrency(&self) -> ToolConcurrency {
ToolConcurrency::Exclusive
}
async fn execute(&self, input: Value, _context: &ToolContext) -> Result<Value> { async fn execute(&self, input: Value, _context: &ToolContext) -> Result<Value> {
let command = input["command"].as_str() let command = input["command"].as_str()
.ok_or_else(|| ZclawError::InvalidInput("Missing 'command' parameter".into()))?; .ok_or_else(|| ZclawError::InvalidInput("Missing 'command' parameter".into()))?;


@@ -11,7 +11,7 @@ use zclaw_memory::MemoryStore;
use crate::driver::LlmDriver; use crate::driver::LlmDriver;
use crate::loop_runner::{AgentLoop, LoopEvent}; use crate::loop_runner::{AgentLoop, LoopEvent};
use crate::tool::{Tool, ToolContext, ToolRegistry}; use crate::tool::{Tool, ToolContext, ToolRegistry, ToolConcurrency};
use crate::tool::builtin::register_builtin_tools; use crate::tool::builtin::register_builtin_tools;
use std::sync::Arc; use std::sync::Arc;
@@ -91,6 +91,10 @@ impl Tool for TaskTool {
}) })
} }
fn concurrency(&self) -> ToolConcurrency {
ToolConcurrency::Exclusive
}
async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value> { async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value> {
let description = input["description"].as_str() let description = input["description"].as_str()
.ok_or_else(|| ZclawError::InvalidInput("Missing 'description' parameter".into()))?; .ok_or_else(|| ZclawError::InvalidInput("Missing 'description' parameter".into()))?;


@@ -7,7 +7,7 @@ use async_trait::async_trait;
use serde_json::{json, Value}; use serde_json::{json, Value};
use zclaw_types::Result; use zclaw_types::Result;
use crate::tool::{Tool, ToolContext}; use crate::tool::{Tool, ToolContext, ToolConcurrency};
/// Wrapper that exposes a Hand as a Tool in the agent's tool registry. /// Wrapper that exposes a Hand as a Tool in the agent's tool registry.
/// ///
@@ -78,6 +78,10 @@ impl Tool for HandTool {
self.input_schema.clone() self.input_schema.clone()
} }
fn concurrency(&self) -> ToolConcurrency {
ToolConcurrency::Exclusive
}
async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value> { async fn execute(&self, input: Value, context: &ToolContext) -> Result<Value> {
// Delegate to the HandExecutor (bridged from HandRegistry via kernel). // Delegate to the HandExecutor (bridged from HandRegistry via kernel).
// If no hand_executor is available (e.g., standalone runtime without kernel), // If no hand_executor is available (e.g., standalone runtime without kernel),


@@ -223,6 +223,33 @@ impl Serialize for ZclawError {
/// Result type alias for ZCLAW operations /// Result type alias for ZCLAW operations
pub type Result<T> = std::result::Result<T, ZclawError>; pub type Result<T> = std::result::Result<T, ZclawError>;
/// Fine-grained classification of LLM call errors, used to guide retry and recovery strategy.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum LlmErrorKind {
Auth,
AuthPermanent,
BillingExhausted,
RateLimited,
Overloaded,
ServerError,
Timeout,
ContextOverflow,
ModelNotFound,
Unknown,
}
/// A classified LLM error with recovery hints attached.
#[derive(Debug, Clone)]
pub struct ClassifiedLlmError {
pub kind: LlmErrorKind,
pub retryable: bool,
pub should_compress: bool,
pub should_rotate_credential: bool,
pub retry_after: Option<std::time::Duration>,
pub message: String,
}
#[cfg(test)] #[cfg(test)]
mod tests { mod tests {
use super::*; use super::*;
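
How the recovery hints on `ClassifiedLlmError` might be derived from `LlmErrorKind` is sketched below. The two types mirror the diff (without the serde derives); the `classify` function and the kind-to-hint mapping are assumptions made for illustration, since the diff only adds the types and not the logic that fills them in.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LlmErrorKind {
    Auth,
    AuthPermanent,
    BillingExhausted,
    RateLimited,
    Overloaded,
    ServerError,
    Timeout,
    ContextOverflow,
    ModelNotFound,
    Unknown,
}

#[derive(Debug, Clone)]
pub struct ClassifiedLlmError {
    pub kind: LlmErrorKind,
    pub retryable: bool,
    pub should_compress: bool,
    pub should_rotate_credential: bool,
    pub retry_after: Option<Duration>,
    pub message: String,
}

/// Hypothetical constructor: the mapping from kind to hints is an assumption.
pub fn classify(kind: LlmErrorKind, retry_after: Option<Duration>, message: &str) -> ClassifiedLlmError {
    use LlmErrorKind::*;
    // Context overflow is assumed recoverable by compressing the conversation.
    let should_compress = kind == ContextOverflow;
    // Auth and billing failures are assumed recoverable with another credential.
    let should_rotate_credential = matches!(kind, Auth | BillingExhausted);
    // Transient provider-side failures are assumed safe to retry as-is.
    let transient = matches!(kind, RateLimited | Overloaded | ServerError | Timeout);
    ClassifiedLlmError {
        kind,
        retryable: transient || should_compress || should_rotate_credential,
        should_compress,
        should_rotate_credential,
        retry_after,
        message: message.to_string(),
    }
}
```

Under this mapping `AuthPermanent`, `ModelNotFound`, and `Unknown` are never retried, so a caller can stop immediately instead of looping.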


@@ -16,6 +16,21 @@ use zclaw_types::Result;
use super::pain_aggregator::PainPoint; use super::pain_aggregator::PainPoint;
use super::solution_generator::Proposal; use super::solution_generator::Proposal;
/// Brief summary of a stored experience, for suggestion context enrichment.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ExperienceBrief {
pub pain_pattern: String,
pub solution_summary: String,
pub reuse_count: u32,
}
static EXPERIENCE_EXTRACTOR: std::sync::OnceLock<std::sync::Arc<ExperienceExtractor>> = std::sync::OnceLock::new();
/// Get the global ExperienceExtractor singleton (if initialized).
pub(crate) fn get_experience_extractor() -> Option<std::sync::Arc<ExperienceExtractor>> {
EXPERIENCE_EXTRACTOR.get().cloned()
}
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
// Shared completion status // Shared completion status
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
@@ -263,6 +278,36 @@ fn xml_escape(s: &str) -> String {
.replace('>', "&gt;") .replace('>', "&gt;")
} }
/// Initialize the global ExperienceExtractor singleton.
/// Called once during app startup, after viking storage is ready.
pub async fn init_experience_extractor() -> Result<()> {
let sqlite_storage = crate::viking_commands::get_storage().await
.map_err(|e| zclaw_types::ZclawError::StorageError(e))?;
let viking = std::sync::Arc::new(zclaw_growth::VikingAdapter::new(sqlite_storage));
let store = std::sync::Arc::new(ExperienceStore::new(viking));
let extractor = std::sync::Arc::new(ExperienceExtractor::new(store));
EXPERIENCE_EXTRACTOR.set(extractor)
.map_err(|_| zclaw_types::ZclawError::StorageError("ExperienceExtractor already initialized".into()))?;
Ok(())
}
/// Find experiences relevant to the current conversation for suggestion enrichment.
#[tauri::command]
pub async fn experience_find_relevant(
agent_id: String,
query: String,
) -> std::result::Result<Vec<ExperienceBrief>, String> {
let extractor = get_experience_extractor()
.ok_or("ExperienceExtractor not initialized".to_string())?;
let experiences = extractor.find_relevant_experiences(&agent_id, &query).await;
Ok(experiences.into_iter().take(3).map(|e| ExperienceBrief {
pain_pattern: e.pain_pattern,
solution_summary: e.solution_steps.join("")
.chars().take(100).collect(),
reuse_count: e.reuse_count,
}).collect())
}
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
// Tests // Tests
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
@@ -407,4 +452,17 @@ mod tests {
assert_eq!(truncate("hello", 10), "hello"); assert_eq!(truncate("hello", 10), "hello");
assert_eq!(truncate("这是一个很长的字符串用于测试截断", 10).chars().count(), 11); // 10 + … assert_eq!(truncate("这是一个很长的字符串用于测试截断", 10).chars().count(), 11); // 10 + …
} }
#[test]
fn test_experience_brief_serialization() {
let brief = super::ExperienceBrief {
pain_pattern: "报表生成慢".to_string(),
solution_summary: "使用 researcher 技能自动收集".to_string(),
reuse_count: 3,
};
let json = serde_json::to_string(&brief).unwrap();
let parsed: super::ExperienceBrief = serde_json::from_str(&json).unwrap();
assert_eq!(parsed.pain_pattern, "报表生成慢");
assert_eq!(parsed.reuse_count, 3);
}
} }
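
The set-once singleton pattern used for `EXPERIENCE_EXTRACTOR` can be sketched with the standard library alone. `Extractor`, `init_extractor`, and `get_extractor` are hypothetical stand-ins for the real types and functions; the point is only that `OnceLock::set` succeeds once and that readers see `None` until then.

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for the real extractor type.
pub struct Extractor {
    pub name: String,
}

static EXTRACTOR: OnceLock<Arc<Extractor>> = OnceLock::new();

/// Initialize the singleton; a second call reports an error instead of overwriting.
pub fn init_extractor(name: &str) -> Result<(), String> {
    EXTRACTOR
        .set(Arc::new(Extractor { name: name.to_string() }))
        .map_err(|_| "extractor already initialized".to_string())
}

/// Returns `None` until `init_extractor` has succeeded.
pub fn get_extractor() -> Option<Arc<Extractor>> {
    EXTRACTOR.get().cloned()
}
```

Returning `Option` from the getter is what allows callers to degrade gracefully when initialization failed or has not run yet.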


@@ -7,8 +7,10 @@
use tracing::{debug, warn}; use tracing::{debug, warn};
use std::collections::HashMap;
use std::sync::Arc; use std::sync::Arc;
use tauri::Emitter; use tauri::Emitter;
use tokio::sync::RwLock;
use zclaw_growth::VikingStorage; use zclaw_growth::VikingStorage;
use crate::intelligence::identity::IdentityManagerState; use crate::intelligence::identity::IdentityManagerState;
@@ -16,6 +18,36 @@ use crate::intelligence::heartbeat::HeartbeatEngineState;
use crate::intelligence::reflection::{MemoryEntryForAnalysis, ReflectionEngineState}; use crate::intelligence::reflection::{MemoryEntryForAnalysis, ReflectionEngineState};
use zclaw_runtime::driver::LlmDriver; use zclaw_runtime::driver::LlmDriver;
// ---------------------------------------------------------------------------
// Identity prompt cache — avoids mutex + disk I/O on every request
// ---------------------------------------------------------------------------
struct CachedIdentity {
prompt: String,
#[allow(dead_code)] // Reserved for future TTL-based cache validation
soul_hash: u64,
}
static IDENTITY_CACHE: std::sync::LazyLock<RwLock<HashMap<String, CachedIdentity>>> =
std::sync::LazyLock::new(|| RwLock::new(HashMap::new()));
/// Invalidate cached identity prompt for a given agent (call when soul.md changes).
pub fn invalidate_identity_cache(agent_id: &str) {
let cache = &*IDENTITY_CACHE;
// Non-blocking: remove the entry only if the write lock is free right now; otherwise skip
if let Ok(mut guard) = cache.try_write() {
guard.remove(agent_id);
}
}
/// Simple hash for cache invalidation — uses string content hash.
fn content_hash(s: &str) -> u64 {
use std::hash::{Hash, Hasher};
let mut hasher = std::collections::hash_map::DefaultHasher::new();
s.hash(&mut hasher);
hasher.finish()
}
/// Run pre-conversation intelligence hooks /// Run pre-conversation intelligence hooks
/// ///
/// Builds identity-enhanced system prompt (SOUL.md + instructions) and /// Builds identity-enhanced system prompt (SOUL.md + instructions) and
@@ -29,10 +61,29 @@ pub async fn pre_conversation_hook(
_user_message: &str, _user_message: &str,
identity_state: &IdentityManagerState, identity_state: &IdentityManagerState,
) -> Result<String, String> { ) -> Result<String, String> {
// Build identity-enhanced system prompt (SOUL.md + instructions) // Check identity prompt cache first (avoids mutex + disk I/O)
// Memory context is injected by MemoryMiddleware in the kernel middleware chain, let cache = &*IDENTITY_CACHE;
// not here, to avoid duplicate injection. {
let enhanced_prompt = match build_identity_prompt(agent_id, "", identity_state).await { let guard = cache.read().await;
if let Some(cached) = guard.get(agent_id) {
// Cache hit — still need continuity context, but skip identity build
let continuity_context = build_continuity_context(agent_id, _user_message).await;
let mut result = cached.prompt.clone();
if !continuity_context.is_empty() {
result.push_str(&continuity_context);
}
debug!("[intelligence_hooks] Identity cache HIT for agent {}", agent_id);
return Ok(result);
}
}
// Cache miss — build identity prompt and continuity context in parallel
let (identity_result, continuity_context) = tokio::join!(
build_identity_prompt_cached(agent_id, "", identity_state, cache),
build_continuity_context(agent_id, _user_message)
);
let enhanced_prompt = match identity_result {
Ok(prompt) => prompt, Ok(prompt) => prompt,
Err(e) => { Err(e) => {
warn!( warn!(
@@ -43,9 +94,6 @@ pub async fn pre_conversation_hook(
} }
}; };
// Cross-session continuity: check for unresolved pain points and recent experiences
let continuity_context = build_continuity_context(agent_id, _user_message).await;
let mut result = enhanced_prompt; let mut result = enhanced_prompt;
if !continuity_context.is_empty() { if !continuity_context.is_empty() {
result.push_str(&continuity_context); result.push_str(&continuity_context);
@@ -240,6 +288,8 @@ pub async fn post_conversation_hook(
warn!("[intelligence_hooks] Failed to update soul with agent name: {}", e); warn!("[intelligence_hooks] Failed to update soul with agent name: {}", e);
} else { } else {
debug!("[intelligence_hooks] Updated agent name to '{}' in soul", name); debug!("[intelligence_hooks] Updated agent name to '{}' in soul", name);
// Invalidate cache since soul.md changed
invalidate_identity_cache(agent_id);
} }
} }
drop(manager); drop(manager);
@@ -340,21 +390,34 @@ async fn build_memory_context(
Ok(context) Ok(context)
} }
/// Build identity-enhanced system prompt /// Build identity-enhanced system prompt and cache the result.
async fn build_identity_prompt( async fn build_identity_prompt_cached(
agent_id: &str, agent_id: &str,
memory_context: &str, memory_context: &str,
identity_state: &IdentityManagerState, identity_state: &IdentityManagerState,
cache: &RwLock<HashMap<String, CachedIdentity>>,
) -> Result<String, String> { ) -> Result<String, String> {
// IdentityManagerState is Arc<tokio::sync::Mutex<AgentIdentityManager>>
// tokio::sync::Mutex::lock() returns MutexGuard directly
let mut manager = identity_state.lock().await; let mut manager = identity_state.lock().await;
// Read current soul content for hashing
let soul_content = manager.get_file(agent_id, crate::intelligence::identity::IdentityFile::Soul);
let soul_hash = content_hash(&soul_content);
let prompt = manager.build_system_prompt( let prompt = manager.build_system_prompt(
agent_id, agent_id,
if memory_context.is_empty() { None } else { Some(memory_context) }, if memory_context.is_empty() { None } else { Some(memory_context) },
).await; ).await;
// Cache the result
drop(manager); // Release lock before acquiring write guard
{
let mut guard = cache.write().await;
guard.insert(agent_id.to_string(), CachedIdentity {
prompt: prompt.clone(),
soul_hash,
});
}
Ok(prompt) Ok(prompt)
} }
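
The read-then-build caching flow in this file can be sketched synchronously. This is a simplified illustration that swaps `tokio::sync::RwLock` for `std::sync::RwLock` and drops the `soul_hash` field; `PromptCache` and its methods are hypothetical names. In the sketch `invalidate` takes a blocking write lock, whereas the diff uses `try_write()`, which leaves the stale entry in place if the lock is contended at that moment.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical cache of built prompts, keyed by agent id.
pub struct PromptCache {
    entries: RwLock<HashMap<String, String>>,
}

impl PromptCache {
    pub fn new() -> Self {
        Self { entries: RwLock::new(HashMap::new()) }
    }

    /// Return the cached prompt, or build and store it on a miss.
    pub fn get_or_build(&self, agent_id: &str, build: impl FnOnce() -> String) -> String {
        {
            let entries = self.entries.read().unwrap_or_else(|e| e.into_inner());
            if let Some(hit) = entries.get(agent_id) {
                return hit.clone();
            }
        } // read guard released here, before the write lock is taken
        let prompt = build();
        let mut entries = self.entries.write().unwrap_or_else(|e| e.into_inner());
        entries.insert(agent_id.to_string(), prompt.clone());
        prompt
    }

    /// Drop the cached prompt so the next request rebuilds it.
    pub fn invalidate(&self, agent_id: &str) {
        let mut entries = self.entries.write().unwrap_or_else(|e| e.into_inner());
        entries.remove(agent_id);
    }
}
```

Two concurrent misses may both run `build`; the later insert wins. That is harmless when the build is deterministic for a given agent.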


@@ -212,6 +212,12 @@ pub fn run() {
if let Err(e) = rt.block_on(intelligence::pain_aggregator::init_pain_storage(pool)) { if let Err(e) = rt.block_on(intelligence::pain_aggregator::init_pain_storage(pool)) {
tracing::error!("[PainStorage] Init failed: {}, pain points will not persist", e); tracing::error!("[PainStorage] Init failed: {}, pain points will not persist", e);
} }
// Initialize experience extractor for suggestion enrichment.
// Graceful degradation: failure does not block app startup.
if let Err(e) = rt.block_on(intelligence::experience::init_experience_extractor()) {
tracing::warn!("[ExperienceExtractor] Init failed: {}, suggestion context will be empty", e);
}
} }
Ok(()) Ok(())
@@ -435,6 +441,8 @@ pub fn run() {
             intelligence::pain_aggregator::butler_update_proposal_status,
             // Industry config loader
             viking_commands::viking_load_industry_keywords,
+            // Experience finder for suggestion enrichment
+            intelligence::experience::experience_find_relevant,
         ])
         .run(tauri::generate_context!())
         .expect("error while running tauri application");


@@ -665,6 +665,28 @@ function stripToolNarration(content: string): string {
   return result || content;
 }
+/**
+ * Strip dangling clarification references from text when ask_clarification tool was called.
+ * When the LLM calls ask_clarification, it often ends its text with phrases like
+ * "比如:" ("for example:") / "以下信息" ("the following information") / "以下选项" ("the following options")
+ * that reference the tool output, but the tool output is rendered in a separate
+ * ClarificationCard, so these become confusing dead-end sentences.
+ */
+function stripDanglingClarificationRef(text: string, hasClarificationTool: boolean): string {
+  if (!hasClarificationTool || !text) return text;
+  // Match trailing dangling references in Chinese and English
+  const patterns = [
+    /[,]\s*可以(?:提供以下|告诉我更多细节,)?(?:信息|选项|方向|细节|分类|类型)[:]\s*$/,
+    /[,]\s*比如[:]\s*$/,
+    /[,]\s*(?:例如|譬如|如以下)[:]\s*$/,
+    /,\s*(?:for example|such as|like|the following)[:]?\s*$/i,
+  ];
+  for (const pat of patterns) {
+    const stripped = text.replace(pat, '');
+    if (stripped !== text) return stripped;
+  }
+  return text;
+}
 function MessageBubble({ message, onRetry }: { message: Message; setInput?: (text: string) => void; onRetry?: () => void }) {
   if (message.role === 'tool') {
     return null;
@@ -749,7 +771,10 @@ function MessageBubble({ message, onRetry }: { message: Message; setInput?: (tex
           ? (isUser
             ? message.content
             : <StreamingText
-                content={stripToolNarration(message.content)}
+                content={stripDanglingClarificationRef(
+                  stripToolNarration(message.content),
+                  toolCallSteps?.some(s => s.toolName === 'ask_clarification') ?? false,
+                )}
                 isStreaming={!!message.streaming}
                 className="text-gray-700 dark:text-gray-200"
               />
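
The helper added in this file only removes a trailing phrase, and only when an `ask_clarification` step is present. A minimal standalone sketch of that behavior follows. The two patterns are a simplified subset written for illustration (they accept both ASCII and full-width punctuation, which is an assumption), and `stripDanglingRef` is a hypothetical name rather than the function from the commit.

```typescript
// Illustrative subset of trailing "dangling reference" patterns (assumed, simplified).
const danglingPatterns: RegExp[] = [
  /[,,]\s*(?:比如|例如)[::]\s*$/,
  /,\s*(?:for example|such as|like|the following)[:]?\s*$/i,
];

// Remove one trailing dangling phrase, but only when the clarification tool was used.
function stripDanglingRef(text: string, hasClarificationTool: boolean): string {
  if (!hasClarificationTool || !text) return text;
  for (const pat of danglingPatterns) {
    const stripped = text.replace(pat, '');
    // Stop at the first pattern that changed the text.
    if (stripped !== text) return stripped;
  }
  return text;
}

const withTool = stripDanglingRef('I need a bit more detail, for example:', true);
const withoutTool = stripDanglingRef('I need a bit more detail, for example:', false);
const chinese = stripDanglingRef('请补充一些信息,比如:', true);
console.log({ withTool, withoutTool, chinese });
```

Because every pattern is anchored with `$`, a phrase such as "for example" in the middle of a message is left alone; only a dead-end ending is trimmed.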


@@ -6,9 +6,10 @@ import {
   Image as ImageIcon,
   Download,
   Copy,
-  ChevronLeft,
+  ChevronDown,
   File,
 } from 'lucide-react';
+import { MarkdownRenderer } from './MarkdownRenderer';
 // ---------------------------------------------------------------------------
 // Types
@@ -76,6 +77,7 @@ export function ArtifactPanel({
   className = '',
 }: ArtifactPanelProps) {
   const [viewMode, setViewMode] = useState<'preview' | 'code'>('preview');
+  const [fileMenuOpen, setFileMenuOpen] = useState(false);
   const selected = useMemo(
     () => artifacts.find((a) => a.id === selectedId),
     [artifacts, selectedId]
@@ -135,22 +137,59 @@ export function ArtifactPanel({
   return (
     <div className={`h-full flex flex-col ${className}`}>
-      {/* File header */}
+      {/* File header with inline file selector */}
       <div className="px-4 py-2 border-b border-gray-200 dark:border-gray-700 flex items-center gap-2 flex-shrink-0">
+        <div className="relative">
           <button
-            onClick={() => onSelect('')}
-            className="p-1 rounded hover:bg-gray-100 dark:hover:bg-gray-700 text-gray-400 hover:text-gray-600 dark:hover:text-gray-200 transition-colors"
-            title="返回文件列表"
+            onClick={() => setFileMenuOpen(!fileMenuOpen)}
+            className="flex items-center gap-1.5 text-sm font-medium text-gray-700 dark:text-gray-200 truncate hover:text-orange-500 transition-colors"
+            title="切换文件"
           >
-            <ChevronLeft className="w-4 h-4" />
-          </button>
             <Icon className="w-4 h-4 text-orange-500 flex-shrink-0" />
-            <span className="text-sm font-medium text-gray-700 dark:text-gray-200 truncate flex-1">
-              {selected.name}
+            <span className="truncate max-w-[120px]">{selected.name}</span>
+            {artifacts.length > 1 && (
+              <ChevronDown className={`w-3.5 h-3.5 text-gray-400 transition-transform ${fileMenuOpen ? 'rotate-180' : ''}`} />
+            )}
+          </button>
+          {/* File selector dropdown */}
+          {fileMenuOpen && artifacts.length > 1 && (
+            <>
+              <div className="fixed inset-0 z-10" onClick={() => setFileMenuOpen(false)} />
+              <div className="absolute top-full left-0 mt-1 w-56 bg-white dark:bg-gray-800 border border-gray-200 dark:border-gray-700 rounded-lg shadow-lg z-20 py-1 max-h-60 overflow-y-auto">
+                {artifacts.map((artifact) => {
+                  const ItemIcon = getFileIcon(artifact.type);
+                  return (
+                    <button
+                      key={artifact.id}
+                      onClick={() => { onSelect(artifact.id); setFileMenuOpen(false); }}
+                      className={`w-full flex items-center gap-2 px-3 py-2 text-left text-sm hover:bg-gray-50 dark:hover:bg-gray-700 transition-colors ${
+                        artifact.id === selected.id ? 'bg-orange-50 dark:bg-orange-900/20 text-orange-700 dark:text-orange-300' : 'text-gray-700 dark:text-gray-200'
+                      }`}
+                    >
+                      <ItemIcon className="w-4 h-4 flex-shrink-0" />
+                      <span className="truncate flex-1">{artifact.name}</span>
+                      <span className={`text-[10px] px-1 py-0.5 rounded ${getTypeColor(artifact.type)}`}>
+                        {getTypeLabel(artifact.type)}
                       </span>
+                    </button>
+                  );
+                })}
+              </div>
+            </>
+          )}
+        </div>
+        <div className="flex-1" />
         <span className={`text-[10px] px-1.5 py-0.5 rounded font-medium ${getTypeColor(selected.type)}`}>
           {getTypeLabel(selected.type)}
         </span>
+        {selected.language && (
+          <span className="text-[10px] text-gray-400 dark:text-gray-500">
+            {selected.language}
+          </span>
+        )}
       </div>
       {/* View mode toggle */}
@@ -180,19 +219,7 @@ export function ArtifactPanel({
       {/* Content area */}
       <div className="flex-1 overflow-y-auto custom-scrollbar p-4">
         {viewMode === 'preview' ? (
-          <div className="prose prose-sm dark:prose-invert max-w-none">
-            {selected.type === 'markdown' ? (
-              <MarkdownPreview content={selected.content} />
-            ) : selected.type === 'code' ? (
-              <pre className="bg-gray-50 dark:bg-gray-800 rounded-lg p-3 text-xs font-mono overflow-x-auto text-gray-700 dark:text-gray-200">
-                {selected.content}
-              </pre>
-            ) : (
-              <pre className="whitespace-pre-wrap text-sm text-gray-700 dark:text-gray-200">
-                {selected.content}
-              </pre>
-            )}
-          </div>
+          <ArtifactContentPreview artifact={selected} />
         ) : (
           <pre className="bg-gray-50 dark:bg-gray-800 rounded-lg p-3 text-xs font-mono overflow-x-auto text-gray-700 dark:text-gray-200 leading-relaxed">
             {selected.content}
@@ -217,6 +244,37 @@ export function ArtifactPanel({
   );
 }
+// ---------------------------------------------------------------------------
+// ArtifactContentPreview — renders artifact based on type
+// ---------------------------------------------------------------------------
+function ArtifactContentPreview({ artifact }: { artifact: ArtifactFile }) {
+  if (artifact.type === 'markdown') {
+    return <MarkdownRenderer content={artifact.content} />;
+  }
+  if (artifact.type === 'code') {
+    return (
+      <div className="relative">
+        {artifact.language && (
+          <div className="absolute top-2 right-2 text-[10px] text-gray-400 dark:text-gray-500 bg-gray-100 dark:bg-gray-700 px-1.5 py-0.5 rounded">
+            {artifact.language}
+          </div>
+        )}
+        <pre className="bg-gray-50 dark:bg-gray-900 rounded-lg p-4 text-xs font-mono overflow-x-auto text-gray-700 dark:text-gray-200 leading-relaxed border border-gray-200 dark:border-gray-700">
+          {artifact.content}
+        </pre>
+      </div>
+    );
+  }
+  return (
+    <pre className="whitespace-pre-wrap text-sm text-gray-700 dark:text-gray-200">
+      {artifact.content}
+    </pre>
+  );
+}
 // ---------------------------------------------------------------------------
 // ActionButton
 // ---------------------------------------------------------------------------
@@ -243,50 +301,6 @@ function ActionButton({ icon, label, onClick }: { icon: React.ReactNode; label:
   );
 }
-// ---------------------------------------------------------------------------
-// Simple Markdown preview (no external deps)
-// ---------------------------------------------------------------------------
-function MarkdownPreview({ content }: { content: string }) {
-  // Basic markdown rendering: headings, bold, code blocks, lists
-  const lines = content.split('\n');
-  return (
-    <div className="space-y-2">
-      {lines.map((line, i) => {
-        // Heading
-        if (line.startsWith('### ')) {
-          return <h3 key={i} className="text-sm font-bold text-gray-800 dark:text-gray-100 mt-3">{line.slice(4)}</h3>;
-        }
-        if (line.startsWith('## ')) {
-          return <h2 key={i} className="text-base font-bold text-gray-800 dark:text-gray-100 mt-4">{line.slice(3)}</h2>;
-        }
-        if (line.startsWith('# ')) {
-          return <h1 key={i} className="text-lg font-bold text-gray-800 dark:text-gray-100">{line.slice(2)}</h1>;
-        }
-        // Code block (simplified)
-        if (line.startsWith('```')) return null;
-        // List item
-        if (line.startsWith('- ') || line.startsWith('* ')) {
-          return <li key={i} className="text-sm text-gray-700 dark:text-gray-300 ml-4">{renderInline(line.slice(2))}</li>;
-        }
-        // Empty line
-        if (!line.trim()) return <div key={i} className="h-2" />;
-        // Regular paragraph
-        return <p key={i} className="text-sm text-gray-700 dark:text-gray-300 leading-relaxed">{renderInline(line)}</p>;
-      })}
-    </div>
-  );
-}
-function renderInline(text: string): React.ReactNode {
-  // Bold
-  const parts = text.split(/\*\*(.*?)\*\*/g);
-  return parts.map((part, i) =>
-    i % 2 === 1 ? <strong key={i} className="font-semibold">{part}</strong> : part
-  );
-}
 // ---------------------------------------------------------------------------
 // Download helper
 // ---------------------------------------------------------------------------


@@ -0,0 +1,123 @@
/**
* MarkdownRenderer — shared Markdown rendering with styled components.
*
* Extracted from StreamingText.tsx so ArtifactPanel and other consumers
* can reuse the same rich rendering (GFM tables, syntax blocks, etc.)
* without duplicating the component overrides.
*/
import ReactMarkdown from 'react-markdown';
import remarkGfm from 'remark-gfm';
import type { Components } from 'react-markdown';
// ---------------------------------------------------------------------------
// Shared component overrides for react-markdown
// ---------------------------------------------------------------------------
export const markdownComponents: Components = {
pre({ children }) {
return (
<pre className="bg-gray-50 dark:bg-gray-900 rounded-lg p-4 overflow-x-auto text-sm leading-relaxed border border-gray-200 dark:border-gray-700 my-3">
{children}
</pre>
);
},
code({ className, children, ...props }) {
const isBlock = className?.startsWith('language-');
if (isBlock) {
return (
<code className={`${className || ''} text-gray-800 dark:text-gray-200`} {...props}>
{children}
</code>
);
}
return (
<code className="bg-gray-100 dark:bg-gray-800 text-gray-700 dark:text-gray-300 px-1.5 py-0.5 rounded text-[0.9em] font-mono" {...props}>
{children}
</code>
);
},
table({ children }) {
return (
<div className="overflow-x-auto my-3 -mx-1">
<table className="min-w-full border-collapse border border-gray-200 dark:border-gray-700 rounded-lg text-sm">
{children}
</table>
</div>
);
},
thead({ children }) {
return <thead className="bg-gray-50 dark:bg-gray-800/50">{children}</thead>;
},
th({ children }) {
return (
<th className="border border-gray-200 dark:border-gray-700 px-3 py-2 text-left font-semibold text-gray-700 dark:text-gray-300">
{children}
</th>
);
},
td({ children }) {
return (
<td className="border border-gray-200 dark:border-gray-700 px-3 py-2 text-gray-600 dark:text-gray-400">
{children}
</td>
);
},
ul({ children }) {
return <ul className="list-disc list-outside ml-5 my-2 space-y-1">{children}</ul>;
},
ol({ children }) {
return <ol className="list-decimal list-outside ml-5 my-2 space-y-1">{children}</ol>;
},
li({ children }) {
return <li className="leading-relaxed">{children}</li>;
},
h1({ children }) {
return <h1 className="text-xl font-bold mt-5 mb-3 text-gray-900 dark:text-gray-100 first:mt-0">{children}</h1>;
},
h2({ children }) {
return <h2 className="text-lg font-bold mt-4 mb-2 text-gray-900 dark:text-gray-100 first:mt-0">{children}</h2>;
},
h3({ children }) {
return <h3 className="text-base font-semibold mt-3 mb-2 text-gray-900 dark:text-gray-100 first:mt-0">{children}</h3>;
},
blockquote({ children }) {
return (
<blockquote className="border-l-4 border-gray-300 dark:border-gray-600 pl-4 py-1 my-3 text-gray-600 dark:text-gray-400 italic bg-gray-50 dark:bg-gray-800/30 rounded-r-lg">
{children}
</blockquote>
);
},
p({ children }) {
return <p className="my-2 leading-relaxed first:mt-0 last:mb-0">{children}</p>;
},
a({ href, children }) {
return (
<a href={href} target="_blank" rel="noopener noreferrer" className="text-blue-600 dark:text-blue-400 underline hover:text-blue-800 dark:hover:text-blue-300">
{children}
</a>
);
},
hr() {
return <hr className="my-4 border-gray-200 dark:border-gray-700" />;
},
};
// ---------------------------------------------------------------------------
// Convenience wrapper
// ---------------------------------------------------------------------------
interface MarkdownRendererProps {
content: string;
className?: string;
}
export function MarkdownRenderer({ content, className = '' }: MarkdownRendererProps) {
return (
<div className={`prose-sm prose-gray dark:prose-invert max-w-none ${className}`}>
<ReactMarkdown remarkPlugins={[remarkGfm]} components={markdownComponents}>
{content}
</ReactMarkdown>
</div>
);
}


@@ -1,7 +1,5 @@
 import { useMemo, useRef, useEffect, useState } from 'react';
-import ReactMarkdown from 'react-markdown';
-import remarkGfm from 'remark-gfm';
-import type { Components } from 'react-markdown';
+import { MarkdownRenderer } from './MarkdownRenderer';
 /**
  * Streaming text with word-by-word reveal animation.
@@ -18,111 +16,6 @@ interface StreamingTextProps {
   asMarkdown?: boolean;
 }
-// ---------------------------------------------------------------------------
-// Markdown component overrides for rich rendering
-// ---------------------------------------------------------------------------
-const markdownComponents: Components = {
-  // Code blocks (```...```)
-  pre({ children }) {
-    return (
-      <pre className="bg-gray-50 dark:bg-gray-900 rounded-lg p-4 overflow-x-auto text-sm leading-relaxed border border-gray-200 dark:border-gray-700 my-3">
-        {children}
-      </pre>
-    );
-  },
-  // Inline code (`...`)
-  code({ className, children, ...props }) {
-    // If it has a language class, it's inside a code block — render as block
-    const isBlock = className?.startsWith('language-');
-    if (isBlock) {
-      return (
-        <code className={`${className || ''} text-gray-800 dark:text-gray-200`} {...props}>
-          {children}
-        </code>
-      );
-    }
-    return (
-      <code className="bg-gray-100 dark:bg-gray-800 text-gray-700 dark:text-gray-300 px-1.5 py-0.5 rounded text-[0.9em] font-mono" {...props}>
-        {children}
-      </code>
-    );
-  },
-  // Tables
-  table({ children }) {
-    return (
-      <div className="overflow-x-auto my-3 -mx-1">
-        <table className="min-w-full border-collapse border border-gray-200 dark:border-gray-700 rounded-lg text-sm">
-          {children}
-        </table>
-      </div>
-    );
-  },
-  thead({ children }) {
-    return <thead className="bg-gray-50 dark:bg-gray-800/50">{children}</thead>;
-  },
-  th({ children }) {
-    return (
-      <th className="border border-gray-200 dark:border-gray-700 px-3 py-2 text-left font-semibold text-gray-700 dark:text-gray-300">
-        {children}
-      </th>
-    );
-  },
-  td({ children }) {
-    return (
-      <td className="border border-gray-200 dark:border-gray-700 px-3 py-2 text-gray-600 dark:text-gray-400">
-        {children}
-      </td>
-    );
-  },
-  // Unordered lists
-  ul({ children }) {
-    return <ul className="list-disc list-outside ml-5 my-2 space-y-1">{children}</ul>;
-  },
-  // Ordered lists
-  ol({ children }) {
-    return <ol className="list-decimal list-outside ml-5 my-2 space-y-1">{children}</ol>;
-  },
-  // List items
-  li({ children }) {
-    return <li className="leading-relaxed">{children}</li>;
-  },
-  // Headings
-  h1({ children }) {
-    return <h1 className="text-xl font-bold mt-5 mb-3 text-gray-900 dark:text-gray-100 first:mt-0">{children}</h1>;
-  },
-  h2({ children }) {
-    return <h2 className="text-lg font-bold mt-4 mb-2 text-gray-900 dark:text-gray-100 first:mt-0">{children}</h2>;
-  },
-  h3({ children }) {
-    return <h3 className="text-base font-semibold mt-3 mb-2 text-gray-900 dark:text-gray-100 first:mt-0">{children}</h3>;
-  },
-  // Blockquotes
-  blockquote({ children }) {
-    return (
-      <blockquote className="border-l-4 border-gray-300 dark:border-gray-600 pl-4 py-1 my-3 text-gray-600 dark:text-gray-400 italic bg-gray-50 dark:bg-gray-800/30 rounded-r-lg">
-        {children}
-      </blockquote>
-    );
-  },
-  // Paragraphs
-  p({ children }) {
-    return <p className="my-2 leading-relaxed first:mt-0 last:mb-0">{children}</p>;
-  },
-  // Links
-  a({ href, children }) {
-    return (
-      <a href={href} target="_blank" rel="noopener noreferrer" className="text-blue-600 dark:text-blue-400 underline hover:text-blue-800 dark:hover:text-blue-300">
-        {children}
-      </a>
-    );
-  },
-  // Horizontal rules
-  hr() {
-    return <hr className="my-4 border-gray-200 dark:border-gray-700" />;
-  },
-};
 // ---------------------------------------------------------------------------
 // Token splitter for streaming animation
 // ---------------------------------------------------------------------------
@@ -176,13 +69,7 @@ export function StreamingText({
 }: StreamingTextProps) {
   // For completed messages, use full markdown rendering with styled components
   if (!isStreaming && asMarkdown) {
-    return (
-      <div className={`prose-sm prose-gray dark:prose-invert max-w-none ${className}`}>
-        <ReactMarkdown remarkPlugins={[remarkGfm]} components={markdownComponents}>
-          {content}
-        </ReactMarkdown>
-      </div>
-    );
+    return <MarkdownRenderer content={content} className={className} />;
   }
   // For streaming messages, use token-by-token animation


@@ -166,7 +166,8 @@ interface ToolStepRowProps {
 }
 function ToolStepRow({ step, isActive, showConnector }: ToolStepRowProps) {
-  const [expanded, setExpanded] = useState(false);
+  // Clarification cards default to expanded so users see options immediately
+  const [expanded, setExpanded] = useState(step.toolName === 'ask_clarification');
   const Icon = getToolIcon(step.toolName);
   const label = getToolLabel(step.toolName);
   const isRunning = step.status === 'running';


@@ -8,4 +8,5 @@ export { SuggestionChips } from './SuggestionChips';
 export { ResizableChatLayout } from './ResizableChatLayout';
 export { ToolCallChain, type ToolCallStep } from './ToolCallChain';
 export { ArtifactPanel, type ArtifactFile } from './ArtifactPanel';
+export { MarkdownRenderer, markdownComponents } from './MarkdownRenderer';
 export { TokenMeter } from './TokenMeter';


@@ -696,13 +696,14 @@ export class GatewayClient {
           break;
         case 'tool_call':
-          // Tool call event
+          // Tool call start: onTool(name, input, '') — empty output signals start
           if (callbacks.onTool && data.tool) {
-            callbacks.onTool(data.tool, JSON.stringify(data.input || {}), data.output || '');
+            callbacks.onTool(data.tool, JSON.stringify(data.input || {}), '');
           }
           break;
         case 'tool_result':
+          // Tool call end: onTool(name, '', output) — empty input signals end
           if (callbacks.onTool && data.tool) {
             callbacks.onTool(data.tool, '', String(data.result || data.output || ''));
           }
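
The comments in this hunk describe a calling convention for `onTool`: a `tool_call` event passes an empty output, and a `tool_result` event passes an empty input. A consumer could tell the two apart with a check like the one sketched here. `ToolEvent` and `classifyToolCallback` are hypothetical names that do not appear in the commit. The sketch keys off the empty input, on the assumption that the start path always sends at least `'{}'` (as `JSON.stringify(data.input || {})` suggests).

```typescript
type ToolPhase = 'start' | 'end';

interface ToolEvent {
  phase: ToolPhase;
  tool: string;
  // JSON-encoded input for a start event, raw output for an end event.
  payload: string;
}

// Classify one onTool(name, input, output) call.
// An empty input marks the end of a call; anything else is treated as its start.
function classifyToolCallback(tool: string, input: string, output: string): ToolEvent {
  if (input === '') {
    return { phase: 'end', tool, payload: output };
  }
  return { phase: 'start', tool, payload: input };
}

const started = classifyToolCallback('file_write', '{"path":"notes.md"}', '');
const finished = classifyToolCallback('file_write', '', '{"path":"notes.md"}');
const finishedEmpty = classifyToolCallback('file_write', '', '');
console.log(started.phase, finished.phase, finishedEmpty.phase);
```

Checking the input rather than the output keeps a tool that finishes with an empty result from being mistaken for a new call.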


@@ -646,18 +646,25 @@ const HARDCODED_PROMPTS: Record<string, { system: string; user: (arg: string) =>
   },
   suggestions: {
-    system: `你是对话分析助手。根据最近的对话内容,生成 3 个用户可能想继续探讨的问题
-要求
-- 每个问题必须与对话内容直接相关,具体且有针对性
-- 帮助用户深入理解、实际操作或拓展思路
-- 每个问题不超过 30 个中文字符
-- 不要重复对话中已讨论过的内容
-- 使用与用户相同的语言
+    system: `你是 ZCLAW 的管家助手,需要站在用户角度思考他们真正需要什么,生成 3 个个性化建议
+## 生成规则
+1. 第 1 条 — 深入追问:基于当前话题,提出一个有洞察力的追问,帮助用户深入探索
+2. 第 2 条 — 实用行动:建议一个具体的、可操作的下一步(调用技能、执行工具、查看数据等)
+3. 第 3 条 — 管家关怀:
+- 如果有未解决痛点 → 回访建议,如"上次提到的X后来解决了吗"
+- 如果有相关经验 → 引导复用,如"上次用X方法解决了类似问题要再试试吗"
+- 如果有匹配技能 → 推荐使用,如"试试 [技能名] 来处理这个"
+- 如果没有提供痛点/经验/技能信息 → 给出一个启发性的思考角度
+4. 每个不超过 30 个中文字符
+5. 不要重复对话中已讨论过的内容
+6. 不要生成空泛的建议(如"继续分析"、"换个角度"
+7. 默认使用中文,不要混入英文词汇(如"workflow"用"工作流"、"report"用"报表"),除非用户在对话中明确使用英文
+8. 建议会被用户直接点击发送,因此不要包含任何称谓(如"领导"、"老板"、"老师"等),用无主语的问句或陈述句
 只输出 JSON 数组,包含恰好 3 个字符串。不要输出任何其他内容。
-示例:["如何在生产环境中部署?", "这个方案的成本如何?", "有没有更简单的替代方案?"]`,
+示例:["科室绩效分析可以按哪些维度拆解?", "用研究技能查一下相关文献?", "上次提到的排班冲突问题,需要继续想解决方案"]`,
-    user: (context: string) => `以下是对话中最近的消息:\n\n${context}\n\n请生成 3 个后续问题`,
+    user: (context: string) => `以下是对话中最近的消息:\n\n${context}\n\n请生成 3 个后续建议1 深入追问 + 1 实用行动 + 1 管家关怀)`,
   },
 };
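
Both versions of the prompt ask the model to output only a JSON array containing exactly 3 strings. The caller still has to cope with replies that ignore that instruction. The following is a minimal sketch of such a check; `parseSuggestions` is a hypothetical helper, and how the real code parses the reply is not shown in this diff.

```typescript
// Parse a model reply that is expected to be a JSON array of exactly 3 strings.
// Returns null when the reply does not match, so the caller can fall back.
function parseSuggestions(raw: string): string[] | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw.trim());
  } catch {
    return null; // not valid JSON at all
  }
  if (!Array.isArray(parsed) || parsed.length !== 3) return null;

  const items: string[] = [];
  for (const item of parsed) {
    // Reject non-strings and blank entries rather than guessing a repair.
    if (typeof item !== 'string' || item.trim() === '') return null;
    items.push(item.trim());
  }
  return items;
}

const ok = parseSuggestions('["How to deploy?", "What does it cost?", "Any simpler option?"]');
const tooFew = parseSuggestions('["only", "two"]');
const notJson = parseSuggestions('Here are three ideas: ...');
console.log(ok, tooFew, notJson);
```

Returning `null` instead of throwing lets the caller fall back to an empty suggestion list when the model adds extra prose around the array.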


@@ -0,0 +1,131 @@
/**
* Suggestion context enrichment — fetches intelligence data for personalized suggestions.
* All fetches are optional; failures silently degrade to empty context.
*/
import { invoke } from '@tauri-apps/api/core';
import { createLogger } from './logger';
const log = createLogger('SuggestionContext');
const CONTEXT_FETCH_TIMEOUT = 500;
/** Pain point from butler intelligence layer. */
interface PainPoint {
summary: string;
category: string;
confidence: number;
status: string;
occurrence_count: number;
}
/** Brief experience from the experience store. */
interface ExperienceBrief {
pain_pattern: string;
solution_summary: string;
reuse_count: number;
}
/** Pipeline/skill match candidate. */
interface PipelineCandidateInfo {
id: string;
display_name: string;
description: string;
category: string | null;
match_reason: string | null;
}
/** Route intent response (only NoMatch variant has suggestions). */
interface RouteResultResponse {
type: 'Matched' | 'Ambiguous' | 'NoMatch' | 'NeedMoreInfo';
suggestions?: PipelineCandidateInfo[];
}
/** Aggregated suggestion context from all intelligence sources. */
export interface SuggestionContext {
userProfile: string;
painPoints: string;
experiences: string;
skillMatch: string;
}
function isTauriAvailable(): boolean {
return typeof window !== 'undefined' && '__TAURI_INTERNALS__' in window;
}
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T | null> {
return Promise.race([
promise,
new Promise<null>(resolve => setTimeout(() => resolve(null), ms)),
]);
}
async function fetchUserProfile(agentId: string): Promise<string> {
const profile = await invoke<string>('identity_get_file', {
agentId,
file: 'userprofile',
});
if (!profile || profile.trim().length === 0) return '';
const text = profile.trim();
return text.length > 200 ? text.slice(0, 200) : text;
}
async function fetchPainPoints(agentId: string): Promise<string> {
const points = await invoke<PainPoint[]>('butler_list_pain_points', { agentId });
if (!Array.isArray(points) || points.length === 0) return '';
const active = points
.filter(p => p.confidence >= 0.5 && p.status !== 'Solved' && p.status !== 'Dismissed')
.sort((a, b) => b.confidence - a.confidence)
.slice(0, 3);
if (active.length === 0) return '';
return active
.map((p, i) => `${i + 1}. [${p.category}] ${p.summary}(出现${p.occurrence_count}次)`)
.join('\n');
}
async function fetchExperiences(agentId: string, query: string): Promise<string> {
const experiences = await invoke<ExperienceBrief[]>('experience_find_relevant', {
agentId,
query,
});
if (!Array.isArray(experiences) || experiences.length === 0) return '';
return experiences.slice(0, 2)
.map(e => `上次解决"${e.pain_pattern}"的方法:${e.solution_summary}(已复用${e.reuse_count}次)`)
.join('\n');
}
async function fetchSkillMatch(userInput: string): Promise<string> {
const result = await invoke<RouteResultResponse>('route_intent', { userInput });
const suggestions = result?.suggestions;
if (!Array.isArray(suggestions) || suggestions.length === 0) return '';
const best = suggestions[0];
return `你可能需要:${best.display_name}${best.description}`;
}
const EMPTY_CONTEXT: SuggestionContext = { userProfile: '', painPoints: '', experiences: '', skillMatch: '' };
/**
* Fetch all intelligence context in parallel for suggestion enrichment.
* Returns empty strings for any source that fails — never throws.
*/
export async function fetchSuggestionContext(
agentId: string,
lastUserMessage: string,
): Promise<SuggestionContext> {
if (!isTauriAvailable()) {
return EMPTY_CONTEXT;
}
const [userProfile, painPoints, experiences, skillMatch] = await Promise.all([
withTimeout(fetchUserProfile(agentId).catch(e => { log.warn('User profile fetch failed:', e); return ''; }), CONTEXT_FETCH_TIMEOUT),
withTimeout(fetchPainPoints(agentId).catch(e => { log.warn('Pain points fetch failed:', e); return ''; }), CONTEXT_FETCH_TIMEOUT),
withTimeout(fetchExperiences(agentId, lastUserMessage).catch(e => { log.warn('Experiences fetch failed:', e); return ''; }), CONTEXT_FETCH_TIMEOUT),
withTimeout(fetchSkillMatch(lastUserMessage).catch(e => { log.warn('Skill match fetch failed:', e); return ''; }), CONTEXT_FETCH_TIMEOUT),
]);
return { userProfile: userProfile ?? '', painPoints: painPoints ?? '', experiences: experiences ?? '', skillMatch: skillMatch ?? '' };
}
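
The degradation behavior of `fetchSuggestionContext` comes from combining a per-source `.catch` with the `withTimeout` race, so one slow or failing source never blocks the others. A minimal sketch of that combination follows. The three sources are hypothetical stand-ins for the Tauri `invoke` calls, and the 50 ms limit is chosen only to keep the example quick.

```typescript
// Resolve with null if the promise does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T | null> {
  return Promise.race([
    promise,
    new Promise<null>(resolve => setTimeout(() => resolve(null), ms)),
  ]);
}

// Hypothetical sources standing in for the backend calls.
const fastSource = (): Promise<string> => Promise.resolve('profile text');
const slowSource = (): Promise<string> =>
  new Promise(resolve => setTimeout(() => resolve('arrives too late'), 200));
const failingSource = (): Promise<string> => Promise.reject(new Error('backend unavailable'));

async function gatherContext(): Promise<{ fast: string; slow: string; failing: string }> {
  const [fast, slow, failing] = await Promise.all([
    // .catch turns a rejection into '', withTimeout turns a stall into null.
    withTimeout(fastSource().catch(() => ''), 50),
    withTimeout(slowSource().catch(() => ''), 50),
    withTimeout(failingSource().catch(() => ''), 50),
  ]);
  // Normalize the timeout marker (null) to the same empty string.
  return { fast: fast ?? '', slow: slow ?? '', failing: failing ?? '' };
}

const contextPromise = gatherContext();
contextPromise.then(ctx => console.log(ctx));
```

The race only stops waiting; it does not cancel the underlying call, so a timed-out request still runs to completion in the background.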


@@ -1,13 +1,13 @@
 /**
- * ArtifactStore — manages the artifact panel state.
+ * ArtifactStore — manages the artifact panel state with IndexedDB persistence.
  *
  * Extracted from chatStore.ts as part of the structured refactor.
- * This store has zero external dependencies — the simplest slice to extract.
- *
- * @see docs/superpowers/specs/2026-04-02-chatstore-refactor-design.md §3.5
+ * Uses zustand/middleware persist + idb-storage for persistence across refreshes.
  */
 import { create } from 'zustand';
+import { persist, createJSONStorage } from 'zustand/middleware';
+import { createIdbStorageAdapter } from '../../lib/idb-storage';
 import type { ArtifactFile } from '../../components/ai/ArtifactPanel';
 // ---------------------------------------------------------------------------
@@ -33,7 +33,9 @@ export interface ArtifactState {
 // Store
 // ---------------------------------------------------------------------------
-export const useArtifactStore = create<ArtifactState>()((set) => ({
+export const useArtifactStore = create<ArtifactState>()(
+  persist(
+    (set) => ({
       artifacts: [],
       selectedArtifactId: null,
       artifactPanelOpen: false,
@@ -51,4 +53,13 @@ export const useArtifactStore = create<ArtifactState>()((set) => ({
       clearArtifacts: () =>
         set({ artifacts: [], selectedArtifactId: null, artifactPanelOpen: false }),
-}));
+    }),
+    {
+      name: 'zclaw-artifact-storage',
+      storage: createJSONStorage(() => createIdbStorageAdapter()),
+      partialize: (state) => ({
+        artifacts: state.artifacts,
+      }),
+    },
+  ),
+);
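
The `partialize` option in this hunk limits what is written to storage to the `artifacts` array, so the selection and the panel's open state start from their defaults after a reload. The sketch below imitates that round trip without zustand, assuming the storage layer serializes with plain `JSON.stringify` and `JSON.parse`. `StoredArtifact`, `PanelState` and the sample data are hypothetical.

```typescript
interface StoredArtifact {
  id: string;
  name: string;
  createdAt: Date;
}

interface PanelState {
  artifacts: StoredArtifact[];
  selectedArtifactId: string | null;
  artifactPanelOpen: boolean;
}

// Same role as the `partialize` option: pick the slice that is written to storage.
function partialize(state: PanelState): Pick<PanelState, 'artifacts'> {
  return { artifacts: state.artifacts };
}

const live: PanelState = {
  artifacts: [{ id: 'artifact_1', name: 'notes.md', createdAt: new Date(0) }],
  selectedArtifactId: 'artifact_1',
  artifactPanelOpen: true,
};

// Simulate a write followed by a reload, assuming plain JSON serialization.
const serialized = JSON.stringify(partialize(live));
const restored: Record<string, unknown> = JSON.parse(serialized);
const restoredArtifacts = restored.artifacts as Array<Record<string, unknown>>;

console.log(Object.keys(restored));                 // keys of the persisted slice
console.log(typeof restoredArtifacts[0].createdAt); // type of a Date field after the round trip
```

Under that assumption, a `Date` field such as `createdAt` comes back as an ISO string rather than a `Date` object, which may matter to code that calls date methods on restored artifacts.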


@@ -34,11 +34,16 @@ import {
 } from './conversationStore';
 import { useMessageStore } from './messageStore';
 import { useArtifactStore } from './artifactStore';
-import { llmSuggest } from '../../lib/llm-service';
+import { llmSuggest, LLM_PROMPTS } from '../../lib/llm-service';
 import { detectNameSuggestion, detectAgentNameSuggestion } from '../../lib/cold-start-mapper';
+import { fetchSuggestionContext, type SuggestionContext } from '../../lib/suggestion-context';
 const log = createLogger('StreamStore');
+// Module-level prefetch for suggestion context — started during streaming,
+// consumed on stream completion. Saves ~0.5-1s vs fetching after stream ends.
+let _activeSuggestionContextPrefetch: Promise<SuggestionContext> | null = null;
 // ---------------------------------------------------------------------------
 // Error formatting — convert raw LLM/API errors to user-friendly messages
 // ---------------------------------------------------------------------------
@@ -214,6 +219,67 @@ class DeltaBuffer {
   }
 }
+// ---------------------------------------------------------------------------
+// Artifact creation from tool output (shared between sendMessage & agent stream)
+// ---------------------------------------------------------------------------
+const ARTIFACT_TYPE_MAP: Record<string, 'code' | 'markdown' | 'text' | 'table' | 'image'> = {
+  ts: 'code', tsx: 'code', js: 'code', jsx: 'code',
+  py: 'code', rs: 'code', go: 'code', java: 'code',
+  md: 'markdown', txt: 'text', json: 'code',
+  html: 'code', css: 'code', sql: 'code', sh: 'code',
+  yaml: 'code', yml: 'code', toml: 'code', xml: 'code',
+  csv: 'table', svg: 'image',
+};
+const ARTIFACT_LANG_MAP: Record<string, string> = {
+  ts: 'typescript', tsx: 'typescript', js: 'javascript', jsx: 'javascript',
+  py: 'python', rs: 'rust', go: 'go', java: 'java',
+  html: 'html', css: 'css', sql: 'sql', sh: 'bash',
+  json: 'json', yaml: 'yaml', yml: 'yaml', toml: 'toml',
+  xml: 'xml', csv: 'csv', md: 'markdown', txt: 'text',
+};
+/** Attempt to create an artifact from a completed tool call. */
+function tryCreateArtifactFromToolOutput(toolName: string, toolInput: string, toolOutput: string): void {
+  if (!toolOutput) return;
+  const toolsWithArtifacts = ['file_write', 'write_file', 'str_replace', 'str_replace_editor'];
+  if (!toolsWithArtifacts.includes(toolName)) return;
+  try {
+    const parsed = JSON.parse(toolOutput);
+    const filePath = parsed?.path || parsed?.file_path || '';
+    let content = parsed?.content || '';
+    // For str_replace tools, content may be in input
+    if (!content && toolInput) {
+      try {
+        const inputParsed = JSON.parse(toolInput);
+        content = inputParsed?.new_text || inputParsed?.content || '';
+      } catch { /* ignore */ }
+    }
+    if (!filePath || !content) return;
+    // Deduplicate: skip if an artifact with the same path already exists
+    const existing = useArtifactStore.getState().artifacts;
+    if (existing.some(a => a.name === filePath.split('/').pop())) return;
+    const fileName = filePath.split('/').pop() || filePath;
+    const ext = fileName.split('.').pop()?.toLowerCase() || '';
+    useArtifactStore.getState().addArtifact({
+      id: `artifact_${Date.now()}`,
+      name: fileName,
+      content: typeof content === 'string' ? content : JSON.stringify(content, null, 2),
+      type: ARTIFACT_TYPE_MAP[ext] || 'text',
+      language: ARTIFACT_LANG_MAP[ext],
+      createdAt: new Date(),
+    });
+  } catch { /* non-critical: artifact creation from tool output */ }
+}
 // ---------------------------------------------------------------------------
 // Stream event handlers (extracted from sendMessage)
 // ---------------------------------------------------------------------------
@@ -236,38 +302,8 @@ function createToolHandler(assistantId: string, chat: ChatStoreAccess) {
}) })
); );
// Auto-create artifact when file_write tool produces output // Auto-create artifact from tool output
if (tool === 'file_write') { tryCreateArtifactFromToolOutput(tool, input, output);
try {
const parsed = JSON.parse(output);
const filePath = parsed?.path || parsed?.file_path || '';
const content = parsed?.content || '';
if (filePath && content) {
const fileName = filePath.split('/').pop() || filePath;
const ext = fileName.split('.').pop()?.toLowerCase() || '';
const typeMap: Record<string, 'code' | 'markdown' | 'text'> = {
ts: 'code', tsx: 'code', js: 'code', jsx: 'code',
py: 'code', rs: 'code', go: 'code', java: 'code',
md: 'markdown', txt: 'text', json: 'code',
html: 'code', css: 'code', sql: 'code', sh: 'code',
};
const langMap: Record<string, string> = {
ts: 'typescript', tsx: 'typescript', js: 'javascript', jsx: 'javascript',
py: 'python', rs: 'rust', go: 'go', java: 'java',
html: 'html', css: 'css', sql: 'sql', sh: 'bash', json: 'json',
};
useArtifactStore.getState().addArtifact({
id: `artifact_${Date.now()}`,
name: fileName,
content: typeof content === 'string' ? content : JSON.stringify(content, null, 2),
type: typeMap[ext] || 'text',
language: langMap[ext],
createdAt: new Date(),
sourceStepId: assistantId,
});
}
} catch { /* non-critical: artifact creation from tool output */ }
}
} else { } else {
// toolStart: create new running step // toolStart: create new running step
const step: ToolCallStep = { const step: ToolCallStep = {
@@ -399,37 +435,51 @@ function createCompleteHandler(
} }
} }
// Async memory extraction (independent — failures don't block name detection) // Decoupled: suggestion generation runs immediately with prefetched context,
// memory extraction + reflection run independently in background.
const filtered = msgs const filtered = msgs
.filter(m => m.role === 'user' || m.role === 'assistant') .filter(m => m.role === 'user' || m.role === 'assistant')
.map(m => ({ role: m.role, content: m.content })); .map(m => ({ role: m.role, content: m.content }));
const convId = useConversationStore.getState().currentConversationId; const convId = useConversationStore.getState().currentConversationId;
getMemoryExtractor().extractFromConversation(filtered, agentId, convId ?? undefined)
.catch(err => log.warn('Memory extraction failed:', err));
intelligenceClient.reflection.recordConversation().catch(err => { // Build conversation messages for suggestions
log.warn('Recording conversation failed:', err);
});
intelligenceClient.reflection.shouldReflect().then(shouldReflect => {
if (shouldReflect) {
intelligenceClient.reflection.reflect(agentId, []).catch(err => {
log.warn('Reflection failed:', err);
});
}
});
// Follow-up suggestions (async LLM call with keyword fallback)
const latestMsgs = chat.getMessages() || []; const latestMsgs = chat.getMessages() || [];
const conversationMessages = latestMsgs const conversationMessages = latestMsgs
.filter(m => m.role === 'user' || m.role === 'assistant') .filter(m => m.role === 'user' || m.role === 'assistant')
.filter(m => !m.streaming) .filter(m => !m.streaming)
.map(m => ({ role: m.role, content: m.content })); .map(m => ({ role: m.role, content: m.content }));
generateLLMSuggestions(conversationMessages, set).catch(err => { // Consume prefetched context (started in sendMessage during streaming)
const prefetchPromise = _activeSuggestionContextPrefetch;
_activeSuggestionContextPrefetch = null;
// Fire suggestion generation immediately — don't wait for memory extraction
const fireSuggestions = (ctx?: SuggestionContext) => {
generateLLMSuggestions(conversationMessages, set, ctx).catch(err => {
log.warn('Suggestion generation error:', err); log.warn('Suggestion generation error:', err);
set({ suggestionsLoading: false }); set({ suggestionsLoading: false });
}); });
}; };
if (prefetchPromise) {
prefetchPromise.then(fireSuggestions).catch(() => fireSuggestions());
} else {
fireSuggestions();
}
// Background tasks run independently — never block suggestions
getMemoryExtractor().extractFromConversation(filtered, agentId, convId ?? undefined)
.catch(err => log.warn('Memory extraction failed:', err));
intelligenceClient.reflection.recordConversation()
.catch(err => log.warn('Recording conversation failed:', err))
.then(() => intelligenceClient.reflection.shouldReflect())
.then(shouldReflect => {
if (shouldReflect) {
intelligenceClient.reflection.reflect(agentId, []).catch(err => {
log.warn('Reflection failed:', err);
});
}
}).catch(() => {});
};
} }
export interface StreamState { export interface StreamState {
@@ -559,15 +609,32 @@ function parseSuggestionResponse(raw: string): string[] {
async function generateLLMSuggestions( async function generateLLMSuggestions(
messages: Array<{ role: string; content: string }>, messages: Array<{ role: string; content: string }>,
set: (partial: Partial<StreamState>) => void, set: (partial: Partial<StreamState>) => void,
context?: SuggestionContext,
): Promise<void> { ): Promise<void> {
set({ suggestionsLoading: true }); set({ suggestionsLoading: true });
try { try {
const recentMessages = messages.slice(-6); const recentMessages = messages.slice(-20);
const context = recentMessages const conversationContext = recentMessages
.map(m => `${m.role === 'user' ? '用户' : '助手'}: ${m.content}`) .map(m => `${m.role === 'user' ? '用户' : '助手'}: ${m.content.slice(0, 200)}`)
.join('\n\n'); .join('\n\n');
// Build dynamic user message with intelligence context
const ctx = context ?? { userProfile: '', painPoints: '', experiences: '', skillMatch: '' };
const hasContext = ctx.userProfile || ctx.painPoints || ctx.experiences || ctx.skillMatch;
let userMessage: string;
if (hasContext) {
const sections: string[] = ['以下是用户的背景信息,请在生成建议时参考:\n'];
if (ctx.userProfile) sections.push(`## 用户画像\n${ctx.userProfile}`);
if (ctx.painPoints) sections.push(`## 活跃痛点\n${ctx.painPoints}`);
if (ctx.experiences) sections.push(`## 相关经验\n${ctx.experiences}`);
if (ctx.skillMatch) sections.push(`## 可用技能\n${ctx.skillMatch}`);
sections.push(`\n最近对话\n${conversationContext}`);
userMessage = sections.join('\n\n');
} else {
userMessage = `以下是对话中最近的消息:\n\n${conversationContext}\n\n请生成 3 个后续问题。`;
}
const connectionMode = typeof localStorage !== 'undefined' const connectionMode = typeof localStorage !== 'undefined'
? localStorage.getItem('zclaw-connection-mode') ? localStorage.getItem('zclaw-connection-mode')
: null; : null;
@@ -575,9 +642,9 @@ async function generateLLMSuggestions(
let raw: string; let raw: string;
if (connectionMode === 'saas') { if (connectionMode === 'saas') {
raw = await llmSuggestViaSaaS(context); raw = await llmSuggestViaSaaS(userMessage);
} else { } else {
raw = await llmSuggest(context); raw = await llmSuggest(userMessage);
} }
const suggestions = parseSuggestionResponse(raw); const suggestions = parseSuggestionResponse(raw);
@@ -601,7 +668,7 @@ async function generateLLMSuggestions(
* with non-streaming requests. Collects the full response from SSE deltas, * with non-streaming requests. Collects the full response from SSE deltas,
* then parses the suggestion JSON from the accumulated text. * then parses the suggestion JSON from the accumulated text.
*/ */
async function llmSuggestViaSaaS(context: string): Promise<string> { async function llmSuggestViaSaaS(userMessage: string): Promise<string> {
const { saasClient } = await import('../../lib/saas-client'); const { saasClient } = await import('../../lib/saas-client');
const { useConversationStore } = await import('./conversationStore'); const { useConversationStore } = await import('./conversationStore');
const { useSaaSStore } = await import('../saasStore'); const { useSaaSStore } = await import('../saasStore');
@@ -611,9 +678,6 @@ async function llmSuggestViaSaaS(context: string): Promise<string> {
const model = currentModel || (availableModels.length > 0 ? availableModels[0]?.id : undefined); const model = currentModel || (availableModels.length > 0 ? availableModels[0]?.id : undefined);
if (!model) throw new Error('No model available for suggestions'); if (!model) throw new Error('No model available for suggestions');
// Delay to avoid concurrent relay requests with memory extraction
await new Promise(r => setTimeout(r, 2000));
const controller = new AbortController(); const controller = new AbortController();
const timeoutId = setTimeout(() => controller.abort(), 60000); const timeoutId = setTimeout(() => controller.abort(), 60000);
@@ -623,7 +687,7 @@ async function llmSuggestViaSaaS(context: string): Promise<string> {
model, model,
messages: [ messages: [
{ role: 'system', content: LLM_PROMPTS_SYSTEM }, { role: 'system', content: LLM_PROMPTS_SYSTEM },
{ role: 'user', content: `以下是对话中最近的消息:\n\n${context}\n\n请生成 3 个后续问题。` }, { role: 'user', content: userMessage },
], ],
max_tokens: 500, max_tokens: 500,
temperature: 0.7, temperature: 0.7,
@@ -664,17 +728,7 @@ async function llmSuggestViaSaaS(context: string): Promise<string> {
} }
} }
const LLM_PROMPTS_SYSTEM = `你是对话分析助手。根据最近的对话内容,生成 3 个用户可能想继续探讨的问题。 const LLM_PROMPTS_SYSTEM = LLM_PROMPTS.suggestions.system;
要求:
- 每个问题必须与对话内容直接相关,具体且有针对性
- 帮助用户深入理解、实际操作或拓展思路
- 每个问题不超过 30 个中文字符
- 不要重复对话中已讨论过的内容
- 使用与用户相同的语言
只输出 JSON 数组,包含恰好 3 个字符串。不要输出任何其他内容。
示例:["如何在生产环境中部署?", "这个方案的成本如何?", "有没有更简单的替代方案?"]`;
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
// ChatStore injection (avoids circular imports) // ChatStore injection (avoids circular imports)
@@ -786,6 +840,9 @@ export const useStreamStore = create<StreamState>()(
}); });
set({ isStreaming: true, activeRunId: null }); set({ isStreaming: true, activeRunId: null });
// Prefetch suggestion context during streaming — saves ~0.5-1s post-stream
_activeSuggestionContextPrefetch = fetchSuggestionContext(agentId, content);
// Delta buffer — batches updates at ~60fps // Delta buffer — batches updates at ~60fps
const buffer = new DeltaBuffer(assistantId, _chat); const buffer = new DeltaBuffer(assistantId, _chat);
@@ -1001,6 +1058,13 @@ export const useStreamStore = create<StreamState>()(
return { ...m, toolSteps: steps }; return { ...m, toolSteps: steps };
}) })
); );
// Auto-create artifact from tool output (agent stream path)
tryCreateArtifactFromToolOutput(
delta.tool || 'unknown',
delta.toolInput || '',
delta.toolOutput,
);
} else { } else {
// toolStart: create new running step // toolStart: create new running step
const step: ToolCallStep = { const step: ToolCallStep = {
@@ -1059,10 +1123,20 @@ export const useStreamStore = create<StreamState>()(
.filter(m => !m.streaming) .filter(m => !m.streaming)
.map(m => ({ role: m.role, content: m.content })); .map(m => ({ role: m.role, content: m.content }));
generateLLMSuggestions(conversationMessages, set).catch(err => { // Path B: use prefetched context for agent stream — fixes zero-personalization
const prefetchPromise = _activeSuggestionContextPrefetch;
_activeSuggestionContextPrefetch = null;
const fireSuggestions = (ctx?: SuggestionContext) => {
generateLLMSuggestions(conversationMessages, set, ctx).catch(err => {
log.warn('Suggestion generation error:', err); log.warn('Suggestion generation error:', err);
set({ suggestionsLoading: false }); set({ suggestionsLoading: false });
}); });
};
if (prefetchPromise) {
prefetchPromise.then(fireSuggestions).catch(() => fireSuggestions());
} else {
fireSuggestions();
}
} }
} }
} else if (delta.stream === 'hand') { } else if (delta.stream === 'hand') {


@@ -0,0 +1,309 @@
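# Artifact System Reference
> A survey of how DeerFlow and Hermes Agent implement their artifact/output panels, as a reference for refactoring the ZCLAW artifact system.
> Analysis date: 2026-04-24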
---
## 1. DeerFlow Artifact System
DeerFlow has a complete full-stack artifact pipeline and is the primary reference.
### 1.1 End-to-End Data Flow
```
Agent tool call (write_file / str_replace / present_files)
Backend: ThreadState.artifacts (LangGraph annotated list, deduplicated by the merge_artifacts reducer)
↓ File written to: {base_dir}/threads/{thread_id}/user-data/outputs/
↓ Virtual path: /mnt/user-data/outputs/filename.ext
Backend API: GET /api/threads/{thread_id}/artifacts/{virtual_path}
↓ MIME detection / .skill ZIP extraction / download vs inline
Frontend: thread.values.artifacts (string[]) → ArtifactsProvider context
ChatBox (ResizablePanelGroup) → chat(60%) | artifact panel(40%)
ArtifactFileDetail → CodeMirror (code) / Streamdown (Markdown) / iframe (HTML)
```
### 1.2 Key Files
#### Frontend Core
| File | Responsibility |
|------|------|
| `frontend/src/core/artifacts/utils.ts` | URL building, artifact list extraction, path parsing |
| `frontend/src/core/artifacts/loader.ts` | Fetches artifact text from the backend API; extracts content directly from tool call args |
| `frontend/src/core/artifacts/hooks.ts` | TanStack React Query hook with a 5-minute cache |
| `frontend/src/components/workspace/artifacts/context.tsx` | ArtifactsProvider + useArtifacts(): manages the list, selection, open/close state, and auto-select |
| `frontend/src/components/workspace/artifacts/artifact-file-detail.tsx` | Artifact detail view: header (file selector + code/preview toggle) + CodeEditor/Preview |
| `frontend/src/components/workspace/artifacts/artifact-file-list.tsx` | Card-style list view; each card has an icon, name, extension, and download/install buttons |
| `frontend/src/components/workspace/artifacts/artifact-trigger.tsx` | Header trigger button, shown only when artifacts exist |
#### Frontend Rendering
| File | Responsibility |
|------|------|
| `frontend/src/components/workspace/code-editor.tsx` | Read-only CodeMirror editor with syntax highlighting for CSS/HTML/JS/JSON/MD/Python |
| `frontend/src/components/ai-elements/code-block.tsx` | Shiki syntax-highlighted code block, dual theme (light/dark) |
| `frontend/src/components/ai-elements/web-preview.tsx` | iframe web preview with an address bar and navigation buttons |
| `frontend/src/components/workspace/messages/markdown-content.tsx` | Streamdown Markdown rendering (GFM + Math + Raw HTML + KaTeX) |
| `frontend/src/core/utils/files.tsx` | 140+ extension → language mappings, file icon and type detection |
#### Backend
| File | Responsibility |
|------|------|
| `backend/.../thread_state.py` | ThreadState.artifacts list + merge_artifacts deduplicating reducer |
| `backend/.../present_file_tool.py` | present_files tool: normalizes paths and returns Command(update) |
| `backend/.../paths.py` | Path management: threads/{id}/user-data/{workspace,uploads,outputs} |
| `backend/app/gateway/routers/artifacts.py` | FastAPI route: GET artifact files, MIME detection, security handling |
### 1.3 Supported Content Types
| Type | Rendering |
|------|----------|
| Code files (140+ extensions) | Read-only CodeMirror + syntax highlighting |
| Markdown (.md) | Streamdown (GFM + Math + KaTeX + Raw HTML) |
| HTML (.html/.htm) | Sandboxed `<iframe>` (srcDoc) |
| Images (.png/.jpg/.svg/.webp) | `<img>` tag; non-code files use an iframe |
| .skill archive | ZIP extraction; SKILL.md rendered as Markdown |
| Binary files (PDF, etc.) | Backend inline Content-Disposition |
| Text files (.txt/.csv/.log) | CodeMirror plain-text mode |
### 1.4 Persistence Architecture
**Disk storage:**
```
{DEER_FLOW_HOME}/threads/{thread_id}/user-data/outputs/
```
**State persistence:** the artifacts list is part of the LangGraph ThreadState and is persisted automatically by the checkpoint system.
**Frontend cache:** TanStack React Query with a 5-minute stale time.
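### 1.5 UI/UX Design Patterns
#### Split Layout (chat-box.tsx)
- Horizontal split using `react-resizable-panels`
- Closed: chat=100%, artifacts=0%
- Open: chat=60%, artifacts=40%
- 300ms CSS transition
#### Auto-Open + Auto-Select
- When a `write_file` / `str_replace` tool call is detected, the panel opens automatically and the file is selected
- The `autoOpen` / `autoSelect` flags prevent the panel from reopening after the user closes it manually
#### Code/Preview Toggle
- HTML/Markdown default to Preview; everything else defaults to Code
- Preview uses Streamdown (MD) or an iframe (HTML)
#### Header Action Bar
- File selector dropdown (switch files without returning to the list)
- Copy / download / open in new window / close
#### Inline Display in Chat
- `present_files` tool call → a card grid rendered inside the chat stream
- Clicking a card → opens that file in the side panel
#### Dual-Path Scheme
1. **Real file path**: fetched from the backend API and cached by React Query
2. **`write-file:` virtual path**: content is extracted directly from the tool call args, so no backend request is needed and streaming display is supported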
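
The dual-path scheme can be sketched as a small classifier that decides where an artifact's content should come from. This is only an illustration: the `write-file:` prefix is taken from the notes above, while `ArtifactSource` and `classifyArtifactPath` are hypothetical names, not DeerFlow's actual API.

```typescript
// Hypothetical types: only the "write-file:" prefix comes from the notes above.
type ArtifactSource =
  | { kind: "inline"; path: string } // content is read from the tool call args
  | { kind: "remote"; path: string }; // content must be fetched from the backend API

const WRITE_FILE_PREFIX = "write-file:";

/** Decide whether an artifact path is a virtual (inline) path or a real file path. */
function classifyArtifactPath(raw: string): ArtifactSource {
  if (raw.startsWith(WRITE_FILE_PREFIX)) {
    // Strip the prefix so the remaining part can be displayed as a normal path.
    return { kind: "inline", path: raw.slice(WRITE_FILE_PREFIX.length) };
  }
  return { kind: "remote", path: raw };
}
```

A caller would render `inline` artifacts immediately from streamed tool arguments and route `remote` ones through the cached fetch path.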
### 1.6 Provider Hierarchy
```
ArtifactsProvider → provides the useArtifacts() context
ChatBox → ResizablePanelGroup
Panel(chat) → MessageList → ToolCall opens the artifact panel automatically
Panel(artifacts) → ArtifactFileDetail → useArtifactContent() hook
```
---
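## 2. Hermes Agent Artifact Mechanism
> **Conclusion: Hermes Agent has no artifact panel, no web frontend, and no split layout.** It is a terminal CLI tool, and all output is rendered inline in the terminal. Its handling of large outputs is nonetheless worth borrowing.
### 2.1 Project Positioning
Hermes Agent is a **Python CLI/TUI agent** (similar to Claude Code) that runs through a prompt_toolkit TUI and also supports IM platform gateways such as Telegram/Discord/Slack/WhatsApp.
**No React/Next.js/web UI.** It exposes an OpenAI-compatible API so that third-party UIs such as Open WebUI/LobeChat can connect.
### 2.2 Large Output Handling (3 Layers of Defense)
This is the only mechanism that comes close to "artifact management", and it is worth borrowing.
**File: `tools/tool_result_storage.py`**
| Layer | Mechanism | Description |
|------|------|------|
| Layer 1 | Per-tool truncation | Each tool limits the length of its own output |
| Layer 2 | `maybe_persist_tool_result` | A single result over the threshold → written to a temporary file in the sandbox and replaced in context by a `<persisted-output>` preview block |
| Layer 3 | `enforce_turn_budget` | A whole turn over 200K characters → the largest results spill to disk |
Core logic:
```python
# Over the threshold: write the full content to a file and replace it in context with a preview
remote_path = f"{storage_dir}/{tool_use_id}.txt"
_write_to_sandbox(content, remote_path, env)
return _build_persisted_message(preview, has_more, len(content), remote_path)
# A later agent step can read the full content with read_file + offset/limit
```
### 2.3 Budget Configuration
**File: `tools/budget_config.py`**
| Parameter | Default |
|------|--------|
| `DEFAULT_RESULT_SIZE_CHARS` | 100,000 (per-tool threshold) |
| `DEFAULT_TURN_BUDGET_CHARS` | 200,000 (per-turn cap) |
| `DEFAULT_PREVIEW_SIZE_CHARS` | 1,500 (inline preview length) |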
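
To show how the Layer 2 threshold and the preview size interact, here is a rough TypeScript sketch of the "persist and replace with a preview" step. It is an assumption-laden illustration rather than Hermes's implementation (which is Python): `maybePersistToolResult`, the in-memory `store` standing in for the sandbox file system, the `tool-results/` path, and the attributes on the `<persisted-output>` block are all made up for the example. Only the two default values come from the table above.

```typescript
interface BudgetConfig {
  resultSizeChars: number; // per-tool threshold
  previewSizeChars: number; // inline preview length
}

// Defaults mirror the table above.
const DEFAULT_BUDGET: BudgetConfig = { resultSizeChars: 100_000, previewSizeChars: 1_500 };

interface PersistedResult {
  inlined: string; // what goes into the model context
  persistedPath: string | null; // where the full content was written, if anywhere
}

/** Keep small results inline; spill large ones to storage and inline only a preview. */
function maybePersistToolResult(
  toolUseId: string,
  content: string,
  store: Map<string, string>, // hypothetical stand-in for the sandbox file system
  budget: BudgetConfig = DEFAULT_BUDGET,
): PersistedResult {
  if (content.length <= budget.resultSizeChars) {
    return { inlined: content, persistedPath: null };
  }
  const path = `tool-results/${toolUseId}.txt`; // illustrative location
  store.set(path, content);
  const preview = content.slice(0, budget.previewSizeChars);
  // The attribute names below are an assumed format, not the real one.
  const inlined =
    `<persisted-output path="${path}" total_chars="${content.length}">\n` +
    `${preview}\n</persisted-output>`;
  return { inlined, persistedPath: path };
}
```

Making the thresholds a parameter rather than constants is what allows different scenarios to use different budgets.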
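### 2.4 CLI Rendering
**File: `agent/display.py`**
- **Tool progress**: KawaiiSpinner animation + one-line summary
- **File edits**: inline colored unified diff (write_file / patch tools)
- **Final response**: wrapped in a Rich Panel border with a switchable theme color (7 skins)
### 2.5 Session Persistence
**File: `hermes_state.py`**
SQLite (`~/.hermes/state.db`) + FTS5 full-text search:
- sessions table: metadata, model config, token counts, cost, title
- messages table: role, content, tool_call_id, reasoning, timestamp
### 2.6 What Is Worth Borrowing
| Point | Value |
|----|----------|
| Spilling large output to disk + inline preview | Addresses context window overflow |
| 3 layers of progressive defense | Relevant to the ZCLAW middleware chain |
| Configurable budgets | Tunable thresholds, different strategies per scenario |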
---
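## 3. Comparison: ZCLAW Today vs. the Reference Implementations
### 3.1 Current Gaps
| Dimension | DeerFlow | ZCLAW today | Gap |
|------|----------|------------|------|
| Data source | 3 tools (present_files/write_file/str_replace) register artifacts actively | Only streamStore parsing filePath from tool output | Very narrow; almost never triggers |
| Persistence | Disk files + LangGraph checkpoint | In-memory Zustand only | Lost on refresh |
| Rendering: code | Read-only CodeMirror + syntax highlighting (140+ languages) | Plain `<pre>` tag, no highlighting | No highlighting |
| Rendering: Markdown | Streamdown (GFM+Math+KaTeX+RawHTML) | Hand-written 30-line regex renderer | Headings/bold/lists only |
| Rendering: HTML | Sandboxed iframe | Not supported | None |
| Rendering: images | `<img>` + iframe | Type declared but not implemented | None |
| Rendering: tables | GFM tables | Plain-text `<pre>` | None |
| Panel layout | react-resizable-panels 60/40 | react-resizable-panels 65/35 | Already present, reusable |
| Auto-open | Triggered by write_file/str_replace | Opens on addArtifact | Already present |
| File selection | Dropdown without leaving the detail view | Must return to the list to select | Poor experience |
| Inline in chat | present_files → card grid | None | Missing |
| Cache | React Query 5min | None | Missing |
| Dual path | Real path + write-file: virtual path | Runtime memory only | Missing |
| Right-panel overlap | Single panel | ArtifactPanel and the RightPanel "Files" tab have overlapping responsibilities | Architectural issue |
### 3.2 Summary of Core Gaps
**In priority order:**
1. **P0 Broken data source**: artifacts have almost no source, which is the most fundamental problem
2. **P0 No persistence**: artifacts are lost on refresh
3. **P1 Incomplete Markdown rendering**: a 30-line regex vs. a full GFM renderer
4. **P1 No syntax highlighting for code**: plain `<pre>` vs. CodeMirror/Shiki
5. **P2 Overlapping panels**: ArtifactPanel vs. the RightPanel "Files" tab
6. **P2 No in-detail file switching**: switching files requires returning to the list
7. **P3 No inline artifact cards in chat**
8. **P3 No HTML/image/table rendering**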
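### 3.3 Recommended Options
#### Option A: Minimum Viable (Fill Gaps in the Existing Architecture)
Patch the existing ArtifactPanel + artifactStore:
- **Data source**: extend tool output parsing in streamStore to cover more tool types
- **Persistence**: add IndexedDB writes to artifactStore (reusing the messageStore pattern)
- **Markdown**: introduce `react-markdown` + `remark-gfm` to replace the hand-written renderer
- **Code highlighting**: introduce `shiki` or `highlight.js`
- **Merge panels**: fold the RightPanel "Files" tab into ArtifactPanel and delete the RightPanel files tab
**Effort**: ~2-3 days
#### Option B: Refactor Following DeerFlow (Recommended)
Borrow the DeerFlow architecture, adapted to ZCLAW's local Tauri architecture:
| DeerFlow component | ZCLAW adaptation |
|---------------|------------|
| FastAPI artifact routes | Tauri commands `artifact_list` / `artifact_read` / `artifact_serve` |
| On-disk outputs/ directory | `{workspace}/artifacts/{session_key}/` |
| LangGraph checkpoint | SQLite (existing zclaw-memory) |
| React Query cache | TanStack Query or Zustand + stale cache |
| Read-only CodeMirror | Introduce @uiw/react-codemirror |
| Streamdown MD | react-markdown + remark-gfm + rehype-katex |
| iframe HTML preview | Tauri webview window (security isolation) |
**Core change list:**
1. **Rust side** (zclaw-kernel)
   - Add `artifact_create` / `artifact_list` / `artifact_read` Tauri commands
   - Write artifacts to `{workspace}/artifacts/{session_key}/`
   - Register artifacts when a ToolEnd event fires in the middleware chain
2. **Frontend store**
   - Add IndexedDB persistence to artifactStore
   - Decouple artifact creation from streamStore into a standalone hook
3. **Frontend components**
   - Replace MarkdownPreview with react-markdown + GFM
   - Introduce CodeMirror/shiki code highlighting
   - Add a file-switching dropdown to the detail view
   - Merge or remove the RightPanel "Files" tab
**Effort**: ~5-7 days
#### Option C: Borrow the Hermes Defense Mechanism (Add-on)
Whether A or B is chosen, the Hermes large-output defense can be layered on top:
- Add overflow detection to the ToolOutputGuard layer of the middleware chain
- Persist over-threshold artifacts to disk automatically and replace them in context with a `<persisted-output>` preview
- The agent can read the full content back through read_file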
---
## 4. Key Dependency Reference
| Library | Purpose | Used by DeerFlow | Recommended |
|----|------|--------------|------|
| react-markdown | Markdown rendering | ✅ (Streamdown) | ✅ |
| remark-gfm | GFM tables/strikethrough/task lists | ✅ | ✅ |
| rehype-katex | Math formula rendering | ✅ | As needed |
| @uiw/react-codemirror | Code editor/highlighting | ✅ | ✅ |
| shiki | Static code highlighting | ✅ (code blocks in chat) | ✅ |
| react-resizable-panels | Split layout | ✅ | Already present |
| @tanstack/react-query | Data caching | ✅ | Optional |
---
## 5. File Index
| Reference project | Key path |
|----------|----------|
| DeerFlow frontend | `G:/deerflow/frontend/src/components/workspace/artifacts/` |
| DeerFlow frontend utilities | `G:/deerflow/frontend/src/core/artifacts/` |
| DeerFlow layout | `G:/deerflow/frontend/src/components/workspace/chats/chat-box.tsx` |
| DeerFlow code editor | `G:/deerflow/frontend/src/components/workspace/code-editor.tsx` |
| DeerFlow backend routes | `G:/deerflow/backend/app/gateway/routers/artifacts.py` |
| DeerFlow backend tools | `G:/deerflow/backend/packages/harness/deerflow/tools/builtins/present_file_tool.py` |
| Hermes output management | `G:/hermes-agent-main/tools/tool_result_storage.py` |
| Hermes budget config | `G:/hermes-agent-main/tools/budget_config.py` |


@@ -0,0 +1,212 @@
# DeerFlow Tool Calling System Reference
> A survey of DeerFlow's complete tool calling flow, as a reference for troubleshooting ZCLAW tool calling issues.
> Analysis date: 2026-04-24
---
## 1. End-to-End Data Flow
```
User message
→ FastAPI Gateway (/api/threads/{id}/runs/stream)
→ services.start_run() → asyncio.create_task(run_agent(...))
→ LangGraph Agent Graph (create_agent)
→ LLM Model (ChatOpenAI / Claude)
→ AIMessage (with a tool_calls list)
→ processed by the 14-layer middleware chain
→ ToolNode (LangGraph built-in, routes by tool_call.name)
→ ToolMessage (execution result)
→ LLM called again (continuing with the ToolMessage)
→ StreamBridge.publish() → asyncio.Queue
→ SSE → frontend useStream hook
→ React component rendering
```
## 2. Tool Registration and Execution
### 2.1 Registration Entry Point
**File**: `G:/deerflow/backend/packages/harness/deerflow/tools/tools.py` (`get_available_tools()`)
Tools come from four sources:
| Source | Loading method | Example |
|------|----------|------|
| Config tools | YAML config + reflective import (`module:variable`) | `deerflow.sandbox.tools:bash_tool` |
| Builtin tools | Hard-coded imports | `present_file_tool`, `ask_clarification_tool` |
| MCP tools | `MultiServerMCPClient` fetches from the MCP server cache | Third-party MCP tools |
| ACP tools | Built dynamically by `build_invoke_acp_agent_tool()` | External agent invocation |
### 2.2 Sandbox Tool List
**File**: `G:/deerflow/backend/packages/harness/deerflow/sandbox/tools.py`
| Tool name | Function |
|--------|------|
| `bash` | Run a command in the sandbox |
| `ls` | List a directory |
| `read_file` | Read a file |
| `write_file` | Write a file (opens the artifact panel automatically) |
| `str_replace` | String replacement (opens the artifact panel automatically) |
### 2.3 Builtin Tools
**File**: `G:/deerflow/backend/packages/harness/deerflow/tools/builtins/`
| Tool | Function |
|------|------|
| `ask_clarification` | Ask the user a clarifying question (interrupts execution and waits for a reply) |
| `present_file` | Present files to the user (triggers artifact cards) |
| `setup_agent` | Custom agent creation |
| `task_tool` | Delegate a task to a sub-agent |
| `view_image` | View an image (vision models only) |
| `tool_search` | Deferred tool search (MCP tools exposed on demand) |
## 3. Middleware Chain (14 Layers)
**File**: `G:/deerflow/backend/packages/harness/deerflow/agents/lead_agent/agent.py` (`_build_middlewares()`)
The key middlewares related to tool calling:
### 3.1 DanglingToolCallMiddleware
**File**: `dangling_tool_call_middleware.py`
In `wrap_model_call`, it detects AIMessages in the message history whose tool calls have no matching ToolMessage and injects a placeholder ToolMessage:
```python
ToolMessage(
content="[Tool call was interrupted and did not return a result.]",
tool_call_id=tc_id,
name=tc.get("name", "unknown"),
status="error",
)
```
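
The surrounding scan is not shown in the excerpt, so the following TypeScript sketch is one plausible way to find and patch dangling tool calls. The message shapes and the `patchDanglingToolCalls` name are assumptions made for the example (LangChain's real message classes differ); only the placeholder text and the `error` status are taken from the snippet above.

```typescript
interface ToolCall {
  id: string;
  name: string;
}

// Simplified, hypothetical message shapes.
type Msg =
  | { role: "user"; content: string }
  | { role: "assistant"; content: string; toolCalls: ToolCall[] }
  | { role: "tool"; toolCallId: string; name: string; content: string; status?: "error" };

const PLACEHOLDER = "[Tool call was interrupted and did not return a result.]";

/** Return a copy of the history in which every unanswered tool call gets a placeholder result. */
function patchDanglingToolCalls(history: Msg[]): Msg[] {
  // First pass: collect the ids of tool calls that already have a result.
  const answered = new Set<string>();
  for (const m of history) {
    if (m.role === "tool") answered.add(m.toolCallId);
  }
  // Second pass: copy messages, adding placeholders right after the assistant turn.
  const patched: Msg[] = [];
  for (const m of history) {
    patched.push(m);
    if (m.role !== "assistant") continue;
    for (const tc of m.toolCalls) {
      if (!answered.has(tc.id)) {
        patched.push({ role: "tool", toolCallId: tc.id, name: tc.name, content: PLACEHOLDER, status: "error" });
      }
    }
  }
  return patched;
}
```

Because already-answered ids are skipped, running the function twice on the same history does not add a second placeholder.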
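### 3.2 ToolErrorHandlingMiddleware
**File**: `tool_error_handling_middleware.py`
In `wrap_tool_call`, it catches tool execution exceptions and converts them into an error ToolMessage instead of letting the whole run crash.
### 3.3 LoopDetectionMiddleware
**File**: `loop_detection_middleware.py`
In `after_model`, it detects repeated tool calls:
- At 3 repetitions → injects a warning HumanMessage
- At 5 repetitions → clears tool_calls outright, forcing the LLM to produce a text answer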
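
A two-threshold detector of this kind might look like the sketch below. How DeerFlow decides that two calls are "the same" is not stated in these notes; the example assumes identical tool name plus identical JSON-serialized arguments, and `LoopDetector`, `LoopAction`, and `observe` are hypothetical names.

```typescript
interface ToolCall {
  name: string;
  args: unknown;
}

// "warn" maps to injecting a warning message, "strip" to clearing tool_calls.
type LoopAction = "ok" | "warn" | "strip";

class LoopDetector {
  private readonly counts = new Map<string, number>();
  private readonly warnAt: number;
  private readonly stopAt: number;

  constructor(warnAt = 3, stopAt = 5) {
    this.warnAt = warnAt;
    this.stopAt = stopAt;
  }

  /** Record one tool call and report which action the middleware should take. */
  observe(call: ToolCall): LoopAction {
    // Assumption: a repeat means same name and same serialized arguments.
    const key = `${call.name}:${JSON.stringify(call.args)}`;
    const seen = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, seen);
    if (seen >= this.stopAt) return "strip";
    if (seen >= this.warnAt) return "warn";
    return "ok";
  }
}
```

Escalating from a warning to stripping the calls gives the model a chance to correct itself before it is forced to answer in text.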
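### 3.4 DeferredToolFilterMiddleware
**File**: `deferred_tool_filter_middleware.py`
In `wrap_model_call`, it filters out the schemas of deferred MCP tools, exposing them only after the LLM discovers them through `tool_search`.
### 3.5 ClarificationMiddleware
Intercepts `ask_clarification` tool calls, interrupting execution to wait for the user's reply.
### 3.6 SubagentLimitMiddleware
Truncates excessive parallel sub-agent calls.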
## 4. Returning Tool Results
### 4.1 Format
A LangChain `ToolMessage` containing:
- `content`: the result text
- `tool_call_id`: matches the tool_call ID in the AIMessage
- `name`: the tool name
- `status`: `"error"` or omitted
### 4.2 Special Tools
`present_file_tool` returns a `Command` rather than a plain string, updating both the `artifacts` and `messages` state channels.
## 5. Frontend Tool Call Display
### 5.1 Message Grouping
**File**: `G:/deerflow/frontend/src/core/messages/utils.ts` (`groupMessages()`)
| Group type | Trigger | Display |
|----------|----------|------|
| `assistant:processing` | AI message contains tool_calls or reasoning | MessageGroup (collapsed) |
| `assistant` | AI message has text and no tool_calls | MessageListItem (bubble) |
| `assistant:present-files` | Contains a present_files tool call | ArtifactFileList |
| `assistant:clarification` | ask_clarification result | MarkdownContent |
| `assistant:subagent` | Contains a task tool call | SubtaskCard |
### 5.2 Tool Status Inference
The frontend has **no explicit state machine**. Status is inferred from the message sequence:
- An AI message contains tool_calls but has no matching ToolMessage → running
- A ToolMessage appears → finished
- The `assistant:processing` group is wrapped in the collapsible `ChainOfThought` component
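
Inferring status from the sequence alone can be illustrated with a single pass over the messages. The shapes below are simplified stand-ins for the real LangChain message types, and `inferToolStatuses` is a hypothetical helper, not a function from the DeerFlow codebase.

```typescript
// Simplified, hypothetical message shapes.
interface AiMessage {
  type: "ai";
  toolCalls: { id: string; name: string }[];
}
interface ToolResultMessage {
  type: "tool";
  toolCallId: string;
  status?: "error";
}
type Message = AiMessage | ToolResultMessage;

type ToolStatus = "running" | "done" | "error";

/** Derive each tool call's status purely from the order of messages. */
function inferToolStatuses(messages: Message[]): Map<string, ToolStatus> {
  const statuses = new Map<string, ToolStatus>();
  for (const m of messages) {
    if (m.type === "ai") {
      // A tool call with no result yet is considered running.
      for (const tc of m.toolCalls) statuses.set(tc.id, "running");
    } else {
      // A matching ToolMessage ends the call, successfully or not.
      statuses.set(m.toolCallId, m.status === "error" ? "error" : "done");
    }
  }
  return statuses;
}
```

The trade-off against an explicit state machine is that a lost ToolMessage leaves a call "running" forever, which is the case the dangling-call middleware exists to repair.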
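### 5.3 Tool Call UI
**File**: `message-group.tsx`, lines 186-423
Icons and content are rendered per tool name:
- `bash` → terminal icon + command code block
- `read_file`/`write_file`/`str_replace` → file icon + path link (click to open the artifact panel)
- `web_search` → search icon + result links
- Default → wrench icon + tool name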
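## 6. Tool Calls During Streaming
### 6.1 Architecture
```
agent.astream(stream_mode=["values"])
→ StreamBridge (asyncio.Queue per run, maxsize=256)
→ sse_consumer() → SSE frames → frontend
```
### 6.2 Key Characteristics
- Tool calls **do not interrupt** the stream. LangGraph routes between agent_node and tool_node automatically
- Every state change emits a complete `values` snapshot; the frontend deduplicates via `seen_ids`
- A 15-second heartbeat keeps the SSE connection alive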
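
Because each `values` event carries the full message list, a consumer has to pick out only what it has not handled yet. The sketch below shows one way to do that with a set of seen ids; `SnapshotDeduper` and its message shape are illustrative assumptions, and the real `seen_ids` bookkeeping may live elsewhere or work differently.

```typescript
interface SnapshotMessage {
  id: string;
  content: string;
}

class SnapshotDeduper {
  private readonly seenIds = new Set<string>();

  /** Given a full snapshot, return only the messages that were not seen before. */
  takeNew(snapshot: SnapshotMessage[]): SnapshotMessage[] {
    const fresh: SnapshotMessage[] = [];
    for (const message of snapshot) {
      if (this.seenIds.has(message.id)) continue;
      this.seenIds.add(message.id);
      fresh.push(message);
    }
    return fresh;
  }
}
```

Full snapshots cost more bandwidth than deltas, but they make the stream self-healing: a client that misses one event recovers on the next.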
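### 6.3 Event Sequence Seen by the Frontend
1. `values` event: AIMessage containing `tool_calls`
2. `values` event: ToolMessage (tool result)
3. `values` event: the LLM's final answer based on the tool result
The whole process is continuous and never interrupts the SSE connection.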
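## 7. Comparison with ZCLAW (Tool Calling)
| Dimension | DeerFlow | ZCLAW |
|------|----------|-------|
| Framework | LangGraph (graph-based) | In-house loop_runner (loop-based) |
| Tool lifecycle | Managed automatically by LangGraph ToolNode | Manual ToolRegistry + loop_runner |
| after_tool_call middleware | ✅ Complete wrap_tool_call hook | ❌ Not invoked in either streaming or non-streaming mode |
| Parallel tool execution | Handled automatically by LangGraph | Non-streaming uses JoinSet; streaming is fully serial |
| Dangling call repair | DanglingToolCallMiddleware | DanglingToolMiddleware (present) |
| Error recovery | ToolErrorHandlingMiddleware (exception → ToolMessage) | ToolErrorMiddleware (counter) |
| Loop detection | LoopDetectionMiddleware (warn at 3 / hard stop at 5) | LoopGuardMiddleware (present) |
| Frontend state | Inferred from the message sequence | Explicit ToolCallStep state machine |
| MCP tools | Deferred registration + on-demand exposure via tool_search | All registered up front |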
## 8. Key File Index
| Function | DeerFlow file |
|------|-------------|
| Agent factory | `backend/packages/harness/deerflow/agents/lead_agent/agent.py` |
| Middleware assembly | `backend/packages/harness/deerflow/agents/factory.py` |
| Tool registration | `backend/packages/harness/deerflow/tools/tools.py` |
| Sandbox tools | `backend/packages/harness/deerflow/sandbox/tools.py` |
| Builtin tools | `backend/packages/harness/deerflow/tools/builtins/` |
| Error handling middleware | `agents/middlewares/tool_error_handling_middleware.py` |
| Dangling call repair middleware | `agents/middlewares/dangling_tool_call_middleware.py` |
| Loop detection middleware | `agents/middlewares/loop_detection_middleware.py` |
| Deferred filter middleware | `agents/middlewares/deferred_tool_filter_middleware.py` |
| Streaming bridge | `runtime/stream_bridge/memory.py` |
| Frontend message grouping | `frontend/src/core/messages/utils.ts` |
| Frontend tool call component | `frontend/src/components/workspace/messages/message-group.tsx` |


@@ -0,0 +1,141 @@
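# ZCLAW Tool Calling Issue Analysis
> Troubleshooting ZCLAW tool calling issues by comparison with the DeerFlow tool calling system.
> Analysis date: 2026-04-24
> Updated: 2026-04-24 (P0 + P0-stream_errored fixed)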
---
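## 1. Issues Found
### P0: `after_tool_call` middleware is never invoked (✅ fixed 2026-04-24)
**File**: `crates/zclaw-runtime/src/loop_runner.rs`
In both `run()` (non-streaming, lines 400-558) and `run_streaming` (streaming, lines 893-1070), the code pushes `Message::tool_result` onto the message history right after a tool executes, **without calling `middleware_chain.run_after_tool_call()`**.
**Impact**:
- The error counting and recovery message logic in `ToolErrorMiddleware.after_tool_call` has no effect
- The sensitive data detection in `ToolOutputGuardMiddleware.after_tool_call` has no effect
- Tool errors are conveyed only through the tool's own error return, so the middleware-level protection exists in name only
**DeerFlow comparison**: `ToolErrorHandlingMiddleware` wraps every tool execution completely through the `wrap_tool_call` hook.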
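### P0: `stream_errored` skips all tool execution (✅ fixed 2026-04-24)
**File**: `crates/zclaw-runtime/src/loop_runner.rs`, lines 872-876
In streaming mode, any error in the LLM stream (network timeout, API error, driver error) sets `stream_errored = true`, and `break 'outer` then exits the loop immediately, **skipping every tool call that has already been parsed**.
**Impact**:
- The ToolStart event has already been sent to the frontend (the user sees a "running" button), but the tool never actually runs
- The ToolEnd event is never sent → the frontend tool status is stuck at "running"
- Tool calls that were fully received (ToolUseEnd) are discarded as well
**Fix**: distinguish complete tools (ToolUseEnd received) from incomplete tools (only ToolUseStart/Delta received). Complete tools run as usual; incomplete tools get a cancelling ToolEnd event.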
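
The split between complete and incomplete tool calls can be sketched as follows. The actual fix lives in Rust inside `loop_runner.rs`; this TypeScript version only illustrates the decision, and `PendingTool`, `sawUseEnd`, and `planAfterStreamError` are hypothetical names.

```typescript
interface PendingTool {
  id: string;
  name: string;
  sawUseEnd: boolean; // true once the ToolUseEnd event arrived for this call
}

interface RecoveryPlan {
  execute: PendingTool[]; // fully received: run as usual
  cancel: PendingTool[]; // partially received: emit a cancelling ToolEnd
}

/** After a stream error, decide which pending tool calls can still run. */
function planAfterStreamError(pending: PendingTool[]): RecoveryPlan {
  return {
    execute: pending.filter((tool) => tool.sawUseEnd),
    cancel: pending.filter((tool) => !tool.sawUseEnd),
  };
}
```

Every pending call ends up in exactly one of the two lists, so each ToolStart the frontend saw is eventually paired with a ToolEnd.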
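### P1: Streaming mode runs all tools serially (✅ fixed 2026-04-24)
**File**: `loop_runner.rs`, streaming-mode tool execution section
Non-streaming mode runs ReadOnly tools in parallel with `JoinSet` + `Semaphore(3)`, but streaming mode ran every tool serially in a plain `for` loop.
**Fix**: streaming mode now uses three-phase execution: Phase 1 middleware pre-check (serial) → Phase 2 partitioned parallel + serial execution → Phase 3 after_tool_call + ordered result push.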
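
The partitioning in Phase 2 and the reordering in Phase 3 can be illustrated without any real concurrency. This is a sketch under assumptions: the ReadOnly/Exclusive/Interactive labels come from the commit message, the real code is Rust using `JoinSet` and `Semaphore(3)`, and `partitionCalls` and `restoreOrder` are made-up helpers.

```typescript
type ConcurrencyClass = "ReadOnly" | "Exclusive" | "Interactive";

interface PlannedCall {
  index: number; // position in the model's original tool_calls list
  name: string;
  concurrency: ConcurrencyClass;
}

/** Phase 2 input: ReadOnly calls may run in parallel, everything else runs serially. */
function partitionCalls(calls: PlannedCall[]): { parallel: PlannedCall[]; serial: PlannedCall[] } {
  const parallel: PlannedCall[] = [];
  const serial: PlannedCall[] = [];
  for (const call of calls) {
    (call.concurrency === "ReadOnly" ? parallel : serial).push(call);
  }
  return { parallel, serial };
}

/** Phase 3: results finish in arbitrary order, so sort them back into call order. */
function restoreOrder<T extends { index: number }>(results: T[]): T[] {
  return [...results].sort((a, b) => a.index - b.index);
}
```

Sorting by the original index keeps the tool results aligned with the order in which the model issued the calls, regardless of which ones finished first.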
### P2: OpenAI driver silently replaces tool arguments (✅ fixed 2026-04-24)
**File**: `crates/zclaw-runtime/src/driver/openai.rs`, lines 222-228
```rust
let parsed_args = if args.is_empty() {
serde_json::json!({})
} else {
serde_json::from_str(args).unwrap_or_else(|e| {
tracing::warn!("Failed to parse tool args '{}': {}", args, e);
serde_json::json!({})
})
};
```
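When JSON parsing fails, the arguments are silently replaced with `{}`. Combined with the empty-argument handling in loop_runner.rs (lines 412-423), this injects `_fallback_query` in place of the real arguments.
**Fix**: on parse failure, return the `_parse_error` and `_raw_args` fields so that the tool and the LLM can detect the argument problem and correct it themselves.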
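
The shape of that fix can be shown in a few lines. The real driver is Rust; this TypeScript sketch only mirrors the idea, and `parseToolArgs` is a hypothetical name. Rejecting non-object JSON and trimming whitespace before the emptiness check are extra assumptions of the example, not something the notes describe.

```typescript
type ToolArgs = Record<string, unknown>;

/** Parse raw tool arguments, surfacing failures instead of hiding them behind {}. */
function parseToolArgs(raw: string): ToolArgs {
  // Empty arguments are legitimate for tools that take no parameters.
  if (raw.trim() === "") return {};
  try {
    const value: unknown = JSON.parse(raw);
    if (typeof value === "object" && value !== null && !Array.isArray(value)) {
      return value as ToolArgs;
    }
    // Assumption: arguments that are valid JSON but not an object are also reported.
    return { _parse_error: "tool arguments must be a JSON object", _raw_args: raw };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { _parse_error: reason, _raw_args: raw };
  }
}
```

Returning the raw text alongside the error lets the model see exactly what it produced, which is usually enough for it to retry with valid JSON.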
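### P2: ToolOutputGuard is too aggressive (✅ fixed 2026-04-24)
**File**: `crates/zclaw-runtime/src/middleware/tool_output_guard.rs`, line 109
It matched sensitive patterns after `to_lowercase()`, so legitimate content containing strings such as "password" or "system:" was blocked by mistake.
**Fix**: switch to `regex` matching on the actual format of secret values (such as `sk-[a-zA-Z0-9]{20,}`, `AKIA[A-Z0-9]{16}`, and `key=value` patterns), so legitimate content that merely contains a keyword is no longer blocked. Overly broad injection detection patterns such as "system:" were removed.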
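
The difference between keyword matching and value matching is easy to demonstrate. The sketch below uses the two patterns quoted above plus a PEM private key header pattern, which the commit message mentions but whose exact regex is an assumption here; `SECRET_PATTERNS` and `findSecrets` are illustrative names, and the real guard is implemented in Rust.

```typescript
interface SecretPattern {
  label: string;
  re: RegExp;
}

// No "g" flag: RegExp.test stays stateless across calls.
const SECRET_PATTERNS: SecretPattern[] = [
  { label: "sk-key", re: /sk-[a-zA-Z0-9]{20,}/ },
  { label: "aws-access-key-id", re: /AKIA[A-Z0-9]{16}/ },
  // Assumed pattern for PEM private key headers.
  { label: "pem-private-key", re: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
];

/** Return the labels of every secret format found in a tool's output. */
function findSecrets(output: string): string[] {
  return SECRET_PATTERNS.filter((pattern) => pattern.re.test(output)).map((pattern) => pattern.label);
}
```

Text that merely talks about passwords or keys passes through, while output containing something shaped like a real credential is flagged.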
### P2: ToolErrorMiddleware failure counter is global (✅ fixed 2026-04-24)
**File**: `crates/zclaw-runtime/src/middleware/tool_error.rs`, line 27
`consecutive_failures: AtomicU32` is a struct field shared by all sessions. Under high concurrency, 2 failures in session A plus 1 failure in session B are enough to trigger AbortLoop (threshold 3).
**Fix**: use `Mutex<HashMap<String, u32>>` keyed by session_id to store the counts, so each session is tracked independently.
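
The effect of keying the counter by session can be shown with a plain map. This is a TypeScript illustration of the idea only: the Rust fix uses `Mutex<HashMap<String, u32>>`, and `SessionFailureCounter` with its methods is a hypothetical shape. Resetting the count on success is assumed from the field name `consecutive_failures`.

```typescript
class SessionFailureCounter {
  private readonly failures = new Map<string, number>();
  private readonly threshold: number;

  constructor(threshold = 3) {
    this.threshold = threshold;
  }

  /** Record a failure; returns true when this session should abort its loop. */
  recordFailure(sessionId: string): boolean {
    const count = (this.failures.get(sessionId) ?? 0) + 1;
    this.failures.set(sessionId, count);
    return count >= this.threshold;
  }

  /** A success breaks the streak of consecutive failures for that session only. */
  recordSuccess(sessionId: string): void {
    this.failures.delete(sessionId);
  }
}
```

With per-session keys, the scenario described above (two failures in A plus one in B) no longer reaches the threshold for either session.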
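### P3: Inconsistent onTool callback semantics in the Gateway client (✅ fixed 2026-04-24)
**File**: `desktop/src/lib/gateway-client.ts`, lines 698-707
The `tool_call` and `tool_result` cases share the `onTool` callback but follow different argument conventions, so callers have to tell start from end by checking whether `output` is empty.
**Fix**: the output for `tool_call` is now always `''` (fixing a case where data.output could be passed through), and clear comments document the start/end convention.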
---
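## 2. Root Cause Analysis
The most common failure modes for tool calls:
1. **The LLM returns malformed tool_call arguments** → the OpenAI driver silently replaces them with `{}` → the tool runs with empty arguments → the result is not what was expected
2. **A tool throws during execution** → the after_tool_call middleware is not invoked → the error is not formatted → the LLM receives the raw error and cannot recover
3. **The stream reconnects after an interruption** → DanglingToolMiddleware repairs the dangling call → but if the repair logic itself has a bug (such as patching twice), the message history bloats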
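## 3. Fix Recommendations
### Fix 1: Call after_tool_call in loop_runner
**Priority**: P0
**Affected file**: `loop_runner.rs`
In the non-streaming tool execution loop (around line 530), call the following after the tool executes: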
```rust
let after_result = middleware_chain.run_after_tool_call(
&name, &input_json, &output_str, &mut ctx
).await;
```
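Make the same call after tool execution in streaming mode (around line 1020).
### Fix 2: Make the ToolErrorMiddleware counter per-session
**Priority**: P2
**Affected file**: `middleware/tool_error.rs`
Use a `HashMap<String, u32>` keyed by session_id to store the counts.
### Fix 3: Switch ToolOutputGuard to exact matching
**Priority**: P2
**Affected file**: `middleware/tool_output_guard.rs`
Trigger only when a standalone secret value is detected (such as `sk-[48 characters]`), not on word-level matches.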
---
## 4. Key Files
| File | Role |
|------|------|
| `crates/zclaw-runtime/src/loop_runner.rs` | Main loop, tool scheduling |
| `crates/zclaw-runtime/src/tool.rs` | ToolRegistry + Tool trait |
| `crates/zclaw-runtime/src/middleware/tool_error.rs` | Tool error handling |
| `crates/zclaw-runtime/src/middleware/tool_output_guard.rs` | Output safety checks |
| `crates/zclaw-runtime/src/middleware/dangling_tool.rs` | Dangling tool call repair |
| `crates/zclaw-runtime/src/driver/openai.rs` | OpenAI-compatible driver |
| `desktop/src/lib/gateway-client.ts` | Frontend communication client |
| `desktop/src/store/chat/streamStore.ts` | Frontend stream handling |


@@ -1,6 +1,6 @@
--- ---
title: 聊天系统 title: 聊天系统
updated: 2026-04-22 updated: 2026-04-23
status: active status: active
tags: [module, chat, stream] tags: [module, chat, stream]
--- ---
@@ -17,6 +17,7 @@ tags: [module, chat, stream]
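| 5-store split | The original 908-line ChatStore → stream/conversation/message/chat/artifact, each with a single responsibility | | 5-store split | The original 908-line ChatStore → stream/conversation/message/chat/artifact, each with a single responsibility |
| 5-minute timeout guard | Prevents hung streams: kernel-chat.ts:76, automatic cancelStream on timeout | | 5-minute timeout guard | Prevents hung streams: kernel-chat.ts:76, automatic cancelStream on timeout |
| Unified callback interface | 3 implementations share `{ onDelta, onThinkingDelta, onTool, onHand, onComplete, onError }` | | Unified callback interface | 3 implementations share `{ onDelta, onThinkingDelta, onTool, onHand, onComplete, onError }` |
| LLM dynamic suggestions | Replaces hard-coded keyword matching with LLM-generated personalized suggestions (1 deep-dive follow-up + 1 practical action + 1 butler-style care); 4-way parallel prefetch of intelligence context |
### ChatStream Implementations ### ChatStream Implementations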
@@ -33,11 +34,14 @@ tags: [module, chat, stream]
| File | Responsibility | | File | Responsibility |
|------|------| |------|------|
| `desktop/src/store/chat/streamStore.ts` | Streaming message orchestration, sending, cancellation | | `desktop/src/store/chat/streamStore.ts` | Streaming message orchestration, sending, cancellation, LLM dynamic suggestion generation |
| `desktop/src/store/chat/conversationStore.ts` | Conversation management, current model, sessionKey | | `desktop/src/store/chat/conversationStore.ts` | Conversation management, current model, sessionKey |
| `desktop/src/store/chat/messageStore.ts` | Message persistence (IndexedDB) | | `desktop/src/store/chat/messageStore.ts` | Message persistence (IndexedDB) |
| `desktop/src/lib/kernel-chat.ts` | KernelClient ChatStream (Tauri) | | `desktop/src/lib/kernel-chat.ts` | KernelClient ChatStream (Tauri) |
| `desktop/src/lib/suggestion-context.ts` | Smart context fetch over 4 parallel channels (user profile / pain points / experiences / skill matching) |
| `desktop/src/lib/cold-start-mapper.ts` | Cold-start configuration mapping (industry detection / naming / personality / skills) |
| `desktop/src/components/ChatArea.tsx` | Chat area UI | | `desktop/src/components/ChatArea.tsx` | Chat area UI |
| `desktop/src/components/ai/SuggestionChips.tsx` | Dynamic suggestion chip display |
| `crates/zclaw-runtime/src/loop_runner.rs` | Rust main chat loop + middleware chain | | `crates/zclaw-runtime/src/loop_runner.rs` | Rust main chat loop + middleware chain |
### Message sending flow ### Message sending flow
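@@ -100,6 +104,20 @@ UI selects model → conversationStore.currentModel = newModel
- cancelStream sets an atomic flag, so it does not race with the onDelta callback - cancelStream sets an atomic flag, so it does not race with the onDelta callback
- The 3 ChatStream implementations share one callback interface, so upper-layer code does not need to know which implementation is in use - The 3 ChatStream implementations share one callback interface, so upper-layer code does not need to know which implementation is in use
- Message persistence goes through messageStore → IndexedDB, decoupled from streaming rendering - Message persistence goes through messageStore → IndexedDB, decoupled from streaming rendering
- Dynamic suggestions prefetch 4 channels in parallel (userProfile/painPoints/experiences/skillMatch); on a 500 ms timeout a channel degrades to an empty string
- Suggestion generation is decoupled from memory extraction: suggestions start without waiting for the memory LLM call to finish
### LLM dynamic suggestions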
```
sendMessage → isStreaming=true + _activeSuggestionContextPrefetch = fetchSuggestionContext(...)
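→ prefetch runs in the background while the response streams
onComplete → createCompleteHandler
→ generateLLMSuggestions(prefetchedContext): starts immediately, without waiting for memory
→ prompt: 1 deeper follow-up + 1 practical action + 1 butler care
→ memory/reflection run independently in the background (Promise.all)
→ SuggestionChips renders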
```
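### Tauri commands ### Tauri commands
@@ -114,6 +132,8 @@ UI selects model → conversationStore.currentModel = newModel
| Issue | Status | Notes | | Issue | Status | Notes |
|------|------|------| |------|------|------|
| after_tool_call middleware never invoked | ✅ Fixed (04-24) | Call added in both streaming and non-streaming modes; ToolErrorMiddleware/ToolOutputGuard now take effect |
| stream_errored skipped all tools | ✅ Fixed (04-24) | Complete tools run as usual; incomplete tools get a cancel event |
| B-CHAT-07 mixed-domain truncation | P2 Open | Context may be truncated on cross-domain messages | | B-CHAT-07 mixed-domain truncation | P2 Open | Context may be truncated on cross-domain messages |
| SSE token count is 0 | ✅ Fixed | SseUsageCapture stream_done flag | | SSE token count is 0 | ✅ Fixed | SseUsageCapture stream_done flag |
| Tauri invoke parameter names | ✅ Fixed (f6c5dd2) | camelCase format | | Tauri invoke parameter names | ✅ Fixed (f6c5dd2) | camelCase format |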
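@@ -122,14 +142,15 @@ UI selects model → conversationStore.currentModel = newModel
**Notes:** **Notes:**
- Auxiliary LLM calls (memory summarization/extraction, butler routing) reuse the model+base_url from `kernel_init`, on the same path as chat - Auxiliary LLM calls (memory summarization/extraction, butler routing) reuse the model+base_url from `kernel_init`, on the same path as chat
- Classroom chat is a separate Tauri command (`classroom_chat`) and does not go through `agent_chat_stream` - Classroom chat is a separate Tauri command (`classroom_chat`) and does not go through `agent_chat_stream`
- The Agent tab has been removed: cross-session identity is now handled by soul.md and is no longer managed through RightPanel
## 5. Changelog ## 5. Changelog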
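| Date | Change | | Date | Change |
|------|------| |------|------|
| 04-24 | Tool-call P0 fixes: after_tool_call middleware wired in (streaming + non-streaming) + stream_errored tool salvage (complete tools run, incomplete tools are cancelled) |
| 04-24 | Artifact system improvements: shared MarkdownRenderer extracted + react-markdown rendering in ArtifactPanel + file picker dropdown + data source expansion (file_write/str_replace, two paths) + artifactStore IndexedDB persistence |
| 04-23 | Suggestion prefetch: context prefetch starts at sendMessage and is consumed as soon as the stream ends, without waiting for memory extraction |
| 04-23 | Suggestion prompt rewrite: 1 deeper follow-up + 1 practical action + 1 butler care; context window 6→20 messages |
| 04-23 | Identity signal: detectAgentNameSuggestion instant front-end detection + RightPanel listens for Tauri events to refresh the name | | 04-23 | Identity signal: detectAgentNameSuggestion instant front-end detection + RightPanel listens for Tauri events to refresh the name |
| 04-22 | Wiki rewrite: 5-section template, added integration contracts and invariants | | 04-23 | Agent tab removed: ~280 lines of dead code cleaned from RightPanel; identity handled by soul.md |
| 04-21 | Previous update |
| 04-17 | ChatStore split into 5 Stores (stream/conversation/message/chat/artifact) |
| 04-16 | Provider Key decryption fix (b69dc61) |
| 04-16 | Tauri invoke parameter name fix (f6c5dd2) |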


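@@ -133,6 +133,18 @@ skills/ -> SkillRegistry loads -> SkillIndexMiddleware@200 injects into the system prompt
- MCP qualified names `service_name.tool_name` avoid conflicts with built-in tools - MCP qualified names `service_name.tool_name` avoid conflicts with built-in tools
- Empty-shell Hands removed (04-17): Whiteboard/Slideshow/Speech, net reduction of ~5400 lines - Empty-shell Hands removed (04-17): Whiteboard/Slideshow/Speech, net reduction of ~5400 lines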
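### ⚡ New tools/skills must declare a concurrency level
The `concurrency()` method of the `Tool` trait determines the parallel execution strategy (04-24 Hermes Phase 2A):
| Level | Meaning | Typical use |
|------|------|---------|
| `ReadOnly` (default) | Read-only, always safe to run in parallel | file_read, web_search, calculator |
| `Exclusive` | Has side effects, must run serially | file_write, shell_exec, send_message, execute_skill, task |
| `Interactive` | Requires user interaction, never parallel | ask_clarification |
**When adding a tool**: override `concurrency()` in `impl Tool for YourTool`. The default is `ReadOnly`; a tool with write operations or other side effects must return `ToolConcurrency::Exclusive`. A wrong declaration leads to race conditions under parallel execution.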
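The declaration rule can be sketched as follows. This is a simplified stand-in, not the project's actual trait: the real `Tool` trait has more methods, and `FileRead`, `FileWrite`, and `partition` are hypothetical names used only to show the default and the override.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolConcurrency {
    ReadOnly,
    Exclusive,
    Interactive,
}

/// Simplified trait: only the parts relevant to concurrency.
pub trait Tool {
    fn name(&self) -> &str;
    /// Defaults to ReadOnly, so side-effecting tools must override it.
    fn concurrency(&self) -> ToolConcurrency {
        ToolConcurrency::ReadOnly
    }
}

pub struct FileRead;
impl Tool for FileRead {
    fn name(&self) -> &str {
        "file_read"
    }
}

pub struct FileWrite;
impl Tool for FileWrite {
    fn name(&self) -> &str {
        "file_write"
    }
    // Writes have side effects, so this tool must never run in parallel.
    fn concurrency(&self) -> ToolConcurrency {
        ToolConcurrency::Exclusive
    }
}

/// Splits call indices into (parallel, serial) groups, preserving call order.
pub fn partition(tools: &[&dyn Tool]) -> (Vec<usize>, Vec<usize>) {
    let mut parallel = Vec::new();
    let mut serial = Vec::new();
    for (i, tool) in tools.iter().enumerate() {
        match tool.concurrency() {
            ToolConcurrency::ReadOnly => parallel.push(i),
            ToolConcurrency::Exclusive | ToolConcurrency::Interactive => serial.push(i),
        }
    }
    (parallel, serial)
}
```

Keeping the original indices makes it possible to restore call order after the parallel group finishes.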
## 4. Active issues + pitfalls ## 4. Active issues + pitfalls
### Active ### Active
@@ -155,6 +167,7 @@ skills/ -> SkillRegistry loads -> SkillIndexMiddleware@200 injects into the system prompt
| Date | Change | Related | | Date | Change | Related |
|------|------|------| |------|------|------|
| 2026-04-24 | Hermes Phase 2A: ToolConcurrency enum + parallel execution + concurrency() declaration requirement | commit 9060935 |
| 2026-04-22 | Wiki 5-section restructure: 281->~195 lines, semantic routing details now reference [[butler]] | wiki/ | | 2026-04-22 | Wiki 5-section restructure: 281->~195 lines, semantic routing details now reference [[butler]] | wiki/ |
| 2026-04-22 | Researcher search fix: schema flattening + empty-argument fallback + layout fix | commit 5816f56+81005c3 | | 2026-04-22 | Researcher search fix: schema flattening + empty-argument fallback + layout fix | commit 5816f56+81005c3 |
| 2026-04-17 | Empty-shell Hand cleanup: Whiteboard/Slideshow/Speech deleted, net reduction of ~5400 lines | Phase 5 cleanup | | 2026-04-17 | Empty-shell Hand cleanup: Whiteboard/Slideshow/Speech deleted, net reduction of ~5400 lines | Phase 5 cleanup |


@@ -1,6 +1,6 @@
--- ---
title: ZCLAW Project Knowledge Base title: ZCLAW Project Knowledge Base
updated: 2026-04-22 updated: 2026-04-24
status: active status: active
--- ---
@@ -8,29 +8,29 @@ status: active
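> An AI Agent desktop client for Chinese-speaking users. Butler mode + multi-model + 7 autonomous capabilities + 75 skills. > An AI Agent desktop client for Chinese-speaking users. Butler mode + multi-model + 7 autonomous capabilities + 75 skills.
> **How to use**: find the module you need to work on, read its page, and start working. > **How to use**: find the module you need to work on, read its page, and start working.
> **Data source**: verified by a full code scan on 2026-04-22, not inferred from documentation. > **Data source**: verified by a full code scan on 2026-04-23, not inferred from documentation.
## Project profile ## Project profile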
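| Dimension | Value | | Dimension | Value |
|------|-----| |------|-----|
| Positioning | AI Agent desktop client (Tauri 2.x) | | Positioning | AI Agent desktop client (Tauri 2.x) |
| Tech stack | Rust 10 crates + src-tauri (~102K lines, 357 .rs) + React 19 + TypeScript + PostgreSQL | | Tech stack | Rust 10 crates + src-tauri (~148K lines, 384 .rs) + React 19 + TypeScript + PostgreSQL |
| Stage | Pre-release stabilization, feature freeze in effect | | Stage | Pre-release stabilization, feature freeze in effect |
## Key numbers (verified against code, 2026-04-22) ## Key numbers (verified against code, 2026-04-23)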
| Metric | Value | | Metric | Value |
|------|-----| |------|-----|
| Rust Crates | 10 + src-tauri | | Rust Crates | 10 + src-tauri |
| Rust code | 101,967 lines (357 .rs files) | | Rust code | 148,185 lines (384 .rs files) |
| Rust tests | 987 defined / 797 passing | | Rust tests | 997 defined (619 #[test] + 378 #[tokio::test]) |
| Tauri commands | 190 defined / 97 @reserved / 104 invoke | | Tauri commands | 193 defined / 104 invoke |
| SaaS API | 137 .route() / 16 modules / 38 SQL migrations / 42 tables | | SaaS API | 137 .route() / 16 modules / 38 SQL migrations / 42 tables |
| Middleware | 14 runtime layers + 10 SaaS HTTP layers | | Middleware | 14 runtime layers + 10 SaaS HTTP layers |
| SKILL / HAND | 75 skill directories / 7 registered Hands (6 TOML + _reminder) | | SKILL / HAND | 75 skill directories / 7 registered Hands (6 TOML + _reminder) |
| Pipeline | 18 YAML templates (8 directories) | | Pipeline | 18 YAML templates (8 directories) |
| Front end | 25 Stores / 102 components / 75 lib / 17 Admin pages | | Front end | 25 Stores / 103 components / 78 lib / 17 Admin pages |
| Intelligence | 16 .rs files | | Intelligence | 16 .rs files |
| Quality metrics | 0 cargo warnings / 2 TODO/FIXME / 0 dead_code | | Quality metrics | 0 cargo warnings / 2 TODO/FIXME / 0 dead_code |
@@ -38,13 +38,13 @@ status: active
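| Category | Feature | Entry point | Wiki | | Category | Feature | Entry point | Wiki |
|------|------|------|------| |------|------|------|------|
| Chat | Send messages, streaming responses, multi-model switching | Chat panel | [[chat]] | | Chat | Send messages, streaming responses, multi-model switching, LLM dynamic suggestions | Chat panel | [[chat]] |
| Agents | Create/switch/configure Agents | Sidebar Agent list | [[chat]] | | Agents | Create/switch/configure Agents, cross-session identity memory (soul.md) | Sidebar Agent list | [[chat]] |
| Autonomy | Trigger Browser/Collector/Twitter, etc. | Automation panel | [[hands-skills]] | | Autonomy | Trigger Browser/Collector/Twitter, etc. | Automation panel | [[hands-skills]] |
| Memory | Search history, automatic context injection | Settings > Semantic memory | [[memory]] | | Memory | Search history, automatic context injection, identity signal extraction | Settings > Semantic memory | [[memory]] |
| Configuration | Models/API/workspace/secure storage | Settings panel (19 pages) | [[development]] | | Configuration | Models/API/workspace/secure storage | Settings panel (19 pages) | [[development]] |
| SaaS | Sign-in and registration, subscription billing, Admin management | SaaS platform / Admin console | [[saas]] | | SaaS | Sign-in and registration, subscription billing, Admin management | SaaS platform / Admin console | [[saas]] |
| Butler | Pain point accumulation, industry configuration, simple/professional mode | Chat panel (default mode) | [[butler]] | | Butler | Pain point accumulation, industry configuration, simple/professional mode, cross-session identity, dynamic suggestions | Chat panel (default mode) | [[butler]] |
| Pipeline | YAML template selection, configuration, DAG execution | Workflow panel | [[pipeline]] | | Pipeline | YAML template selection, configuration, DAG execution | Workflow panel | [[pipeline]] |
| Security | JWT authentication, TOTP 2FA, operation auditing | Settings > Secure storage | [[security]] | | Security | JWT authentication, TOTP 2FA, operation auditing | Settings > Secure storage | [[security]] |
| Data | PostgreSQL (42 tables) + SQLite/FTS5 (local memory) | - | [[data-model]] | | Data | PostgreSQL (42 tables) + SQLite/FTS5 (local memory) | - | [[data-model]] |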
@@ -97,5 +97,7 @@ ZCLAW
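| Agent creation fails | [[chat]] | [[saas]] | Permission or persistence issue | | Agent creation fails | [[chat]] | [[saas]] | Permission or persistence issue |
| Pipeline execution hangs | [[pipeline]] | [[middleware]] | DAG cycle / missing dependency | | Pipeline execution hangs | [[pipeline]] | [[middleware]] | DAG cycle / missing dependency |
| Admin page returns 403 | [[saas]] | [[security]] | JWT expired / blocked by admin_guard | | Admin page returns 403 | [[saas]] | [[security]] | JWT expired / blocked by admin_guard |
| Agent name is not remembered | [[butler]] | [[memory]] | soul.md write failed / identity signal not extracted |
| Suggestions are not personalized | [[chat]] | [[butler]] | 4-channel context timed out / ExperienceExtractor not initialized |
> Source of truth for numbers: `docs/TRUTH.md`. In case of conflict, the actual code wins > Source of truth for numbers: `docs/TRUTH.md`. In case of conflict, the actual code wins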


@@ -1,6 +1,6 @@
--- ---
title: Changelog title: Changelog
updated: 2026-04-22 updated: 2026-04-24
status: active status: active
tags: [log, history] tags: [log, history]
--- ---
@@ -9,10 +9,55 @@ tags: [log, history]
> Append-only operation log. Format: `## [date] type | description` > Append-only operation log. Format: `## [date] type | description`
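## [2026-04-24] fix(runtime+middleware) | Comprehensive P1/P2/P3 tool-call fixes
- **P1 streaming tool parallelism**: three-phase execution (middleware pre-check → parallel + serial partition → result ordering); ReadOnly tools use JoinSet+Semaphore(3)
- **P2 OpenAI driver**: an argument parse failure is no longer silently replaced with `{}`; the driver returns `_parse_error`+`_raw_args` so the LLM can correct itself
- **P2 ToolOutputGuard**: keyword matching replaced with regex matching on actual secret values (sk-xxx/AKIA/PEM, etc.), eliminating false positives
- **P2 ToolErrorMiddleware**: failure counter changed from a global AtomicU32 to a per-session HashMap, eliminating cross-session false triggers
- **P3 Gateway client**: documented the onTool callback semantics for tool_call/tool_result (output='' means start, input='' means end)
- **Tests**: 91 tests PASS, tsc --noEmit PASS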
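## [2026-04-24] fix(runtime) | Two P0 tool-call fixes
- **P0: after_tool_call middleware was never invoked**: added the `middleware_chain.run_after_tool_call()` call in both streaming and non-streaming modes; the after logic of ToolErrorMiddleware and ToolOutputGuardMiddleware now takes effect
- **P0: stream_errored skipped all tools**: in streaming mode, `stream_errored` no longer triggers `break 'outer`; it now distinguishes complete tools (ToolUseEnd received) from incomplete ones. Complete tools run as usual, and incomplete tools get a cancelling ToolEnd event
- **Affected file**: `loop_runner.rs`
- **Tests**: 91 tests PASS, 0 cargo warnings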
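The stream_errored salvage rule described above can be sketched as a pure function. `PendingTool`, `Salvage`, and `salvage_after_stream_error` are hypothetical names; the actual logic lives in `loop_runner.rs` and is not reproduced here.

```rust
/// A tool call collected from the stream before it errored.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PendingTool {
    pub id: String,
    /// True once ToolUseEnd has been received for this call.
    pub ended: bool,
}

/// What to do with each pending tool after a stream error.
#[derive(Debug, PartialEq, Eq)]
pub enum Salvage {
    /// The call arrived in full: run it as usual.
    Execute(String),
    /// The call was cut off: emit a cancelling ToolEnd event instead.
    Cancel(String),
}

/// Maps every pending tool to an action, preserving the original order.
pub fn salvage_after_stream_error(pending: &[PendingTool]) -> Vec<Salvage> {
    pending
        .iter()
        .map(|tool| {
            if tool.ended {
                Salvage::Execute(tool.id.clone())
            } else {
                Salvage::Cancel(tool.id.clone())
            }
        })
        .collect()
}
```

Deciding per tool, instead of breaking out of the loop, is what lets already-complete calls survive a late stream error.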
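## [2026-04-24] feat(artifact) | Artifact system improvements
- **MarkdownRenderer**: shared Markdown rendering component extracted from StreamingText (react-markdown + remark-gfm) and reused by ArtifactPanel
- **ArtifactPanel**: the hand-written 30-line MarkdownPreview is replaced with full GFM rendering (tables/code blocks/lists/blockquotes); added a file picker dropdown
- **Data source expansion**: artifact creation goes from the single file_write tool → file_write/str_replace/write_file/str_replace_editor, and from the single sendMessage path → the sendMessage + initStreamListener dual path
- **Persistence**: artifactStore adds zustand persist + IndexedDB (reusing idb-storage), so artifacts survive a refresh
- **Verification**: tsc --noEmit PASS, 343 vitest PASS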
## [2026-04-24] perf | Hermes high-value design implementation, Phase 1-4
- **Phase 1**: Anthropic prompt caching: cache_control ephemeral + cache token tracking (CompletionResponse + StreamChunk)
- **Phase 2A**: Parallel tool execution: ToolConcurrency enum (ReadOnly/Exclusive/Interactive) + JoinSet + Semaphore(3) + AtomicU32
- **Phase 2B**: Tool output pruning: prune_tool_outputs() (2000→500 chars) + integration into CompactionMiddleware
- **Phase 3**: Error classification + smart retry: LlmErrorKind + ClassifiedLlmError + RetryDriver (jittered backoff) + CONTEXT_OVERFLOW recovery
- **Phase 4**: Async compaction + iterative summarization: 30s debounce + cached fallback + previous_summary iterative accumulation
- **New files**: error_classifier.rs, retry_driver.rs
- **Verification**: 997 workspace tests PASS
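One way to read the Phase 2B note "2000→500 chars" is that outputs longer than 2000 characters are cut to their first 500. The sketch below assumes that reading; `prune_output`, its parameters, and the trailing marker text are hypothetical and may differ from the real `prune_tool_outputs()`.

```rust
/// Returns `output` unchanged if it has at most `threshold` characters;
/// otherwise keeps the first `keep` characters and appends a short marker.
pub fn prune_output(output: &str, threshold: usize, keep: usize) -> String {
    // Count characters, not bytes, so multi-byte text is never split mid-character.
    let total = output.chars().count();
    if total <= threshold {
        return output.to_owned();
    }
    let keep = keep.min(total);
    let kept: String = output.chars().take(keep).collect();
    format!("{}... [{} chars pruned]", kept, total - keep)
}
```

Calling `prune_output(text, 2000, 500)` on each stored tool result would match the numbers in the changelog entry under this assumption.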
## [2026-04-23] perf | Reply efficiency + parallelized suggestion generation (three parts)
- **perf(src-tauri)**: identity prompt cache (`LazyLock<RwLock<HashMap>>`) + parallelized `pre_conversation_hook` (`tokio::join!`)
- **perf(runtime)**: middleware `before_completion` runs in parallel waves: `parallel_safe()` trait method + wave detection + `tokio::spawn`; the 5 safe middleware layers can run in parallel
- **perf(desktop)**: suggestion context prefetch (starts at sendMessage) + generateLLMSuggestions decoupled from memory extraction
- **feat(desktop)**: suggestion prompt rewrite (1 deeper follow-up + 1 practical action + 1 butler care) + context window 6→20 messages
- **Files**: intelligence_hooks.rs, middleware.rs, 5 middleware submodules, streamStore.ts, llm-service.ts
- **Verification**: cargo test --workspace --exclude zclaw-saas 0 fail, tsc --noEmit 0 error
## [2026-04-23] fix | Agent name detection refactor + cross-session memory fix + Agent tab removal ## [2026-04-23] fix | Agent name detection refactor + cross-session memory fix + Agent tab removal
- **fix(desktop)**: `detectAgentNameSuggestion` changed from 6 fixed regexes to a two-step trigger+extract approach (10 triggers) - **fix(desktop)**: `detectAgentNameSuggestion` changed from 6 fixed regexes to a two-step trigger+extract approach (10 triggers)
- **fix(desktop)**: name detection decoupled from memory extraction, so a 502 no longer blocks the panel refresh - **fix(desktop)**: name detection decoupled from memory extraction, so a 502 no longer blocks the panel refresh
- **fix(src-tauri)**: `agent_update` now also writes soul.md, repairing the broken config.name → system prompt link - **fix(src-tauri)**: `agent_update` now also writes soul.md, repairing the broken config.name → system prompt link
## [2026-04-23] feat | Smarter dynamic suggestions
- **feat(src-tauri)**: new `experience_find_relevant` Tauri command + `ExperienceBrief` struct + OnceLock singleton
- **feat(desktop)**: new `suggestion-context.ts`: fetches smart context over 4 parallel channels (user profile / pain points / experiences / skill matching)
- **feat(desktop)**: `streamStore.ts` createCompleteHandler parallelized + generateLLMSuggestions enhanced
- **feat(desktop)**: suggestion prompt changed to a hybrid form (2 follow-up questions + 1 butler care)
- **Files**: experience.rs, lib.rs, suggestion-context.ts, streamStore.ts, llm-service.ts
- **refactor(desktop)**: removed the Agent tab (simple mode / professional mode) and cleaned up dead code (~280 lines) - **refactor(desktop)**: removed the Agent tab (simple mode / professional mode) and cleaned up dead code (~280 lines)
- **Verification**: cargo check 0 error, tsc --noEmit 0 error - **Verification**: cargo check 0 error, tsc --noEmit 0 error


@@ -1,6 +1,6 @@
--- ---
title: Middleware Chain title: Middleware Chain
updated: 2026-04-22 updated: 2026-04-23
status: active status: active
tags: [module, middleware, runtime] tags: [module, middleware, runtime]
--- ---
@@ -17,6 +17,7 @@ tags: [module, middleware, runtime]
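- **WHY registration order != execution order**: the code order of the 14 `chain.register()` calls in `kernel/mod.rs` has no bearing on runtime order; the chain sorts by `priority()` ascending before executing. - **WHY registration order != execution order**: the code order of the 14 `chain.register()` calls in `kernel/mod.rs` has no bearing on runtime order; the chain sorts by `priority()` ascending before executing.
- **WHY 6 categories, 14 layers**: evolution(70-79) -> routing(80-99) -> context(100-199) -> capability(200-399) -> safety(400-599) -> telemetry(600-799); the priority range is the execution stage. - **WHY 6 categories, 14 layers**: evolution(70-79) -> routing(80-99) -> context(100-199) -> capability(200-399) -> safety(400-599) -> telemetry(600-799); the priority range is the execution stage.
- **WHY Stop/Block/AbortLoop**: fine-grained flow control -- Stop interrupts the LLM loop, Block prevents a single tool call, and AbortLoop terminates the whole Agent loop. Once one is hit, all subsequent middleware is skipped. - **WHY Stop/Block/AbortLoop**: fine-grained flow control -- Stop interrupts the LLM loop, Block prevents a single tool call, and AbortLoop terminates the whole Agent loop. Once one is hit, all subsequent middleware is skipped.
- **WHY wave parallelism (parallel_safe)**: in the `before_completion` stage, middleware that only modifies `system_prompt` can declare `parallel_safe() == true`. Consecutive parallel-safe middleware runs in parallel via `tokio::spawn`, each holding its own `MiddlewareContext` clone, and their prompt contributions are merged on completion. This cuts serial latency by ~1-3s.
## 2. Key files + data flow ## 2. Key files + data flow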
@@ -34,7 +35,9 @@ tags: [module, middleware, runtime]
``` ```
User message -> AgentLoop User message -> AgentLoop
-> chain.run_before_completion(ctx) -> chain.run_before_completion(ctx)
-> [ascending priority] each layer's middleware.before_completion() -> [wave parallelism] detect consecutive parallel_safe middleware
-> Wave in parallel (2+ safe): tokio::spawn, each with ctx.clone() → merge prompts
-> Serial (unsafe / single safe): run one by one
-> Continue: next layer | Stop(reason): interrupt the loop -> Continue: next layer | Stop(reason): interrupt the loop
-> LLM call -> LLM call
-> (on tool calls) chain.run_before_tool_call() -> (on tool calls) chain.run_before_tool_call()
@@ -57,22 +60,22 @@ tags: [module, middleware, runtime]
### 14 Runtime middleware layers ### 14 Runtime middleware layers
| Priority | Middleware | File | Responsibility | Registration condition | | Priority | Middleware | File | Responsibility | parallel_safe | Registration condition |
|--------|--------|------|------|----------| |--------|--------|------|------|---------------|----------|
| @78 | EvolutionMiddleware | `evolution.rs` | Pushes evolution candidates into the system prompt | Always | | @78 | EvolutionMiddleware | `evolution.rs` | Pushes evolution candidates into the system prompt | ✅ | Always |
| @80 | ButlerRouter | `butler_router.rs` | Semantic skill routing + system prompt enhancement + XML fencing | Always | | @80 | ButlerRouter | `butler_router.rs` | Semantic skill routing + system prompt enhancement + XML fencing | ✅ | Always |
| @100 | Compaction | `compaction.rs` | Compacts conversation history above the threshold | `compaction_threshold > 0` | | @100 | Compaction | `compaction.rs` | Compacts conversation history above the threshold | ❌ | `compaction_threshold > 0` |
| @150 | Memory | `memory.rs` | Extracts memories automatically after a conversation + injects retrieval results | Always | | @150 | Memory | `memory.rs` | Extracts memories automatically after a conversation + injects retrieval results | ✅ | Always |
| @180 | Title | `title.rs` | Generates the session title automatically | Always | | @180 | Title | `title.rs` | Generates the session title automatically | ✅ | Always |
| @200 | SkillIndex | `skill_index.rs` | Injects the skill index into the system prompt | `!skill_index.is_empty()` | | @200 | SkillIndex | `skill_index.rs` | Injects the skill index into the system prompt | ✅ | `!skill_index.is_empty()` |
| @300 | DanglingTool | `dangling_tool.rs` | Repairs missing tool call results | Always | | @300 | DanglingTool | `dangling_tool.rs` | Repairs missing tool call results | ❌ | Always |
| @350 | ToolError | `tool_error.rs` | Formats tool errors so the LLM can recover | Always | | @350 | ToolError | `tool_error.rs` | Formats tool errors so the LLM can recover | ❌ | Always |
| @360 | ToolOutputGuard | `tool_output_guard.rs` | Safety check on tool output | Always | | @360 | ToolOutputGuard | `tool_output_guard.rs` | Safety check on tool output | ❌ | Always |
| @400 | Guardrail | `guardrail.rs` | Safety rules for shell_exec/file_write/web_fetch | Always | | @400 | Guardrail | `guardrail.rs` | Safety rules for shell_exec/file_write/web_fetch | ❌ | Always |
| @500 | LoopGuard | `loop_guard.rs` | Prevents infinite tool-call loops | Always | | @500 | LoopGuard | `loop_guard.rs` | Prevents infinite tool-call loops | ❌ | Always |
| @550 | SubagentLimit | `subagent_limit.rs` | Limits concurrent sub-agents | Always | | @550 | SubagentLimit | `subagent_limit.rs` | Limits concurrent sub-agents | ❌ | Always |
| @650 | TrajectoryRecorder | `trajectory_recorder.rs` | Trajectory recording + compression | Always | | @650 | TrajectoryRecorder | `trajectory_recorder.rs` | Trajectory recording + compression | ❌ | Always |
| @700 | TokenCalibration | `token_calibration.rs` | Token usage calibration | Always | | @700 | TokenCalibration | `token_calibration.rs` | Token usage calibration | ❌ | Always |
> Registration order (code) differs from execution order (priority). The chain sorts by priority ascending before executing. > Registration order (code) differs from execution order (priority). The chain sorts by priority ascending before executing.
@@ -96,6 +99,8 @@ tags: [module, middleware, runtime]
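- Priority ascending: 0-999, lower values run first - Priority ascending: 0-999, lower values run first
- Registration order != execution order; the chain sorts by priority at runtime - Registration order != execution order; the chain sorts by priority at runtime
- Stop/Block/AbortLoop interrupt immediately; subsequent middleware does not run - Stop/Block/AbortLoop interrupt immediately; subsequent middleware does not run
- parallel_safe middleware only modifies system_prompt; it does not modify messages and does not return Stop
- Wave merge: each middleware in a parallel wave clones the context; on completion the increment after base_prompt_len is sliced off and merged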
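Wave detection and the prompt merge can be sketched with plain data, leaving out the async runtime. `Step`, `plan_waves`, and `merge_prompts` are hypothetical names, and the input is just the per-middleware `parallel_safe` flags in priority order.

```rust
/// One scheduling step: a wave of 2+ consecutive parallel-safe middleware,
/// or a single middleware run serially.
#[derive(Debug, PartialEq, Eq)]
pub enum Step {
    Wave(Vec<usize>),
    Serial(usize),
}

/// Groups consecutive parallel-safe indices into waves; everything else is serial.
pub fn plan_waves(parallel_safe: &[bool]) -> Vec<Step> {
    // Emits the pending run as a wave (2+), a serial step (1), or nothing (0).
    fn flush(run: &mut Vec<usize>, steps: &mut Vec<Step>) {
        match run.len() {
            0 => {}
            1 => steps.push(Step::Serial(run[0])),
            _ => steps.push(Step::Wave(std::mem::take(run))),
        }
        run.clear();
    }

    let mut steps = Vec::new();
    let mut run: Vec<usize> = Vec::new();
    for (i, &safe) in parallel_safe.iter().enumerate() {
        if safe {
            run.push(i);
        } else {
            flush(&mut run, &mut steps);
            steps.push(Step::Serial(i));
        }
    }
    flush(&mut run, &mut steps);
    steps
}

/// Every clone started from `base`; whatever follows `base.len()` in a result
/// is that middleware's increment, appended in order.
pub fn merge_prompts(base: &str, results: &[String]) -> String {
    let mut merged = base.to_owned();
    for result in results {
        if let Some(delta) = result.get(base.len()..) {
            merged.push_str(delta);
        }
    }
    merged
}
```

With the flags from the table above, priorities 78 and 80 would form one wave, 100 would run serially, and 150, 180, and 200 would form a second wave.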
### Core interface ### Core interface
@@ -103,6 +108,7 @@ tags: [module, middleware, runtime]
trait AgentMiddleware: Send + Sync { trait AgentMiddleware: Send + Sync {
fn name(&self) -> &str; fn name(&self) -> &str;
fn priority(&self) -> i32 { 500 } fn priority(&self) -> i32 { 500 }
fn parallel_safe(&self) -> bool { false }
async fn before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision>; async fn before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision>;
async fn before_tool_call(&self, ctx: &MiddlewareContext, tool_name: &str, tool_input: &Value) -> Result<ToolCallDecision>; async fn before_tool_call(&self, ctx: &MiddlewareContext, tool_name: &str, tool_input: &Value) -> Result<ToolCallDecision>;
async fn after_tool_call(&self, ctx: &mut MiddlewareContext, tool_name: &str, result: &Value) -> Result<()>; async fn after_tool_call(&self, ctx: &mut MiddlewareContext, tool_name: &str, result: &Value) -> Result<()>;
@@ -129,8 +135,8 @@ trait AgentMiddleware: Send + Sync {
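| Date | Change | Impact | | Date | Change | Impact |
|------|------|------| |------|------|------|
| 04-23 | Wave-parallel execution: parallel_safe() + wave detection + tokio::spawn | The 5 safe middleware layers can run in parallel in the before_completion stage, cutting latency by ~1-3s |
| 04-22 | DataMasking middleware removed | 14->14 layers (replaced with nothing), one fewer layer of no-benefit processing | | 04-22 | DataMasking middleware removed | 14->14 layers (replaced with nothing), one fewer layer of no-benefit processing |
| 04-22 | Cross-session memory fix | Memory middleware dedup + cross-session injection fix | | 04-22 | Cross-session memory fix | Memory middleware dedup + cross-session injection fix |
| 04-22 | Wiki consistency calibration | Numbers aligned with code verification | | 04-22 | Wiki consistency calibration | Numbers aligned with code verification |
| 04-21 | Embedding wired up | SkillIndex routing TF-IDF->Embedding+LLM fallback | | 04-21 | Embedding wired up | SkillIndex routing TF-IDF->Embedding+LLM fallback |
| 04-15 | Heartbeat unified health system | TrajectoryRecorder pain-point awareness enhanced |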