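docs(wiki): sync wiki for reply efficiency + suggestion parallelization optimizations

- middleware.md: wave-parallel execution design decision + parallel_safe annotations + invariants + execution flow
- chat.md: suggestion prefetch + decoupling from memory + prompt rewrite
- log.md: append change record
- CLAUDE.md: §13 architecture snapshot + recent changes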
2026-04-23 23:45:28 +08:00
parent ee5611a2f8
commit 9a313e3c92
4 changed files with 196 additions and 66 deletions
--- a/wiki/chat.md
+++ b/wiki/chat.md
@@ -1,6 +1,6 @@
 ---
 title: Chat System
-updated: 2026-04-22
+updated: 2026-04-23
 status: active
 tags: [module, chat, stream]
 ---
@@ -17,6 +17,7 @@ tags: [module, chat, stream]
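 | Split into 5 stores | The original 908-line ChatStore → stream/conversation/message/chat/artifact, each with a single responsibility |
 | 5-minute timeout guard | Prevents hung streams: kernel-chat.ts:76, auto-calls cancelStream on timeout |
 | Unified callback interface | All 3 implementations share `{ onDelta, onThinkingDelta, onTool, onHand, onComplete, onError }` |
+| LLM dynamic suggestions | Replaces hard-coded keyword matching with LLM-generated personalized suggestions (1 deeper follow-up + 1 practical action + 1 butler care), with 4-way parallel prefetch of smart context |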

 ### ChatStream implementations

@@ -33,11 +34,14 @@ tags: [module, chat, stream]

 | File | Responsibility |
 |------|------|
-| `desktop/src/store/chat/streamStore.ts` | Streaming message orchestration, send, cancel |
+| `desktop/src/store/chat/streamStore.ts` | Streaming message orchestration, send, cancel, LLM dynamic suggestion generation |
 | `desktop/src/store/chat/conversationStore.ts` | Conversation management, current model, sessionKey |
 | `desktop/src/store/chat/messageStore.ts` | Message persistence (IndexedDB) |
 | `desktop/src/lib/kernel-chat.ts` | KernelClient ChatStream (Tauri) |
+| `desktop/src/lib/suggestion-context.ts` | 4-way parallel smart-context fetch (user profile / pain points / experiences / skill match) |
+| `desktop/src/lib/cold-start-mapper.ts` | Cold-start config mapping (industry detection / naming / personality / skills) |
 | `desktop/src/components/ChatArea.tsx` | Chat area UI |
+| `desktop/src/components/ai/SuggestionChips.tsx` | Renders dynamic suggestion chips |
 | `crates/zclaw-runtime/src/loop_runner.rs` | Rust main chat loop + middleware chain |

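 ### Send-message flow
@@ -100,6 +104,20 @@ UI selects model → conversationStore.currentModel = newModel
 - cancelStream sets an atomic flag, so it does not race with the onDelta callback
 - The 3 ChatStream implementations share one callback interface; upper layers need not know which implementation is in use
 - Message persistence goes through messageStore → IndexedDB, decoupled from streaming rendering
+- Dynamic suggestions prefetch 4 sources in parallel (userProfile/painPoints/experiences/skillMatch); on a 500ms timeout the result degrades to an empty string
+- Suggestion generation is decoupled from memory extraction: suggestions start without waiting for the memory LLM call to finish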
+
+### LLM dynamic suggestions
+
+```
+sendMessage → isStreaming=true + _activeSuggestionContextPrefetch = fetchSuggestionContext(...)
+  → prefetch runs in the background while the response streams
+onComplete → createCompleteHandler
+  → generateLLMSuggestions(prefetchedContext): starts immediately, without waiting for memory
+    → prompt: 1 deeper follow-up + 1 practical action + 1 butler care
+  → memory/reflection run independently in the background (Promise.all)
+  → SuggestionChips renders
+```

 ### Tauri commands

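@@ -122,14 +140,16 @@ UI selects model → conversationStore.currentModel = newModel
 **Notes:**
 - Auxiliary LLM calls (memory summarization/extraction, butler routing) reuse the model+base_url from `kernel_init`, on the same path as chat
 - Classroom chat is a separate Tauri command (`classroom_chat`) and does not go through `agent_chat_stream`
+- Agent tab removed: cross-session identity is now handled by soul.md and is no longer managed through RightPanel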

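 ## 5. Changelog

 | Date | Change |
 |------|------|
+| 04-23 | Suggestion prefetch: context prefetch starts at sendMessage and is consumed as soon as the stream ends, without waiting for memory extraction |
+| 04-23 | Suggestion prompt rewrite: 1 deeper follow-up + 1 practical action + 1 butler care; context window 6→20 entries |
 | 04-23 | Identity signal: detectAgentNameSuggestion detects instantly on the frontend + RightPanel listens for Tauri events to refresh the name |
+| 04-23 | Agent tab removed: ~280 lines of dead code cleaned out of RightPanel; identity is handled by soul.md |
+| 04-23 | Clarification question card UX polish: dangling references removed + expanded by default |
 | 04-22 | Wiki rewrite: 5-section template, added integration contracts and invariants |
 | 04-21 | Previous round of updates |
-| 04-17 | ChatStore split into 5 stores (stream/conversation/message/chat/artifact) |
-| 04-16 | Provider Key decryption fix (b69dc61) |
-| 04-16 | Tauri invoke parameter name fix (f6c5dd2) |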
--- a/wiki/log.md
+++ b/wiki/log.md
@@ -9,6 +9,14 @@ tags: [log, history]

 > Append-only operation log. Format: `## [date] type | description`

+## [2026-04-23] perf | Reply efficiency + suggestion generation parallelization (three parts)
+- **perf(src-tauri)**: identity prompt cache (`LazyLock<RwLock<HashMap>>`) + parallelized `pre_conversation_hook` (`tokio::join!`)
+- **perf(runtime)**: wave-parallel middleware `before_completion`: `parallel_safe()` trait method + wave detection + `tokio::spawn`; the 5 safe middleware layers can run in parallel
+- **perf(desktop)**: suggestion context prefetch (started at sendMessage) + generateLLMSuggestions decoupled from memory extraction
+- **feat(desktop)**: suggestion prompt rewrite (1 deeper follow-up + 1 practical action + 1 butler care) + context window 6→20 entries
+- **Files**: intelligence_hooks.rs, middleware.rs, 5 middleware submodules, streamStore.ts, llm-service.ts
+- **Verification**: cargo test --workspace --exclude zclaw-saas 0 fail, tsc --noEmit 0 error
+
 ## [2026-04-23] fix | Agent name detection refactor + cross-session memory fix + Agent tab removal
 - **fix(desktop)**: `detectAgentNameSuggestion` changed from 6 fixed regexes to a two-step trigger+extract approach (10 triggers)
 - **fix(desktop)**: name detection decoupled from memory extraction: a 502 no longer blocks the panel refresh
--- a/wiki/middleware.md
+++ b/wiki/middleware.md
@@ -1,6 +1,6 @@
 ---
 title: Middleware Chain
-updated: 2026-04-22
+updated: 2026-04-23
 status: active
 tags: [module, middleware, runtime]
 ---
@@ -17,6 +17,7 @@ tags: [module, middleware, runtime]
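 - **WHY registration order != execution order**: the code order of the 14 `chain.register()` calls in `kernel/mod.rs` has no bearing on runtime order; the chain sorts by `priority()` ascending before executing.
 - **WHY 6 categories, 14 layers**: evolution (70-79) -> routing (80-99) -> context (100-199) -> capability (200-399) -> safety (400-599) -> telemetry (600-799); the priority range is the execution phase.
 - **WHY Stop/Block/AbortLoop**: fine-grained flow control. Stop interrupts the LLM loop, Block prevents a single tool call, AbortLoop terminates the whole Agent loop. Once one is hit, all subsequent middleware are skipped.
+- **WHY wave parallelism (parallel_safe)**: in the `before_completion` phase, middleware that only modify `system_prompt` can declare `parallel_safe() == true`; consecutive parallel-safe middleware run in parallel via `tokio::spawn`, each holding its own `MiddlewareContext` clone, and their prompt contributions are merged on completion. This cuts serial latency by ~1-3s.

 ## 2. Key files + data flow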

@@ -34,8 +35,10 @@ tags: [module, middleware, runtime]
 ```
 User message -> AgentLoop
  -> chain.run_before_completion(ctx)
-    -> [ascending priority] each layer's middleware.before_completion()
-      -> Continue: next layer | Stop(reason): interrupt the loop
+    -> [wave parallel] detect consecutive parallel_safe middleware
+      -> Wave, parallel (2+ safe): tokio::spawn, each with ctx.clone() → merge prompts
+      -> Serial (unsafe / single safe): run one at a time
+    -> Continue: next layer | Stop(reason): interrupt the loop
  -> LLM call
  -> (on tool calls) chain.run_before_tool_call()
    -> Allow | Block(msg) | ReplaceInput | AbortLoop
@@ -57,22 +60,22 @@ tags: [module, middleware, runtime]

 ### 14-layer runtime middleware

-| Priority | Middleware | File | Responsibility | Registration condition |
-|--------|--------|------|------|----------|
-| @78 | EvolutionMiddleware | `evolution.rs` | Pushes evolution candidates into the system prompt | Always |
-| @80 | ButlerRouter | `butler_router.rs` | Semantic skill routing + system prompt enrichment + XML fencing | Always |
-| @100 | Compaction | `compaction.rs` | Compresses conversation history when over the threshold | `compaction_threshold > 0` |
-| @150 | Memory | `memory.rs` | Auto-extracts memories after a conversation + injects retrieval results | Always |
-| @180 | Title | `title.rs` | Auto-generates the conversation title | Always |
-| @200 | SkillIndex | `skill_index.rs` | Injects the skill index into the system prompt | `!skill_index.is_empty()` |
-| @300 | DanglingTool | `dangling_tool.rs` | Repairs missing tool-call results | Always |
-| @350 | ToolError | `tool_error.rs` | Formats tool errors so the LLM can recover | Always |
-| @360 | ToolOutputGuard | `tool_output_guard.rs` | Safety check on tool output | Always |
-| @400 | Guardrail | `guardrail.rs` | Safety rules for shell_exec/file_write/web_fetch | Always |
-| @500 | LoopGuard | `loop_guard.rs` | Prevents infinite tool-call loops | Always |
-| @550 | SubagentLimit | `subagent_limit.rs` | Limits concurrent sub-agents | Always |
-| @650 | TrajectoryRecorder | `trajectory_recorder.rs` | Trajectory recording + compression | Always |
-| @700 | TokenCalibration | `token_calibration.rs` | Token usage calibration | Always |
+| Priority | Middleware | File | Responsibility | parallel_safe | Registration condition |
+|--------|--------|------|------|---------------|----------|
+| @78 | EvolutionMiddleware | `evolution.rs` | Pushes evolution candidates into the system prompt | ✅ | Always |
+| @80 | ButlerRouter | `butler_router.rs` | Semantic skill routing + system prompt enrichment + XML fencing | ✅ | Always |
+| @100 | Compaction | `compaction.rs` | Compresses conversation history when over the threshold | ❌ | `compaction_threshold > 0` |
+| @150 | Memory | `memory.rs` | Auto-extracts memories after a conversation + injects retrieval results | ✅ | Always |
+| @180 | Title | `title.rs` | Auto-generates the conversation title | ✅ | Always |
+| @200 | SkillIndex | `skill_index.rs` | Injects the skill index into the system prompt | ✅ | `!skill_index.is_empty()` |
+| @300 | DanglingTool | `dangling_tool.rs` | Repairs missing tool-call results | ❌ | Always |
+| @350 | ToolError | `tool_error.rs` | Formats tool errors so the LLM can recover | ❌ | Always |
+| @360 | ToolOutputGuard | `tool_output_guard.rs` | Safety check on tool output | ❌ | Always |
+| @400 | Guardrail | `guardrail.rs` | Safety rules for shell_exec/file_write/web_fetch | ❌ | Always |
+| @500 | LoopGuard | `loop_guard.rs` | Prevents infinite tool-call loops | ❌ | Always |
+| @550 | SubagentLimit | `subagent_limit.rs` | Limits concurrent sub-agents | ❌ | Always |
+| @650 | TrajectoryRecorder | `trajectory_recorder.rs` | Trajectory recording + compression | ❌ | Always |
+| @700 | TokenCalibration | `token_calibration.rs` | Token usage calibration | ❌ | Always |

 > Registration order (code) differs from execution order (priority). The chain sorts by priority ascending before executing.

@@ -96,6 +99,8 @@ tags: [module, middleware, runtime]
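 - Priority ascending: 0-999; lower values run first
 - Registration order != execution order; the chain is sorted by priority at runtime
 - Stop/Block/AbortLoop interrupt immediately; subsequent middleware do not run
+- parallel_safe middleware only modify system_prompt; they do not modify messages and do not return Stop
+- Wave merge: each middleware in a parallel wave clones the context; on completion, the increment past base_prompt_len is sliced off and merged

 ### Core interface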

@@ -103,6 +108,7 @@ tags: [module, middleware, runtime]
 trait AgentMiddleware: Send + Sync {
    fn name(&self) -> &str;
    fn priority(&self) -> i32 { 500 }
+    fn parallel_safe(&self) -> bool { false }
    async fn before_completion(&self, ctx: &mut MiddlewareContext) -> Result<MiddlewareDecision>;
    async fn before_tool_call(&self, ctx: &MiddlewareContext, tool_name: &str, tool_input: &Value) -> Result<ToolCallDecision>;
    async fn after_tool_call(&self, ctx: &mut MiddlewareContext, tool_name: &str, result: &Value) -> Result<()>;
@@ -129,8 +135,8 @@ trait AgentMiddleware: Send + Sync {

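 | Date | Change | Impact |
 |------|------|------|
+| 04-23 | Wave-parallel execution: parallel_safe() + wave detection + tokio::spawn | The 5 safe middleware layers can run in parallel in the before_completion phase; latency reduced by ~1-3s |
 | 04-22 | DataMasking middleware removed | 14->14 layers (replaced with nothing); one layer of no-benefit processing removed |
 | 04-22 | Cross-session memory fix | Memory middleware dedup + cross-session injection fix |
 | 04-22 | Wiki consistency calibration | Numbers aligned with code verification |
 | 04-21 | Embedding wired up | SkillIndex routing TF-IDF->Embedding+LLM fallback |
-| 04-15 | Heartbeat unified health system | TrajectoryRecorder pain-point awareness enhanced |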
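
Reviewer sketch (not part of the commit): the wave detection and `base_prompt_len` merge described in the wiki/middleware.md changes above could look roughly like the following. Everything here is an assumption made for illustration: `Mw`, `Ctx`, `detect_waves`, and `run_wave` are hypothetical names, the middleware is synchronous rather than async, and `std::thread::scope` stands in for `tokio::spawn` so the example has no dependencies. It is not the code in `middleware.rs`.

```rust
use std::thread;

/// Hypothetical stand-in for a middleware: the text it appends to the
/// system prompt plus the metadata the chain needs.
#[derive(Clone, Debug)]
struct Mw {
    name: &'static str,
    priority: i32,
    parallel_safe: bool,
    contribution: &'static str,
}

/// Hypothetical, trimmed-down context holding only the system prompt.
#[derive(Clone, Debug)]
struct Ctx {
    system_prompt: String,
}

impl Mw {
    /// Simulates before_completion: appends this middleware's text.
    fn before_completion(&self, ctx: &mut Ctx) {
        ctx.system_prompt.push_str(self.contribution);
    }
}

/// Splits a priority-sorted chain into waves: each run of consecutive
/// parallel_safe middleware forms one wave; every unsafe one stands alone.
fn detect_waves(chain: &[Mw]) -> Vec<Vec<Mw>> {
    let mut waves: Vec<Vec<Mw>> = Vec::new();
    let mut last_safe = false;
    for mw in chain {
        if mw.parallel_safe && last_safe {
            waves.last_mut().unwrap().push(mw.clone());
        } else {
            waves.push(vec![mw.clone()]);
        }
        last_safe = mw.parallel_safe;
    }
    waves
}

/// Runs one wave. A wave of 2+ runs each middleware on its own clone of
/// the context, then appends only the text added past base_prompt_len.
fn run_wave(wave: &[Mw], ctx: &mut Ctx) {
    if wave.len() < 2 {
        for mw in wave {
            mw.before_completion(ctx);
        }
        return;
    }
    let base_prompt_len = ctx.system_prompt.len();
    let snapshot = ctx.clone();
    let increments: Vec<String> = thread::scope(|s| {
        let handles: Vec<_> = wave
            .iter()
            .map(|mw| {
                let mut local = snapshot.clone();
                s.spawn(move || {
                    mw.before_completion(&mut local);
                    local.system_prompt[base_prompt_len..].to_string()
                })
            })
            .collect();
        // Join in wave order so the merged prompt is deterministic.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    for inc in increments {
        ctx.system_prompt.push_str(&inc);
    }
}

fn main() {
    let mut chain = vec![
        Mw { name: "Compaction", priority: 100, parallel_safe: false, contribution: "" },
        Mw { name: "Evolution", priority: 78, parallel_safe: true, contribution: "[evo]" },
        Mw { name: "ButlerRouter", priority: 80, parallel_safe: true, contribution: "[butler]" },
        Mw { name: "Memory", priority: 150, parallel_safe: true, contribution: "[memory]" },
    ];
    chain.sort_by_key(|m| m.priority);
    let mut ctx = Ctx { system_prompt: String::from("base") };
    for wave in detect_waves(&chain) {
        let names: Vec<_> = wave.iter().map(|m| m.name).collect();
        println!("wave: {:?}", names);
        run_wave(&wave, &mut ctx);
    }
    println!("{}", ctx.system_prompt);
}
```

Joining the handles in wave order keeps the merged prompt in priority order regardless of which task finishes first. Whether the real chain preserves that ordering is not stated in the diff.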