docs(ai): Spec Review R2 fixes: reuse HealthDataProvider + add generate_with_tools

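CRITICAL:
- Remove the duplicate HealthDataQuery trait; extend the existing HealthDataProvider instead (2 new methods)
- Provider adaptation now adds a generate_with_tools method and does not break the existing generate path

IMPORTANT:
- Fix section numbering (renumbered consecutively throughout)
- ai_tool_call_logs: add created_by and explain why the other audit columns are omitted (append-only)
- ai_user_profiles: explain why created_by/updated_by are omitted (maintained automatically by the Agent)
- ToolContext now holds Arc<dyn HealthDataProvider> instead of a bare db
- Clarify SSE semantics: only the final reply is streamed
- Forced termination at the 5-round cap: append a summarize instruction so the LLM finishes normally
- GenerateRequest is no longer changed in a breaking way; old and new paths run side by side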
2026-05-18 01:57:16 +08:00
parent b0892706c8
commit 31771168dd
1 changed files with 122 additions and 110 deletions
--- a/docs/superpowers/specs/2026-05-18-ai-agent-breakthrough-design.md
+++ b/docs/superpowers/specs/2026-05-18-ai-agent-breakthrough-design.md
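@@ -15,6 +15,7 @@ HMS health management platform scores 6.8/10 overall, 87% feature completeness, but AI capability"
 - Knowledge base framework (structured_source, KDIGO rules) is in place
 - Cost/quota controls (usage, quota, cache) are ready
 - The v2 architecture design already plans RAG, an event-driven pipeline, and two-level caching
+- `erp-core` already has a `HealthDataProvider` trait (with `get_lab_report`/`get_vital_signs`/`get_patient_summary`/`get_trend_analysis_data`), already injected into `AiState`

 **Current state of the AI customer-service agent "Xiaohua" (小华)**: simple Q&A built from a hard-coded system prompt plus the last 10 history messages concatenated as context; it cannot recognize intent, query data, or trigger analysis.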

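@@ -70,11 +71,12 @@ HMS health management platform scores 6.8/10 overall, 87% feature completeness, but AI capability"
 | Decision | Choice | Rationale |
 |------|------|------|
 | Agent state management | Orchestrator is stateless; sessions are managed by the Handler | Simplifies the Orchestrator's responsibilities and eases testing |
-| Tool execution model | Synchronous blocking; multiple Tool Calls within one round run in parallel | When the LLM returns several calls, running them in parallel reduces latency |
-| Provider extension | Extend `GenerateRequest` with tools/functions fields; each of the 3 Providers adapts on its own | Function Calling is a core capability and requires a moderate refactor of each Provider's request/response structures |
-| Cross-crate data access | Define a `HealthDataQuery` trait in erp-core, implemented by erp-health and called by erp-ai through the trait | Preserves module boundaries; erp-ai does not depend on erp-health directly |
-| Safety loop cap | At most 5 Tool Call rounds per conversation | Prevents infinite loops and controls cost |
+| Tool execution model | Synchronous blocking; multiple Tool Calls within one round run in parallel (`futures::join_all`, 10s timeout per Tool) | When the LLM returns several calls, running them in parallel reduces latency |
+| Provider extension | Add a `generate_with_tools` method to the `AiProvider` trait and keep the original `generate` unchanged | Does not break calls from existing analysis endpoints; old and new paths run side by side |
+| Cross-crate data access | Extend the existing `HealthDataProvider` trait with new `get_appointments`/`get_medication_list` methods | erp-core already has this trait and it is already injected into AiState, which avoids duplication |
+| Safety loop cap | At most 5 Tool Call rounds per conversation; on reaching the cap the LLM is forced to produce a final reply | Prevents infinite loops and controls cost |
 | Analysis call mode | Non-streaming synchronous calls inside the Agent | The Agent needs the complete result before it can decide |
+| SSE semantics | SSE streams only the Agent's final reply; the Tool Call process is not sent over SSE | Simple frontend implementation and a clear user experience |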
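The safety loop cap in the table above can be sketched as follows. This is a minimal, synchronous illustration under stated assumptions, not the spec's Orchestrator: `LlmTurn`, `run_agent`, the transcript format, and the fixed fallback reply are hypothetical names and choices, and the LLM is modelled as a plain closure so the example runs with the standard library alone.

```rust
/// What the model asked for on one turn (hypothetical, simplified type).
#[derive(Debug, Clone, PartialEq)]
enum LlmTurn {
    /// The model wants tools executed; each entry is a tool name.
    ToolCalls(Vec<String>),
    /// The model produced its final answer.
    Final(String),
}

const MAX_TOOL_ROUNDS: usize = 5;
const SUMMARIZE_INSTRUCTION: &str =
    "Please summarize your reply based on the information you already have.";

/// Runs the tool loop and returns the final reply plus the number of tool rounds used.
/// `llm` receives the transcript so far and a flag saying whether tools are offered.
fn run_agent<F>(mut llm: F, user_message: &str) -> (String, usize)
where
    F: FnMut(&[String], bool) -> LlmTurn,
{
    let mut transcript = vec![format!("user: {}", user_message)];
    let mut rounds = 0;
    while rounds < MAX_TOOL_ROUNDS {
        match llm(&transcript, true) {
            LlmTurn::Final(text) => return (text, rounds),
            LlmTurn::ToolCalls(calls) => {
                rounds += 1;
                for name in calls {
                    // A real Orchestrator would execute the tool here; this sketch
                    // only records a placeholder result in the transcript.
                    transcript.push(format!("tool: {} -> (result)", name));
                }
            }
        }
    }
    // Cap reached: append the summarize instruction and stop offering tools.
    transcript.push(format!("system: {}", SUMMARIZE_INSTRUCTION));
    match llm(&transcript, false) {
        LlmTurn::Final(text) => (text, rounds),
        // Assumption: a model that still asks for tools gets a fixed fallback reply.
        LlmTurn::ToolCalls(_) => ("Sorry, I could not finish this request.".to_string(), rounds),
    }
}
```

Passing a closure that always requests a tool while tools are offered shows the cap in action: the loop stops after five rounds and the final call is made with tools disabled.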

 ---

@@ -95,7 +97,8 @@ pub struct ToolContext {
    pub tenant_id: Uuid,
    pub user_id: Uuid,
    pub patient_id: Option<Uuid>,
-    pub db: DatabaseConnection,
+    pub db: DatabaseConnection,                    // erp-ai local tables (sessions/messages/logs)
+    pub health_provider: Arc<dyn HealthDataProvider>,  // erp-health data (already injected)
 }

 pub struct ToolResult {
@@ -106,33 +109,25 @@ pub struct ToolResult {

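 ### 3.2 Cross-Crate Data Access Architecture

-erp-ai does not depend on erp-health directly (preserving module boundaries). Data-query Tools access health data through the following mechanism:
+erp-ai does not depend on erp-health directly (preserving module boundaries). Data-query Tools access health data through the existing `HealthDataProvider` trait.

-**Approach: define a `HealthDataQuery` trait in erp-core, implemented by erp-health**
+**Existing `HealthDataProvider` trait (erp-core, already injected into AiState)**: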

 ```rust
-// Defined in erp-core
-#[async_trait]
-pub trait HealthDataQuery: Send + Sync {
-    async fn query_vitals(&self, tenant_id: Uuid, patient_id: Uuid, days: i32) -> Result<Vec<VitalSummary>>;
-    async fn query_lab_reports(&self, tenant_id: Uuid, patient_id: Uuid, limit: i32) -> Result<Vec<LabReportSummary>>;
-    async fn query_patient_profile(&self, tenant_id: Uuid, patient_id: Uuid) -> Result<PatientProfile>;
-    async fn query_appointments(&self, tenant_id: Uuid, patient_id: Uuid) -> Result<Vec<AppointmentSummary>>;
-    async fn query_medication(&self, tenant_id: Uuid, patient_id: Uuid) -> Result<Vec<MedicationSummary>>;
-}
+// Existing methods: reuse directly
+get_lab_report(tenant_id, report_id) → LabReportDto
+get_vital_signs(tenant_id, patient_id, metrics, range) → Vec<VitalSignDto>
+get_patient_summary(tenant_id, patient_id) → PatientSummaryDto
+get_trend_analysis_data(tenant_id, patient_id, metrics, range) → TrendAnalysisDto

-// Lightweight DTOs defined in erp-core (only the fields Tools need, not full Entities)
-pub struct VitalSummary { pub indicator_type: String, pub value: f64, pub unit: String, pub recorded_at: DateTime }
-pub struct LabReportSummary { pub id: Uuid, pub report_date: DateTime, pub items: Vec<LabItemSummary> }
-pub struct LabItemSummary { pub indicator_name: String, pub value: f64, pub unit: String, pub is_abnormal: bool }
-pub struct PatientProfile { pub name: String, pub age: i32, pub gender: String, pub conditions: Vec<String> }
-pub struct AppointmentSummary { pub id: Uuid, pub department: String, pub doctor_name: String, pub scheduled_at: DateTime, pub status: String }
-pub struct MedicationSummary { pub name: String, pub dosage: String, pub frequency: String }
+// New methods: Phase 0 extension
+get_appointments(tenant_id, patient_id) → Vec<AppointmentSummaryDto>
+get_medication_list(tenant_id, patient_id) → Vec<MedicationSummaryDto>
 ```

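-**Registration**: `AppState` holds an `Arc<dyn HealthDataQuery>`; the erp-health module injects the implementation when it registers. Agent Tools access it through `ToolContext`.
+**Advantage**: the existing DTOs are already PII-sanitized (PatientSummaryDto uses age_group/sex rather than name/ID number), so Tools need no extra sanitization.

-**For analysis capabilities already in the erp-ai module** (analysis_service, copilot_engine, etc.), no cross-crate access is needed; they are called directly inside erp-ai.
+**Registration**: `AiState` already holds an `Arc<dyn HealthDataProvider>` (state.rs:25); Tools access it through `ToolContext.health_provider`.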
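A rough sketch of how a Tool might reach health data through `ToolContext.health_provider` is shown below. It is simplified on purpose: the trait here is synchronous and uses `u64` IDs and `String` errors, whereas the real erp-core trait is presumably async and uses `Uuid`. `FakeProvider` and the free function `query_patient_profile` are hypothetical stand-ins, and only the `age_group`/`sex` fields named in the spec are modelled.

```rust
use std::sync::Arc;

/// Simplified, synchronous stand-in for the erp-core trait (assumption: the real one is async).
trait HealthDataProvider: Send + Sync {
    fn get_patient_summary(&self, tenant_id: u64, patient_id: u64)
        -> Result<PatientSummaryDto, String>;
}

/// Only the two PII-free fields mentioned in the spec are modelled here.
#[derive(Debug, Clone, PartialEq)]
struct PatientSummaryDto {
    age_group: String,
    sex: String,
}

/// Trimmed-down context: the Tool sees the provider only as a trait object.
struct ToolContext {
    tenant_id: u64,
    patient_id: Option<u64>,
    health_provider: Arc<dyn HealthDataProvider>,
}

/// Hypothetical in-memory implementation, standing in for the one erp-health would inject.
struct FakeProvider;

impl HealthDataProvider for FakeProvider {
    fn get_patient_summary(&self, _tenant_id: u64, _patient_id: u64)
        -> Result<PatientSummaryDto, String> {
        Ok(PatientSummaryDto { age_group: "60-69".into(), sex: "M".into() })
    }
}

/// Body of a hypothetical `query_patient_profile` Tool.
fn query_patient_profile(ctx: &ToolContext) -> Result<PatientSummaryDto, String> {
    // Without a bound patient there is nothing to query.
    let patient_id = ctx
        .patient_id
        .ok_or_else(|| "no patient bound to this session".to_string())?;
    ctx.health_provider.get_patient_summary(ctx.tenant_id, patient_id)
}
```

Because the Tool depends only on the trait object, a test can swap in `FakeProvider` without touching erp-health, which is the module boundary this section is protecting.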

 ### 3.3 DisplayHint 定义

@@ -155,22 +150,22 @@ pub enum DisplayHint {

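 ### 3.4 Tool Inventory

-#### Category 1: Data queries (read-only, fetch data from erp-health)
+#### Category 1: Data queries (read-only, accessed through HealthDataProvider)

 | Tool name | Function | Backed by existing capability |
 |-----------|------|-------------|
-| `query_patient_vitals` | Query the patient's recent vital signs (blood pressure/blood glucose/heart rate, etc.) | `health_indicator` entity |
-| `query_lab_reports` | Query the patient's recent lab reports and indicators | `health_lab_report` + `lab_report_item` |
-| `query_patient_profile` | Query the patient's basic information, medical history, and allergy history | `patient` entity |
-| `query_appointments` | Query the patient's appointment records | `appointment` entity |
-| `query_medication` | Query the patient's current medications | `medication` entity |
+| `query_patient_vitals` | Query the patient's recent vital signs (blood pressure/blood glucose/heart rate, etc.) | `get_vital_signs` |
+| `query_lab_reports` | Query the patient's recent lab reports and indicators | `get_lab_report` |
+| `query_patient_profile` | Query the patient's basic information, medical history, and allergy history | `get_patient_summary` |
+| `query_appointments` | Query the patient's appointment records | `get_appointments` (**new method**) |
+| `query_medication` | Query the patient's current medications | `get_medication_list` (**new method**) |

 #### Category 2: AI analysis triggers (call existing erp-ai capabilities)

 | Tool name | Function | Backed by existing capability |
 |-----------|------|-------------|
 | `analyze_lab_report` | Analyze a specified lab report and return an interpretation of abnormal indicators | `analysis_service` (non-streaming call) |
-| `analyze_health_trends` | Analyze trends in vital signs and identify abnormal patterns | `trend_analysis` |
+| `analyze_health_trends` | Analyze trends in vital signs and identify abnormal patterns | `get_trend_analysis_data` + `analysis_service` |
 | `get_health_insights` | Get the patient's current risk insights and AI recommendations | `copilot_engine` + `insight_service` |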

 #### Category 3: Knowledge and services (supporting conversation strategies)
@@ -181,6 +176,21 @@ pub enum DisplayHint {
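 | `recommend_services` | Recommend a department or service based on symptoms/needs | New; based on rules + knowledge base |
 | `check_alert_rules` | Check whether alert thresholds are triggered | `local_rules_engine` + `ai_risk_threshold` |

+#### Category 4: Actions (write operations, require higher privileges)
+
+| Tool name | Function | Backed by existing capability |
+|-----------|------|-------------|
+| `create_appointment` | Book an appointment for the user | `appointment_service` |
+| `transfer_to_human` | Transfer to a human agent/on-duty doctor | New; WebSocket notification |
+
+### 3.5 Permissions and Security
+
+- **Data-query Tools**: `tenant_id` + `patient_id` filters are injected automatically, so the LLM cannot bypass multi-tenant isolation
+- **Analysis-trigger Tools**: go through the existing quota control (`QuotaService`)
+- **Action Tools**: require an additional permission flag; the System Prompt constrains the LLM to call them only when the user explicitly asks
+- **Data sanitization**: DTOs returned by `HealthDataProvider` are already PII-sanitized (age_group/sex instead of real name/ID number), so no extra handling is needed at the Tool layer
+- **Audit log**: every Tool Call is recorded in the `ai_tool_call_logs` table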
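The automatic injection of `tenant_id` + `patient_id` described in the list above could look roughly like this. It is an illustrative sketch under assumptions: `Scope`, `VitalsQuery`, `build_vitals_query`, the `days` parameter, and its default of 7 are all hypothetical, and the LLM's arguments are modelled as a plain string map rather than JSON.

```rust
use std::collections::HashMap;

/// Identity taken from the authenticated session, never from the model (hypothetical type).
struct Scope {
    tenant_id: u64,
    patient_id: u64,
}

/// The query that actually reaches the data layer.
#[derive(Debug, PartialEq)]
struct VitalsQuery {
    tenant_id: u64,
    patient_id: u64,
    days: u32,
}

/// Builds the query from LLM-supplied arguments plus the server-side scope.
fn build_vitals_query(
    scope: &Scope,
    llm_args: &HashMap<String, String>,
) -> Result<VitalsQuery, String> {
    // Only whitelisted parameters are read from the model's arguments; any
    // `tenant_id` or `patient_id` the model tries to supply is simply ignored.
    let days = match llm_args.get("days") {
        Some(raw) => raw.parse::<u32>().map_err(|e| format!("invalid days: {}", e))?,
        None => 7, // assumed default window, not taken from the spec
    };
    Ok(VitalsQuery {
        tenant_id: scope.tenant_id,
        patient_id: scope.patient_id,
        days,
    })
}
```

The design point is that isolation does not depend on the System Prompt: even if the model emits a foreign `tenant_id`, the value never enters the query.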
+
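 ### 3.6 Permission Code Declarations

 The existing permission code `ai.chat.send` is kept for sending messages. The following permission codes are added: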
@@ -194,21 +204,6 @@ pub enum DisplayHint {

 Action Tools (`create_appointment`, `transfer_to_human`) do not declare separate permissions; the Agent decides internally based on the user's role.

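-#### Category 4: Actions (write operations, require higher privileges)
-
-| Tool name | Function | Backed by existing capability |
-|-----------|------|-------------|
-| `create_appointment` | Book an appointment for the user | `appointment_service` |
-| `transfer_to_human` | Transfer to a human agent/on-duty doctor | New; WebSocket notification |
-
-### 3.3 Permissions and Security
-
-- **Data-query Tools**: `tenant_id` + `patient_id` filters are injected automatically, so the LLM cannot bypass multi-tenant isolation
-- **Analysis-trigger Tools**: go through the existing quota control (`QuotaService`)
-- **Action Tools**: require an additional permission flag; the System Prompt constrains the LLM to call them only when the user explicitly asks
-- **Data sanitization**: all Tool return data is PII-sanitized at the Tool layer; PII is not passed to the LLM
-- **Audit log**: every Tool Call is recorded in the `ai_tool_call_logs` table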
-
 ---

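 ## 4. Multi-Strategy Conversation Flow
@@ -245,7 +240,7 @@ The Agent defines 5 strategy directions through the System Prompt; the LLM, based on what the user expresses

 3. [Service recommendation] When the user expresses a need for medical care or reports feeling unwell:
   - Call recommend_services to recommend a suitable department
-   - Call check_appointments to see available time slots
+   - Call query_appointments to see existing appointments
   - Proactively offer to book an appointment for the user

 4. [Risk alert] When the symptoms or data the user describes are abnormal: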
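@@ -290,7 +285,7 @@ The Agent defines 5 strategy directions through the System Prompt; the LLM, based on what the user expresses
 ### 4.4 Conversation Memory

 - **Short-term memory**: the full conversation history of the current session, persisted to the `ai_chat_messages` table
-- **Long-term memory**: a user profile summary (preferences, frequent questions, health concerns), loaded at the start of each new session
+- **Long-term memory**: a user profile summary (preferences, frequent questions, health concerns), stored in the `ai_user_profiles` table and loaded at the start of each new session
 - **Context window management**: history is truncated by importance, keeping the last 10 rounds + a summary of key context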

 ---
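@@ -302,20 +297,21 @@ The Agent defines 5 strategy directions through the System Prompt; the LLM, based on what the user expresses
 ```
 Existing capability            Agent integration
 ───────────────────            ─────────────────
-analysis_service (SSE)        →  Tool: analyze_lab_report / analyze_health_trends
-                                 Called in non-SSE mode; the result is returned directly to the Agent
-copilot_engine (risk scoring) →  Tool: get_health_insights
-                                 Calls scoring + rules and returns structured risk information
-knowledge (structured)        →  Tool: search_medical_knowledge
-                                 Queries KDIGO rules, department guides, and patient-education articles
-local_rules_engine            →  Tool: check_alert_rules
-                                 Evaluates whether current data triggers an alert
-quota_service                 →  Called internally by the Agent Orchestrator
-                                 Checks quota before each Tool Call round
-usage_service                 →  Called internally by the Agent Orchestrator
-                                 Records token consumption for each round
-cache_service                 →  Reused inside analysis Tools
-                                 Repeated analyses with identical parameters hit the cache
+HealthDataProvider trait       →  Tool data-query layer (already injected, 2 new methods)
+analysis_service (SSE)         →  Tool: analyze_lab_report / analyze_health_trends
+                                  Called in non-SSE mode; the result is returned directly to the Agent
+copilot_engine (risk scoring)  →  Tool: get_health_insights
+                                  Calls scoring + rules and returns structured risk information
+knowledge (structured)         →  Tool: search_medical_knowledge
+                                  Queries KDIGO rules, department guides, and patient-education articles
+local_rules_engine             →  Tool: check_alert_rules
+                                  Evaluates whether current data triggers an alert
+quota_service                  →  Called internally by the Agent Orchestrator
+                                  Checks quota before each Tool Call round
+usage_service                  →  Called internally by the Agent Orchestrator
+                                  Records token consumption for each round
+cache_service                  →  Reused inside analysis Tools
+                                  Repeated analyses with identical parameters hit the cache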
 ```

 ### 5.2 New Data Models
@@ -365,6 +361,8 @@ CREATE TABLE ai_chat_messages (

 #### ai_tool_call_logs: AI tool call log

+> This table is an append-only log; rows are never updated after being written, so `updated_at`/`updated_by`/`version`/`deleted_at` are omitted.
+
 ```sql
 CREATE TABLE ai_tool_call_logs (
  id UUID PRIMARY KEY,
@@ -376,12 +374,15 @@ CREATE TABLE ai_tool_call_logs (
  result_summary TEXT,
  execution_ms INTEGER,
  success BOOLEAN NOT NULL,
-  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
+  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
+  created_by UUID
 );
 ```

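 #### ai_user_profiles: long-term user profile (long-term memory)

+> `created_by`/`updated_by` are omitted because this table is maintained automatically by the Agent rather than created manually by users.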
+
 ```sql
 CREATE TABLE ai_user_profiles (
  id UUID PRIMARY KEY,
@@ -404,45 +405,56 @@ CREATE TABLE ai_user_profiles (

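 ### 5.3 PII Sanitization Rules

-Data returned by Tools must be sanitized before it is passed to the LLM. The rules are:
+DTOs returned by `HealthDataProvider` are already PII-sanitized (the comment on line 8 states that returned DTOs have PII stripped), so no extra handling is needed at the Tool layer.
+
+Any supplementary data that may appear in Tools goes through the `sanitize_for_llm()` function:

 | Field type | Sanitization | Example |
 |----------|----------|------|
 | Patient name | Keep surname + form of address | "Grandpa Zhang" (the Agent uses a form of address, not the real name) |
-| ID number | Not passed to the LLM | Filtered at the Tool layer; never appears in ToolResult |
+| ID number | Not passed to the LLM | Filtered at the Tool layer |
 | Phone number | Not passed to the LLM | Same as above |
-| Home address | Not passed to the LLM | Same as above |
 | Date of birth | Converted to age | "68 years old" |
 | Medical data | Passed through unchanged | Blood pressure values, lab indicators, etc. are not sanitized |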
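The table above can be read as a field-by-field mapping. The sketch below is one possible shape for it, assuming hypothetical `RawPatient` and `LlmSafePatient` types; the spec names `sanitize_for_llm()` but does not give its signature, so the parameters here are illustrative. Age is derived from the birth year only, which is an approximation.

```rust
/// Hypothetical raw record as a Tool might receive it from a supplementary source.
struct RawPatient {
    surname: String,
    given_name: String,
    id_number: String,
    phone: String,
    birth_year: i32,
    systolic_bp: u32,
}

/// What is allowed to reach the LLM: a form of address, an age, and medical values.
#[derive(Debug, PartialEq)]
struct LlmSafePatient {
    display_name: String,
    age: i32,
    systolic_bp: u32,
}

/// Applies the rules in the table: the given name, ID number, and phone are never
/// copied; the birth year becomes an age; medical data passes through unchanged.
fn sanitize_for_llm(raw: &RawPatient, current_year: i32, honorific: &str) -> LlmSafePatient {
    LlmSafePatient {
        // Surname plus a form of address, e.g. "Grandpa Zhang".
        display_name: format!("{} {}", honorific, raw.surname),
        // Approximate age; a real implementation would use the full birth date.
        age: current_year - raw.birth_year,
        systolic_bp: raw.systolic_bp,
    }
}
```

Building a separate output type, rather than blanking fields in place, means a newly added sensitive field is excluded by default until someone deliberately copies it.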

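-Sanitization is handled uniformly in the `AgentTool::execute()` implementation; every Tool's return value must pass through the `sanitize_for_llm()` function.
+### 5.4 Provider Function Calling Adaptation

-### 5.3 Provider Function Calling Adaptation
-
-The `GenerateRequest` of the existing `AiProvider` trait does not support tools/functions parameters and needs to be extended:
+The `generate()` method of the existing `AiProvider` trait stays unchanged (existing analysis endpoints keep using it). A new `generate_with_tools()` method is added: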

 ```rust
-// Extend GenerateRequest
-pub struct GenerateRequest {
-    pub system_prompt: String,
-    pub messages: Vec<ChatMessage>,       // changed from a single user_prompt to multi-turn messages
-    pub model: Option<String>,
-    pub temperature: Option<f32>,
-    pub max_tokens: Option<u32>,
-    pub tools: Option<Vec<ToolDefinition>>,  // new
+#[async_trait]
+pub trait AiProvider: Send + Sync {
+    // Kept: existing analysis endpoints keep using these
+    async fn stream_generate(&self, req: GenerateRequest)
+        -> AiResult<Pin<Box<dyn Stream<Item = AiResult<String>> + Send>>>;
+    async fn generate(&self, req: GenerateRequest)
+        -> AiResult<GenerateResponse>;
+    fn name(&self) -> &str;
+    async fn health_check(&self) -> AiResult<bool>;
+
+    // New: Agent-only, supports Function Calling
+    async fn generate_with_tools(
+        &self,
+        messages: Vec<ChatMessage>,
+        tools: Vec<ToolDefinition>,
+        options: GenerateOptions,
+    ) -> AiResult<AgentGenerateResponse> {
+        // Default implementation: Providers without FC support return an error
+        Err(AiError::UnsupportedOperation("Function Calling not supported".into()))
+    }
 }

 pub struct ChatMessage {
    pub role: MessageRole,  // User / Assistant / Tool
    pub content: String,
-    pub tool_calls: Option<Vec<ToolCall>>,  // tool calls in an assistant message
-    pub tool_call_id: Option<String>,       // correlation ID of a tool message
+    pub tool_calls: Option<Vec<ToolCall>>,
+    pub tool_call_id: Option<String>,
 }

 pub struct ToolDefinition {
    pub name: String,
    pub description: String,
-    pub parameters: serde_json::Value,  // JSON Schema
+    pub parameters: serde_json::Value,
 }

 pub struct ToolCall {
@@ -451,20 +463,19 @@ pub struct ToolCall {
    pub arguments: serde_json::Value,
 }

-// GenerateResponse extension
-pub struct GenerateResponse {
+pub struct AgentGenerateResponse {
    pub content: Option<String>,
-    pub tool_calls: Option<Vec<ToolCall>>,  // tool calls returned by the LLM
+    pub tool_calls: Option<Vec<ToolCall>>,
    pub usage: Option<TokenUsage>,
 }
 ```

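 **Adaptation effort per Provider**:
-- **Claude**: the Anthropic API uses `tool_use`/`tool_result` content blocks, so message building and response parsing need refactoring (1 day)
+- **Claude**: the Anthropic API uses `tool_use`/`tool_result` content blocks (1 day)
 - **OpenAI**: uses `function` or `tool` type messages, relatively standard (0.5 day)
-- **Ollama**: Function Calling support depends on the model; if unsupported, fall back to plain-text Prompt mode (0.5 day)
+- **Ollama**: if the model does not support FC, return `UnsupportedOperation` and the Orchestrator falls back to pure Prompt mode (0.5 day)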
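The Ollama fallback in the list above might be wired roughly as follows. This is a sketch under assumptions, not the spec's Orchestrator: `Plan`, `plan_after_probe`, and the "Available tools" prompt layout are hypothetical, `AiError` is reduced to two variants, and the outcome of the `generate_with_tools` call is modelled as a plain `Result<(), AiError>` so the example stays synchronous.

```rust
/// Reduced error type; only `UnsupportedOperation` is taken from the spec.
#[derive(Debug, PartialEq)]
enum AiError {
    UnsupportedOperation(String),
    Other(String),
}

/// Trimmed-down tool definition (the spec's version also carries a JSON Schema).
struct ToolDefinition {
    name: String,
    description: String,
}

/// How the Orchestrator will talk to the Provider for the rest of the turn (hypothetical).
#[derive(Debug, PartialEq)]
enum Plan {
    /// The Provider accepted tool definitions natively.
    NativeToolCalls,
    /// Tool descriptions are injected into the System Prompt instead.
    PromptOnly { system_prompt: String },
}

/// Chooses a plan based on the outcome of a `generate_with_tools` attempt.
fn plan_after_probe(
    probe: Result<(), AiError>,
    base_prompt: &str,
    tools: &[ToolDefinition],
) -> Result<Plan, AiError> {
    match probe {
        Ok(()) => Ok(Plan::NativeToolCalls),
        Err(AiError::UnsupportedOperation(_)) => {
            // Fallback: describe each tool in plain text inside the System Prompt.
            let mut prompt = String::from(base_prompt);
            prompt.push_str("\n\nAvailable tools:\n");
            for tool in tools {
                prompt.push_str(&format!("- {}: {}\n", tool.name, tool.description));
            }
            Ok(Plan::PromptOnly { system_prompt: prompt })
        }
        // Any other failure is a real error and is not masked by the fallback.
        Err(other) => Err(other),
    }
}
```

Matching only on `UnsupportedOperation` keeps outages and quota errors visible instead of silently degrading every failure to prompt mode.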

-### 5.4 API Design
+### 5.5 API Design

 ```
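 POST   /api/v1/ai/chat/sessions                  - Create session
@@ -475,7 +486,7 @@ POST   /api/v1/ai/chat/sessions/{id}/messages    - Send message (triggers Agent
 GET    /api/v1/ai/chat/sessions/{id}/messages    - Message history
 ```

-The send-message endpoint supports two modes: SSE streaming output (the Agent's final reply) and a JSON response.
+The send-message endpoint supports two modes: SSE streaming output (only the Agent's final reply) and a JSON response.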

 ---

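@@ -484,9 +495,9 @@ GET    /api/v1/ai/chat/sessions/{id}/messages    - Message history
 ### 6.1 Message Type Extensions

 - **Text message**: normal conversation content
-- **Data card**: "Here is your recent blood pressure trend" + a small chart
-- **Action confirmation**: "I have booked cardiology for Wednesday morning for you. Confirm?" + confirm/cancel buttons
-- **Transfer notice**: "Transferring you to the on-duty doctor..."
+- **Data card**: "Here is your recent blood pressure trend" + a small chart (triggered via `DisplayHint::VitalCard`)
+- **Action confirmation**: "I have booked cardiology for Wednesday morning for you. Confirm?" + confirm/cancel buttons (triggered via `DisplayHint::ActionConfirm`)
+- **Transfer notice**: "Transferring you to the on-duty doctor..." (triggered via `DisplayHint::RiskAlert`)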

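 ### 6.2 Session Management

@@ -494,9 +505,16 @@ GET    /api/v1/ai/chat/sessions/{id}/messages    - Message history
 - New session / continue session
 - History migrates from local Storage to DB persistence

-### 6.3 Mini Program + Web Sync
+### 6.3 Mini Program + Web Implementation

-Both frontends reuse the same API module; each adapts its UI to its own platform conventions.
+**Mini Program**:
+- SSE compatibility: Taro does not support SSE natively; use a `requestTask` long-lived connection or fall back to polling (Phase 2 explicitly allocates 1 day for this)
+- Rich message rendering: dispatch to different rendering components based on the `DisplayHint` type
+- Legacy data migration: local Storage data read by `getLocalHistory()` in `ai-chat.ts` is uploaded to the DB in one step the first time the new version is opened
+
+**Web**:
+- The AI customer-service page is built from scratch (no existing UI), including session list + chat UI + rich message rendering
+- Reuses the Mini Program's API module (`services/ai-chat.ts`), with the UI adapted to Ant Design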
 ---

@@ -508,12 +526,11 @@ GET    /api/v1/ai/chat/sessions/{id}/messages    - Message history

 | Task | Effort |
 |------|--------|
-| Extend `GenerateRequest` + `GenerateResponse` (support tools/functions) | 1 day |
-| Claude Provider Function Calling adaptation (message building + response parsing) | 1 day |
+| Add `generate_with_tools` to the `AiProvider` trait + Claude Provider adaptation | 1 day |
 | OpenAI + Ollama Provider adaptation | 1 day |
 | `AgentTool` trait + `ToolRegistry` + `ToolContext` + `DisplayHint` | 0.5 day |
-| `AgentOrchestrator` ReAct loop | 0.5 day |
-| erp-core `HealthDataQuery` trait definition + erp-health implementation | 1 day |
+| `AgentOrchestrator` ReAct loop (including forced termination at the 5-round cap) | 1 day |
+| Extend the `HealthDataProvider` trait: add `get_appointments`/`get_medication_list` | 1 day |
 | Database migration: `ai_chat_sessions` + `ai_chat_messages` + `ai_tool_call_logs` + `ai_user_profiles` | 0.5 day |
 | Implement 1 Tool: `query_patient_vitals` (validates the end-to-end path) | 0.5 day |
 | Rework `chat_handler`: wire in the Orchestrator, replacing the original simple logic | 0.5 day |
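@@ -544,7 +561,7 @@ GET    /api/v1/ai/chat/sessions/{id}/messages    - Message history
 |------|--------|
 | Backend: session CRUD API (create/list/message history) | 1 day |
 | Backend: Agent final reply delivered over SSE streaming | 1 day |
-| Mini Program: SSE compatibility layer (Taro does not support SSE natively; adapt with `requestTask` or polling) | 1 day |
+| Mini Program: SSE compatibility layer (Taro `requestTask` or polling adaptation) | 1 day |
 | Mini Program: session list page + message history page + rich message rendering | 2 days |
 | Web: AI customer-service page built from scratch (session list + chat UI + rich messages) | 2 days |
 | Data card rendering (small vital-sign trend charts) | 1 day |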
@@ -582,30 +599,25 @@ Phase 3 ██████████ (3-5 days)

 ---

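-## 9. Failure Handling and Degradation
+## 8. Failure Handling and Degradation

 | Failure scenario | What the user sees | Handling |
 |----------|-------------|----------|
 | All Providers unavailable | "Xiaohua cannot reply right now, please try again later" | Return a fixed fallback message and record it in usage_service |
 | Agent loop timeout (60s) | The partial reply generated so far + "The reply was interrupted, please ask again" | SSE stream cut + timeout log |
+| Agent reaches the 5-round cap | A normal reply (the Orchestrator appends a "please summarize your reply based on the information you already have" instruction to force the LLM to finish) | Invisible to the user; the reply may be less complete |
 | Single Tool execution timeout (10s) | The Agent skips that Tool and continues reasoning | ToolResult returns an error summary; the Agent can choose another path |
-| Ollama does not support Function Calling | Automatic fallback to plain-text Prompt mode | The Provider layer detects the capability; without Function Calling, Tool descriptions are injected into the System Prompt |
+| Ollama does not support Function Calling | Automatic fallback to plain-text Prompt mode | The Provider layer returns UnsupportedOperation; the Orchestrator injects Tool descriptions into the System Prompt |
 | LLM returns an invalid Tool Call | "Sorry, I made a mistake in my reasoning just now, please say that again" | The Orchestrator catches the parse error and returns a retry prompt |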

 ---

-## 10. Risks and Mitigations
-
-Each Phase ends with a demonstrable deliverable.
-
---
-
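-## 8. Risks and Mitigations
+## 9. Risks and Mitigations

 | Risk | Probability | Impact | Mitigation |
 |------|------|------|----------|
-| Function Calling formats differ across Providers | Medium | High | Validate on all 3 Providers in Phase 0 |
+| Function Calling formats differ across Providers | Medium | High | Validate on all 3 Providers in Phase 0; the Ollama fallback is already designed |
 | LLM hallucination (fabricated data/incorrect diagnosis) | High | Severe | Strong System Prompt constraints + fact-checking against Tool-returned data + disclaimer |
 | Token cost exceeds expectations | Medium | Medium | Per-round quota check + cache reuse + 5-round cap |
 | Tool execution timeout | Low | Medium | 10s timeout per Tool, 60s timeout for the whole loop |
-| PII leaked to the LLM | Low | Severe | Sanitize at the Tool layer; sensitive fields are not passed to the Provider |
+| PII leaked to the LLM | Low | Severe | HealthDataProvider DTOs are already sanitized; the Tool layer adds sanitize_for_llm() |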