zclaw_openfang/docs/features/04-skills-ecosystem/01-intelligent-routing.md

# Intelligent Skill Routing System

> **Design goal**: Let ZCLAW understand user intent and automatically select and invoke the right skill, instead of relying on hardcoded trigger words.

---

## 1. Problem Analysis

### 1.1 Problems with the Current Approach

```
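User: "Look up Tencent's earnings report"
       ↓
Hardcoded trigger match: "earnings report" ∈ triggers?
       ↓
❌ If triggers does not contain "earnings report", the skill is never invoked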
```

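**Problems**:
1. **Cannot cover every phrasing** - users may say "financial data", "profitability", "revenue report"...
2. **High maintenance cost** - every skill needs its own trigger list kept up to date
3. **No semantic understanding** - cannot tell that "help me analyze how well this company makes money" is also a financial analysis request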
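
The limitation is easy to reproduce with a minimal sketch of trigger matching. `matches_trigger` and `FINANCE_TRIGGERS` are hypothetical names used only for illustration; the sketch assumes matching is a case-insensitive substring check against the `triggers` list, which may differ from ZCLAW's actual matcher.

```rust
/// Hypothetical stand-in for hardcoded trigger matching:
/// a skill fires only if the query contains one of its triggers verbatim.
fn matches_trigger(query: &str, triggers: &[&str]) -> bool {
    let query = query.to_lowercase();
    triggers.iter().any(|t| query.contains(&t.to_lowercase()))
}

/// Triggers as they might be declared for `finance-tracker` (illustrative values).
const FINANCE_TRIGGERS: [&str; 2] = ["earnings report", "financial analysis"];
```

A query that repeats a trigger word matches, while a paraphrase with the same intent ("How much money did Tencent make recently?") does not, which is the gap semantic routing is meant to close.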

### 1.2 Design Goals

```
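User: "Help me analyze how much money Tencent has made recently"
       ↓
Semantic understanding: intent = financial analysis, entity = Tencent, metric = profit
       ↓
Intelligent routing: best-matching skill = finance-tracker
       ↓
✅ Automatically calls execute_skill("finance-tracker", {company: "Tencent", metrics: ["profit"]})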
```

---

## 2. Intelligent Routing Architecture

### 2.1 Three-Layer Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                     LLM Orchestrator                             │
│  - Understands user intent                                      │
│  - Decides whether a skill is needed                            │
│  - Selects the best skill                                       │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                   Semantic Skill Router                          │
│  - Embeds skill descriptions                                    │
│  - Matches queries to skills semantically                       │
│  - Retrieves Top-K candidates                                   │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                     Skill Registry                               │
│  - Metadata for 77 skills                                       │
│  - Descriptions, capabilities, examples                         │
│  - Vector index                                                 │
└─────────────────────────────────────────────────────────────────┘
```

### 2.2 Routing Flow

```
User message
    │
    ▼
┌─────────────────────┐
│ 1. Intent check     │ ──→ Is a skill needed?
│    (LLM judgment)   │     ├─ No → direct chat
└─────────────────────┘     └─ Yes ↓
                                  │
                                  ▼
                    ┌─────────────────────┐
                    │ 2. Semantic search  │ ──→ Top-3 candidate skills
                    │    (Embedding)      │     (by description similarity)
                    └─────────────────────┘
                                  │
                                  ▼
                    ┌─────────────────────┐
                    │ 3. Fine selection   │ ──→ Best skill + parameters
                    │    (LLM decision)   │     (considers context, dependencies)
                    └─────────────────────┘
                                  │
                                  ▼
                    ┌─────────────────────┐
                    │ 4. Skill execution  │ ──→ Execution result
                    │    (execute_skill)  │
                    └─────────────────────┘
                                  │
                                  ▼
                              Final response
```
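
The flow above can be sketched as a small function that wires the steps together. This is an illustration under stated assumptions, not the kernel's implementation: `RouteOutcome` and `route_message` are hypothetical names, each step is passed in as a closure so the sketch does not depend on any concrete LLM or embedding backend, and falling back to direct chat when no candidate is selected is an assumed policy.

```rust
/// Outcome of the routing flow (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum RouteOutcome {
    /// Step 1 decided that no skill is needed.
    DirectChat,
    /// Steps 2-3 picked a skill; step 4 would execute it.
    UseSkill { skill_id: String },
}

/// Wires the routing steps together.
fn route_message(
    message: &str,
    needs_skill: impl Fn(&str) -> bool,                      // step 1: LLM judgment
    retrieve_top3: impl Fn(&str) -> Vec<String>,             // step 2: embedding search
    select_best: impl Fn(&str, &[String]) -> Option<String>, // step 3: LLM decision
) -> RouteOutcome {
    if !needs_skill(message) {
        return RouteOutcome::DirectChat;
    }
    let candidates = retrieve_top3(message);
    match select_best(message, &candidates) {
        Some(skill_id) => RouteOutcome::UseSkill { skill_id },
        // Assumed policy: no suitable candidate means a plain conversational reply.
        None => RouteOutcome::DirectChat,
    }
}
```

Keeping the steps as separate inputs makes each one replaceable and testable on its own, for example by swapping the embedding search for a stub.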

---

## 3. Core Component Design

### 3.1 Rich Skill Descriptions

**Problem**: Current skill descriptions are too thin to support semantic matching.

```yaml
# Current (too sparse)
name: finance-tracker
description: "Financial tracking expert"
triggers: ["earnings report", "financial analysis"]
```

**Improvement**: Add semantically rich descriptions that state capabilities, applicable scenarios, and boundaries.

```yaml
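# Improved
name: finance-tracker
description: |
  Financial tracking expert - focused on corporate financial data analysis, earnings report interpretation, and profitability assessment.

  Core capabilities:
  - Financial statement analysis (balance sheet, income statement, cash flow statement)
  - Profitability metrics (gross margin, net margin, ROE, ROA)
  - Revenue growth analysis (year-over-year, period-over-period, compound growth rate)
  - Financial health assessment (liquidity, solvency, operating efficiency)

  Applicable scenarios:
  - The user asks about a company's earnings, revenue, or profit
  - Financial data or earnings report data needs to be analyzed
  - Investment analysis, valuation
  - Financial risk assessment

  Not applicable:
  - Real-time stock quotes → use market-data
  - Industry analysis → use industry-analyst
  - News → use news-collector

examples:
  - "How much did Tencent earn last year?"
  - "Analyze Apple's financial position"
  - "Take a look at this earnings report for me"
  - "How profitable is this company?"
  - "Compare the revenue of Alibaba and JD.com"

capabilities:
  - financial_analysis
  - report_generation
  - data_visualization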
```
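
One way to make use of the richer manifest is to fold the description, example queries, and capabilities into a single text before embedding it. The sketch below assumes a simplified `SkillDoc` struct and an `embedding_text` helper, both hypothetical; the real `SkillManifest` may carry different fields, and the exact layout of the embedded text is a design choice, not something the document prescribes.

```rust
/// Simplified, hypothetical view of a skill manifest.
struct SkillDoc {
    id: String,
    description: String,
    examples: Vec<String>,
    capabilities: Vec<String>,
}

/// Builds the text that would be embedded for semantic matching.
/// Including the example queries gives the vector a chance to sit
/// close to the phrasings users actually type.
fn embedding_text(skill: &SkillDoc) -> String {
    let mut sections = vec![skill.id.clone(), skill.description.trim().to_string()];
    if !skill.examples.is_empty() {
        sections.push(format!("Example queries: {}", skill.examples.join(" | ")));
    }
    if !skill.capabilities.is_empty() {
        sections.push(format!("Capabilities: {}", skill.capabilities.join(", ")));
    }
    sections.join("\n")
}
```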

### 3.2 Semantic Router Implementation

```rust
// crates/zclaw-kernel/src/skill_router.rs

use std::sync::Arc;
use serde::{Deserialize, Serialize};

/// Result of routing a query to a skill.
#[derive(Debug, Clone)]
pub struct RoutingResult {
    pub skill_id: String,
    pub confidence: f32,
    pub parameters: serde_json::Value,
    pub reasoning: String,
}

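/// Semantic skill router.
pub struct SemanticSkillRouter {
    skills: Arc<SkillRegistry>,
    embedder: Box<dyn Embedder>,
    /// LLM client used by `llm_select_skill`; `LlmClient` is an assumed trait name.
    llm: Arc<dyn LlmClient>,
    /// Precomputed (skill id, embedding) pairs.
    skill_embeddings: Vec<(String, Vec<f32>)>,
}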

impl SemanticSkillRouter {
    /// Retrieves the Top-K candidate skills for a query.
    pub async fn retrieve_candidates(&self, query: &str, top_k: usize) -> Vec<(SkillManifest, f32)> {
        // 1. Embed the query
        let query_embedding = self.embedder.embed(query).await;

        // 2. Score every skill by cosine similarity to the query
        let mut scores: Vec<_> = self.skill_embeddings
            .iter()
            .map(|(skill_id, embedding)| {
                let similarity = cosine_similarity(&query_embedding, embedding);
                (skill_id.clone(), similarity)
            })
            .collect();

        // 3. Sort by descending score and keep the Top-K.
        //    `total_cmp` cannot panic on NaN, unlike `partial_cmp(..).unwrap()`.
        scores.sort_by(|a, b| b.1.total_cmp(&a.1));
        scores.truncate(top_k);

        // 4. Resolve ids to skill metadata, skipping ids missing from the registry
        scores.into_iter()
            .filter_map(|(id, score)| {
                self.skills.get(&id).map(|s| (s, score))
            })
            .collect()
    }

    /// Intelligent routing: combines semantic retrieval with an LLM decision.
    pub async fn route(&self, query: &str, context: &ConversationContext) -> Option<RoutingResult> {
        // Step 1: retrieve the Top-3 candidates by semantic similarity
        let candidates = self.retrieve_candidates(query, 3).await;

        if candidates.is_empty() {
            return None;
        }

        // Step 2: if the best score clears the threshold, return it without calling the LLM
        if candidates[0].1 > 0.85 {
            let (skill, _) = &candidates[0];
            return Some(RoutingResult {
                skill_id: skill.id.to_string(),
                confidence: candidates[0].1,
                parameters: extract_parameters(query, &skill.id),
                reasoning: format!("High semantic match ({}%)", (candidates[0].1 * 100.0) as i32),
            });
        }

        // Step 3: otherwise let the LLM make the fine-grained choice
        self.llm_select_skill(query, candidates, context).await
    }

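    /// Fine-grained selection by the LLM.
    async fn llm_select_skill(
        &self,
        query: &str,
        candidates: Vec<(SkillManifest, f32)>,
        context: &ConversationContext,
    ) -> Option<RoutingResult> {
        let prompt = self.build_selection_prompt(query, &candidates, context);

        // Ask the LLM to choose. `?` here requires `complete` to return an `Option`;
        // if it returns a `Result`, convert with `.ok()?` instead.
        let response = self.llm.complete(&prompt).await?;

        // Parse the LLM response into a RoutingResult
        parse_llm_routing_response(&response, candidates)
    }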

    fn build_selection_prompt(
        &self,
        query: &str,
        candidates: &[(SkillManifest, f32)],
        context: &ConversationContext,
    ) -> String {
        format!(
            r#"You are a skill router. Analyze the user query and select the best skill to handle it.

## User Query
{}

## Conversation Context
{}

## Candidate Skills
{}

## Instructions
1. Analyze the user's intent and required capabilities
2. Select the MOST appropriate skill from the candidates
3. Extract any parameters mentioned in the query
4. If no skill is appropriate, set "selected_skill" to null

## Response Format (JSON)
{{
  "selected_skill": "skill_id or null",
  "confidence": 0.0-1.0,
  "parameters": {{}},
  "reasoning": "Brief explanation"
}}
"#,
            query,
            context.summary(),
            candidates.iter()
                .map(|(s, score)| format!("- {} ({}%): {}", s.id, (score * 100.0) as i32, s.description))
                .collect::<Vec<_>>()
                .join("\n")
        )
    }
}

fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b + 1e-10)
}
```
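
To make the retrieval step concrete, here is a self-contained worked example of Top-K ranking by cosine similarity. The vectors in `toy_index` are made-up 3-dimensional numbers, not output from any embedding model, and `cosine` and `top_k` are standalone helpers written for this sketch that do not touch the router's types.

```rust
/// Cosine similarity with a small epsilon to avoid division by zero.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb + 1e-10)
}

/// Returns the `k` skills most similar to `query`, best first.
fn top_k<'a>(query: &[f32], index: &'a [(&'a str, Vec<f32>)], k: usize) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = index
        .iter()
        .map(|(id, v)| (*id, cosine(query, v)))
        .collect();
    // total_cmp gives a total order on f32, so a NaN score cannot panic the sort.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}

/// Toy 3-dimensional "embeddings" (made-up numbers, not model output).
fn toy_index() -> Vec<(&'static str, Vec<f32>)> {
    vec![
        ("finance-tracker", vec![0.9, 0.1, 0.0]),
        ("market-data", vec![0.6, 0.8, 0.0]),
        ("news-collector", vec![0.0, 0.2, 0.9]),
    ]
}
```

With the query vector `[1.0, 0.0, 0.0]`, `finance-tracker` scores about 0.99 and would take the fast path above the 0.85 threshold, while `market-data` scores 0.6 and would be left to the LLM selection step if it were the best candidate.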

### 3.3 System Prompt Enhancement

```rust
// In kernel.rs

/// Builds a skill-aware system prompt.
fn build_skill_aware_system_prompt(&self, base_prompt: Option<&String>) -> String {
    let mut prompt = base_prompt
        .cloned()
        .unwrap_or_else(|| "You are ZCLAW, an intelligent AI assistant.".to_string());

    prompt.push_str("\n\n## Your Capabilities\n\n");
    prompt.push_str("You have access to specialized skills. Use the `execute_skill` tool when:\n");
    prompt.push_str("- The user's request matches a skill's domain\n");
    prompt.push_str("- You need specialized expertise for a task\n");
    prompt.push_str("- The task would benefit from a structured workflow\n\n");

    prompt.push_str("**Important**: You should autonomously decide when to use skills based on your understanding of the user's intent. ");
    prompt.push_str("Do not wait for explicit skill names - recognize the need and act.\n\n");

    prompt.push_str("## Available Skills\n\n");

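    // Inject a skill summary rather than the full list, to save tokens.
    // Note: `block_on` blocks the current thread; if this function can be called
    // from inside an async runtime, make it `async` and `.await` the list instead.
    let skills = futures::executor::block_on(self.skills.list());
    for skill in skills.iter().take(20) {  // first 20 in list order (not ranked by relevance)
        // Keep at most 100 characters, cutting on a char boundary. `nth(100)` yields
        // `None` for shorter descriptions, so they are kept whole.
        let end = skill.description
            .char_indices()
            .nth(100)
            .map(|(i, _)| i)
            .unwrap_or(skill.description.len());
        prompt.push_str(&format!(
            "- **{}**: {}\n",
            skill.id.as_str(),
            &skill.description[..end]
        ));
    }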

    if skills.len() > 20 {
        prompt.push_str(&format!("\n... and {} more skills available.\n", skills.len() - 20));
    }

    prompt
}
```
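
Truncating descriptions by character count needs some care with multi-byte text such as Chinese, because slicing a `&str` at an arbitrary byte offset panics. A minimal sketch of a reusable helper follows; `truncate_chars` is a hypothetical name, not an existing function in the kernel.

```rust
/// Returns at most `max_chars` characters of `s`, cutting only on a
/// UTF-8 character boundary so multi-byte text cannot cause a panic.
fn truncate_chars(s: &str, max_chars: usize) -> &str {
    match s.char_indices().nth(max_chars) {
        // Byte offset of the first character that should be dropped.
        Some((byte_idx, _)) => &s[..byte_idx],
        // The string has `max_chars` characters or fewer: keep all of it.
        None => s,
    }
}
```

Using `nth(max_chars)` rather than taking the last of the first `max_chars` indices matters for short inputs: a string shorter than the limit is returned whole instead of losing its final character.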

---

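## 4. Implementation Plan

### Phase 1: Foundation (current)

- [x] Inject the skill list into the system prompt
- [x] Add a `triggers` field to SkillManifest
- [x] Update the SKILL.md parser

### Phase 2: Semantic Routing

1. **Integrate an embedding model**
   - Use a local model (such as `all-MiniLM-L6-v2`)
   - Or call an LLM API to obtain embeddings

2. **Build the skill vector index**
   - Precompute embeddings for all skill descriptions at startup
   - Support incremental updates

3. **Implement the Hybrid Router**
   - Semantic retrieval of Top-K candidates
   - Fine-grained selection by the LLM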
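
The vector index from item 2 (precompute at startup, update incrementally) could be sketched as an in-memory map that re-embeds a skill only when its text changes. `SkillIndex`, `text_hash`, and `fake_embed` are hypothetical names for illustration, and `fake_embed` is a placeholder rather than a real model.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// In-memory vector index keyed by skill id (hypothetical type).
struct SkillIndex {
    /// skill id -> (hash of the embedded text, embedding vector)
    entries: HashMap<String, (u64, Vec<f32>)>,
}

/// Hash used only to detect changes within one process; `DefaultHasher`
/// output is not guaranteed stable across Rust versions, so do not persist it.
fn text_hash(text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    hasher.finish()
}

impl SkillIndex {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Inserts or refreshes one skill. `embed` is only called when the text
    /// is new or has changed, which is what makes the update incremental.
    /// Returns true if an embedding was (re)computed.
    fn upsert(&mut self, id: &str, text: &str, embed: impl Fn(&str) -> Vec<f32>) -> bool {
        let hash = text_hash(text);
        if let Some((old_hash, _)) = self.entries.get(id) {
            if *old_hash == hash {
                return false;
            }
        }
        self.entries.insert(id.to_string(), (hash, embed(text)));
        true
    }

    /// Drops a skill that was uninstalled or disabled.
    fn remove(&mut self, id: &str) -> bool {
        self.entries.remove(id).is_some()
    }
}

/// Placeholder embedder for the sketch: NOT a real model.
fn fake_embed(text: &str) -> Vec<f32> {
    vec![text.len() as f32]
}
```

At startup the same `upsert` call can be run over every skill; later edits to a single skill then cost one embedding call instead of a full rebuild.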

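### Phase 3: Intelligent Orchestration

1. **Multi-skill coordination**
   - Identify tasks that need more than one skill
   - Automatically plan the execution order

2. **Context awareness**
   - Adjust skill selection based on conversation history
   - Remember user preferences

3. **Self-learning**
   - Record user feedback
   - Optimize the routing strategy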

---

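## 5. Technology Choices

### 5.1 Embedding Model

| Option | Pros | Cons |
|------|------|------|
| **Local `all-MiniLM-L6-v2`** | Fast, offline, free | Requires extra dependencies |
| **LLM API Embedding** | High quality | Requires network access, incurs cost |
| **OpenAI text-embedding-3-small** | High quality, multilingual | Paid |

**Recommendation**: Use the LLM provider's embedding API if it offers one; otherwise fall back to a local model.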
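
The recommendation can be sketched as a one-time choice made at startup. `EmbedderChoice`, `choose_embedder`, and `index_is_reusable` are hypothetical names; the sketch assumes the provider's capability is known up front as an optional model name. Vectors produced by different models are not comparable, so the sketch also records which model built the index.

```rust
/// Which embedding backend is in use (hypothetical type for illustration).
#[derive(Debug, Clone, PartialEq)]
enum EmbedderChoice {
    /// The LLM provider exposes an embedding endpoint.
    ProviderApi { model: String },
    /// Local fallback model.
    Local { model: String },
}

/// Prefers the provider's embedding model when one is available,
/// otherwise falls back to the local model named in the table above.
fn choose_embedder(provider_embedding_model: Option<&str>) -> EmbedderChoice {
    match provider_embedding_model {
        Some(model) => EmbedderChoice::ProviderApi { model: model.to_string() },
        None => EmbedderChoice::Local { model: "all-MiniLM-L6-v2".to_string() },
    }
}

/// An index can be reused only if it was built with the same backend and
/// model; otherwise all skill embeddings must be recomputed.
fn index_is_reusable(built_with: &EmbedderChoice, current: &EmbedderChoice) -> bool {
    built_with == current
}
```

Choosing once per process, rather than falling back per request, avoids mixing vectors from two models in the same index.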

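### 5.2 Vector Storage

| Option | Suitable for |
|------|---------|
| **In-memory HashMap** | Fewer than 100 skills |
| **SQLite + vec** | Persistence, simplicity |
| **Qdrant/Chroma** | Large scale, filtering required |

**Recommendation**: For 77 skills, an in-memory HashMap is sufficient.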

---

## 6. References

- [LLM Skills vs Tools: The Missing Layer in Agent Design](https://www.abstractalgorithms.dev/llm-skills-vs-tools-in-agent-design)
- [Tool Selection for LLM Agents: Routing Strategies](https://mbrenndoerfer.com/writing/tool-selection-llm-agents-routing-strategies)
- [Semantic Tool Selection](https://vllm-semantic-router.com/zh-Hans/blog/semantic-tool-selection)

---

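## 7. Summary

**Core principles**:
1. **Let the LLM decide autonomously** - do not hardcode trigger words
2. **Semantic understanding over keyword matching** - understand what the user intends
3. **Hybrid is the best practice** - embedding filtering + LLM decision
4. **Rich descriptions are key** - skill descriptions need examples, boundaries, and capabilities

**Next steps**:
1. Implement a semantic router prototype
2. Enrich the skill descriptions
3. Test and optimize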