refactor(types): comprehensive TypeScript type system improvements

Major type system refactoring and error fixes across the codebase:

**Type System Improvements:**
- Extended OpenFangStreamEvent with 'connected' and 'agents_updated' event types
- Added GatewayPong interface for WebSocket pong responses
- Added index signature to MemorySearchOptions for Record compatibility
- Fixed RawApproval interface with hand_name, run_id properties

**Gateway & Protocol Fixes:**
- Fixed performHandshake nonce handling in gateway-client.ts
- Fixed onAgentStream callback type definitions
- Fixed HandRun runId mapping to handle undefined values
- Fixed Approval mapping with proper default values

**Memory System Fixes:**
- Fixed MemoryEntry creation with required properties (lastAccessedAt, accessCount)
- Replaced getByAgent with getAll method in vector-memory.ts
- Fixed MemorySearchOptions type compatibility

**Component Fixes:**
- Fixed ReflectionLog property names (filePath→file, proposedContent→suggestedContent)
- Fixed SkillMarket suggestSkills async call arguments
- Fixed message-virtualization useRef generic type
- Fixed session-persistence messageCount type conversion

**Code Cleanup:**
- Removed unused imports and variables across multiple files
- Consolidated StoredError interface (removed duplicate)
- Deleted obsolete test files (feedbackStore.test.ts, memory-index.test.ts)

**New Features:**
- Added browser automation module (Tauri backend)
- Added Active Learning Panel component
- Added Agent Onboarding Wizard
- Added Memory Graph visualization
- Added Personality Selector
- Added Skill Market store and components

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
iven
2026-03-17 08:05:07 +08:00
parent adfd7024df
commit f4efc823e2
80 changed files with 9496 additions and 1390 deletions

View File

@@ -0,0 +1,367 @@
# 浏览器自动化工具对比分析
> **分类**: 智能层 (L4 自演化)
> **优先级**: P1-重要
> **成熟度**: L1-原型
> **最后更新**: 2026-03-16
## 一、概述
本文档对比三个浏览器自动化工具,评估其与 ZCLAW/OpenFang 桌面客户端集成的可行性:
1. **Chrome 146 WebMCP** - 浏览器原生 AI Agent 协议
2. **Fantoccini** - Rust WebDriver 客户端
3. **Lightpanda** - Zig 编写的轻量级无头浏览器
---
## 二、工具详解
### 2.1 Chrome 146 WebMCP
#### 核心概念
**WebMCP (Web Model Context Protocol)** 是 Google 和 Microsoft 联合开发的 W3C 提议标准,允许网站将应用功能作为"工具"暴露给 AI 代理。
```
传统 MCP: Agent → HTTP/SSE → MCP Server → Backend API
WebMCP: Agent → navigator.modelContext → 页面 JavaScript → 页面状态/UI
```
#### 技术特点
| 特性 | 说明 |
|------|------|
| **运行位置** | 浏览器客户端 |
| **实现语言** | JavaScript/TypeScript |
| **启用方式** | `chrome://flags/#web-mcp-for-testing` |
| **协议** | 浏览器原生 API |
| **认证** | 复用页面会话/Cookie |
| **人在回路** | 核心设计,敏感操作需确认 |
#### API 示例
```javascript
// 命令式 API
navigator.modelContext.registerTool({
name: "search_products",
description: "Search the product catalog by keyword",
inputSchema: {
type: "object",
properties: {
query: { type: "string", description: "Search keyword" }
},
required: ["query"]
},
execute: async ({ query }) => {
const results = await catalog.search(query);
return { content: [{ type: 'text', text: JSON.stringify(results) }] };
}
});
// 声明式 API (HTML)
<form tool-name="book_table" tool-description="Reserve a table">
<input name="party_size" tool-param-description="Number of guests" />
</form>
```
#### 优势
-**Token 效率高 89%** - 比基于截图的方法
-**零额外部署** - 无需单独服务器
-**状态共享** - 直接访问页面状态和用户会话
-**代码复用** - 几行 JS 即可添加能力
-**用户控制** - 人在回路设计
#### 限制
- ❌ 仅 Chrome 146+ Canary 可用
- ❌ 需要手动启用实验标志
- ❌ 不支持无头场景
- ❌ 规范未稳定
---
### 2.2 Fantoccini
#### 核心概念
**Fantoccini** 是 Rust 的高级别 WebDriver 客户端,通过 WebDriver 协议控制浏览器。
#### 技术特点
| 特性 | 说明 |
|------|------|
| **运行位置** | 独立进程 |
| **实现语言** | Rust |
| **协议** | WebDriver (W3C) |
| **依赖** | 需要 ChromeDriver/GeckoDriver |
| **异步模型** | async/await (Tokio) |
| **认证** | 需要单独实现 |
#### API 示例
```rust
use fantoccini::{ClientBuilder, Locator};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// 连接到 WebDriver
let client = ClientBuilder::native()
.connect("http://localhost:4444")
.await?;
// 导航
client.goto("https://example.com").await?;
// 查找元素
let elem = client.find(Locator::Css("input[name='q']")).await?;
elem.send_keys("rust lang").await?;
// 提交表单
let form = client.find(Locator::Css("form")).await?;
form.submit().await?;
// 截图
let screenshot = client.screenshot().await?;
client.close().await?;
Ok(())
}
```
#### 优势
- ✅ Rust 原生,与 Tauri 后端完美集成
- ✅ WebDriver 标准,兼容多种浏览器
- ✅ 轻量级API 简洁
- ✅ 异步设计,性能好
- ✅ 社区活跃,维护良好
#### 限制
- ❌ 需要单独运行 WebDriver 进程
- ❌ 无法访问浏览器内存状态
- ❌ 没有内置 AI Agent 支持
- ❌ 需要自己实现认证层
---
### 2.3 Lightpanda
#### 核心概念
**Lightpanda** 是用 Zig 从头编写的开源无头浏览器,专为 AI 代理和自动化设计。
#### 技术特点
| 特性 | 说明 |
|------|------|
| **运行位置** | 独立进程 |
| **实现语言** | Zig |
| **内存占用** | 9x less than Chrome |
| **执行速度** | 11x faster than Chrome |
| **启动** | 即时(无图形渲染) |
| **内置功能** | Markdown 转换 |
#### 性能数据
| 指标 | Lightpanda | Chrome |
|------|------------|--------|
| 内存 (100页) | 24MB | ~216MB |
| 执行时间 | 2.3s | ~25s |
| 启动时间 | 即时 | 秒级 |
#### 优势
-**极致轻量** - 适合并行执行
-**高性能** - 11x 速度提升
-**AI 优先** - 内置 Markdown 转换减少 token
-**确定性** - API-first 设计
-**无依赖** - 不需要 Chrome 安装
#### 限制
- ❌ 项目较新,生态不成熟
- ❌ 不支持所有 Web API
- ❌ 需要集成到 Rust (FFI)
- ❌ 文档有限
---
## 三、对比矩阵
### 3.1 功能对比
| 维度 | WebMCP | Fantoccini | Lightpanda |
|------|--------|------------|------------|
| **AI Agent 支持** | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ |
| **ZCLAW 集成** | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| **性能** | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| **成熟度** | ⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐ |
| **跨平台** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| **无头支持** | ❌ | ✅ | ✅ |
| **人在回路** | ✅ | ❌ | ❌ |
### 3.2 集成复杂度
| 工具 | 前端集成 | 后端集成 | 总体复杂度 |
|------|---------|---------|-----------|
| WebMCP | 简单 | 不需要 | ⭐ 低 |
| Fantoccini | 不需要 | 中等 | ⭐⭐ 中 |
| Lightpanda | 不需要 | 复杂 | ⭐⭐⭐ 高 |
### 3.3 适用场景
| 场景 | 推荐工具 |
|------|---------|
| 网站暴露能力给 AI | **WebMCP** |
| 后端自动化测试 | **Fantoccini** |
| 高并发爬取 | **Lightpanda** |
| Tauri 桌面应用集成 | **Fantoccini** |
| Token 效率优先 | **Lightpanda** |
| 人在回路协作 | **WebMCP** |
---
## 四、ZCLAW 集成建议
### 4.1 推荐方案:分层集成
```
┌─────────────────────────────────────────────────────────────┐
│ ZCLAW 架构 │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ React UI │ │ Tauri Rust │ │ OpenFang │ │
│ │ (前端) │ │ (后端) │ │ (Kernel) │ │
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │
│ │ │ │ │
│ │ WebMCP │ Fantoccini │ │
│ │ (用户交互) │ (后端自动化) │ │
│ ▼ ▼ │ │
│ ┌─────────────┐ ┌─────────────┐ │ │
│ │ 网站工具 │ │ Headless │ │ │
│ │ 暴露能力 │ │ Chrome │ │ │
│ └─────────────┘ └─────────────┘ │ │
│ │ │
│ ┌─────────────────────────────────────────────┐ │ │
│ │ Hands 系统集成 │ │ │
│ │ - Browser Hand: Fantoccini + Lightpanda │ ◄┘ │
│ │ - Researcher Hand: WebMCP + Fantoccini │ │
│ └─────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
```
### 4.2 各工具的定位
| 工具 | ZCLAW 定位 | 实现优先级 |
|------|-----------|-----------|
| **WebMCP** | 前端页面能力暴露,用户交互式 AI 协作 | P1 |
| **Fantoccini** | Tauri 后端自动化Hands 系统核心 | P0 |
| **Lightpanda** | 高并发场景,未来扩展 | P2 |
### 4.3 实现路线图
#### Phase 1: Fantoccini 集成 (P0)
```rust
// desktop/src-tauri/src/browser/mod.rs
pub mod automation;
pub mod hands;
use fantoccini::{ClientBuilder, Locator};
pub struct BrowserHand {
client: Option<fantoccini::Client>,
}
impl BrowserHand {
pub async fn execute(&self, task: BrowseTask) -> Result<TaskResult, Error> {
// 实现 Browser Hand 核心逻辑
}
}
```
#### Phase 2: WebMCP 前端集成 (P1)
```typescript
// desktop/src/lib/webmcp.ts
export function registerZclawTools() {
if (!('modelContext' in navigator)) {
console.warn('WebMCP not available');
return;
}
// 注册 OpenFang 能力
navigator.modelContext.registerTool({
name: "openfang_chat",
description: "Send a message to OpenFang agent",
inputSchema: {
type: "object",
properties: {
message: { type: "string" },
agent_id: { type: "string" }
},
required: ["message"]
},
execute: async ({ message, agent_id }) => {
// 调用 OpenFang Kernel
const response = await openfangClient.chat(message, agent_id);
return { content: [{ type: 'text', text: response }] };
}
});
}
```
#### Phase 3: Lightpanda 评估 (P2)
- 监控项目成熟度
- 评估 FFI 集成可行性
- 在高并发场景进行性能测试
---
## 五、行动项
### 立即执行
- [ ] 添加 Fantoccini 依赖到 Tauri Cargo.toml
- [ ] 实现 Browser Hand 基础结构
- [ ] 创建 WebDriver 配置管理
### 短期 (1-2 周)
- [ ] 完成 Fantoccini 与 Hands 系统集成
- [ ] 添加 WebMCP 检测和工具注册
- [ ] 编写集成测试
### 中期 (1 个月)
- [ ] 评估 Lightpanda 在生产环境的可行性
- [ ] 完善安全层集成
- [ ] 文档和示例完善
---
## 六、参考资源
### WebMCP
- [WebMCP 官方网站](https://webmcp.link/)
- [GitHub - webmachinelearning/webmcp](https://github.com/webmachinelearning/webmcp)
- [Chrome DevTools MCP](https://developer.chrome.com/blog/chrome-devtools-mcp)
- [MCP 安全最佳实践](https://modelcontextprotocol.io/docs/tutorials/security/security_best_practices)
### Fantoccini
- [GitHub - jonhoo/fantoccini](https://github.com/jonhoo/fantoccini)
- [Crates.io - fantoccini](https://crates.io/crates/fantoccini)
- [Thirtyfour (替代方案)](https://github.com/vrtgs/thirtyfour)
### Lightpanda
- [Lightpanda 官方网站](https://lightpanda.io/)
- [GitHub - lightpanda-io/browser](https://github.com/lightpanda-io/browser)

View File

@@ -0,0 +1,321 @@
# Browser Automation Integration Guide
## Overview
ZCLAW now includes browser automation capabilities powered by **Fantoccini** (Rust WebDriver client). This enables the Browser Hand to automate web browsers for testing, scraping, and automation tasks.
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ Frontend (React) │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ browser-client.ts │ │
│ │ - createSession() / closeSession() │ │
│ │ - navigate() / click() / type() │ │
│ │ - screenshot() / scrapePage() │ │
│ └─────────────────────┬───────────────────────────────┘ │
└────────────────────────┼────────────────────────────────────┘
│ Tauri invoke()
┌─────────────────────────────────────────────────────────────┐
│ Tauri Backend (Rust) │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ browser/commands.rs │ │
│ │ - Tauri command handlers │ │
│ └─────────────────────┬───────────────────────────────┘ │
│ │ │
│ ┌─────────────────────▼───────────────────────────────┐ │
│ │ browser/client.rs │ │
│ │ - BrowserClient (WebDriver connection) │ │
│ │ - Session management │ │
│ │ - Element operations │ │
│ └─────────────────────┬───────────────────────────────┘ │
│ │ │
│ ┌─────────────────────▼───────────────────────────────┐ │
│ │ Fantoccini (WebDriver Protocol) │ │
│ └─────────────────────┬───────────────────────────────┘ │
└────────────────────────┼────────────────────────────────────┘
│ WebDriver Protocol
┌─────────────────────────────────────────────────────────────┐
│ ChromeDriver / GeckoDriver │
│ (Requires separate installation) │
└─────────────────────────────────────────────────────────────┘
```
## Prerequisites
### 1. Install WebDriver
You need a WebDriver installed and running:
```bash
# Chrome (ChromeDriver)
# Download from: https://chromedriver.chromium.org/
chromedriver --port=4444
# Firefox (geckodriver)
# Download from: https://github.com/mozilla/geckodriver
geckodriver --port=4444
```
### 2. Verify WebDriver is Running
```bash
curl http://localhost:4444/status
```
## Usage Examples
### Basic Usage (Functional API)
```typescript
import { createSession, navigate, click, screenshot, closeSession } from './lib/browser-client';
async function example() {
// Create session
const { session_id } = await createSession({
headless: true,
browserType: 'chrome',
});
try {
// Navigate
await navigate(session_id, 'https://example.com');
// Click element
await click(session_id, 'button.submit');
// Take screenshot
const { base64 } = await screenshot(session_id);
console.log('Screenshot taken, size:', base64.length);
} finally {
// Always close session
await closeSession(session_id);
}
}
```
### Using Browser Class (Recommended)
```typescript
import Browser from './lib/browser-client';
async function scrapeData() {
const browser = new Browser();
try {
// Start browser
await browser.start({ headless: true });
// Navigate
await browser.goto('https://example.com/products');
// Wait for products to load
await browser.wait('.product-list', 5000);
// Scrape product data
const data = await browser.scrape(
['.product-name', '.product-price', '.product-description'],
'.product-list'
);
console.log('Products:', data);
} finally {
await browser.close();
}
}
```
### Form Filling
```typescript
import Browser from './lib/browser-client';
async function fillForm() {
const browser = new Browser();
try {
await browser.start();
await browser.goto('https://example.com/login');
// Fill login form
await browser.fillForm([
{ selector: 'input[name="email"]', value: 'user@example.com' },
{ selector: 'input[name="password"]', value: 'password123' },
], 'button[type="submit"]');
// Wait for redirect
await browser.wait('.dashboard', 5000);
// Take screenshot of logged-in state
const { base64 } = await browser.screenshot();
} finally {
await browser.close();
}
}
```
### Integration with Hands System
```typescript
// In your Hand implementation
import Browser from '../lib/browser-client';
export class BrowserHand implements Hand {
name = 'browser';
description = 'Automates web browser interactions';
async execute(task: BrowserTask): Promise<HandResult> {
const browser = new Browser();
try {
await browser.start({ headless: true });
switch (task.action) {
case 'scrape':
await browser.goto(task.url);
return { success: true, data: await browser.scrape(task.selectors) };
case 'screenshot':
await browser.goto(task.url);
return { success: true, data: await browser.screenshot() };
case 'interact':
await browser.goto(task.url);
for (const step of task.steps) {
if (step.type === 'click') await browser.click(step.selector);
if (step.type === 'type') await browser.type(step.selector, step.value);
}
return { success: true };
default:
return { success: false, error: 'Unknown action' };
}
} finally {
await browser.close();
}
}
}
```
## API Reference
### Session Management
| Function | Description |
|----------|-------------|
| `createSession(options)` | Create new browser session |
| `closeSession(sessionId)` | Close browser session |
| `listSessions()` | List all active sessions |
| `getSession(sessionId)` | Get session info |
### Navigation
| Function | Description |
|----------|-------------|
| `navigate(sessionId, url)` | Navigate to URL |
| `back(sessionId)` | Go back |
| `forward(sessionId)` | Go forward |
| `refresh(sessionId)` | Refresh page |
| `getCurrentUrl(sessionId)` | Get current URL |
| `getTitle(sessionId)` | Get page title |
### Element Operations
| Function | Description |
|----------|-------------|
| `findElement(sessionId, selector)` | Find single element |
| `findElements(sessionId, selector)` | Find multiple elements |
| `click(sessionId, selector)` | Click element |
| `typeText(sessionId, selector, text, clearFirst?)` | Type into element |
| `getText(sessionId, selector)` | Get element text |
| `getAttribute(sessionId, selector, attr)` | Get element attribute |
| `waitForElement(sessionId, selector, timeout?)` | Wait for element |
### Advanced
| Function | Description |
|----------|-------------|
| `executeScript(sessionId, script, args?)` | Execute JavaScript |
| `screenshot(sessionId)` | Take page screenshot |
| `elementScreenshot(sessionId, selector)` | Take element screenshot |
| `getSource(sessionId)` | Get page HTML source |
### High-Level
| Function | Description |
|----------|-------------|
| `scrapePage(sessionId, selectors, waitFor?, timeout?)` | Scrape multiple selectors |
| `fillForm(sessionId, fields, submitSelector?)` | Fill and submit form |
## Configuration
### Environment Variables
```bash
# WebDriver URL (default: http://localhost:4444)
WEBDRIVER_URL=http://localhost:4444
```
### Session Options
```typescript
interface SessionOptions {
webdriverUrl?: string; // WebDriver server URL
headless?: boolean; // Run headless (default: true)
browserType?: 'chrome' | 'firefox' | 'edge' | 'safari';
windowWidth?: number; // Window width in pixels
windowHeight?: number; // Window height in pixels
}
```
## Troubleshooting
### WebDriver Not Found
```
Error: WebDriver connection failed
```
**Solution**: Ensure ChromeDriver or geckodriver is running:
```bash
chromedriver --port=4444
# or
geckodriver --port=4444
```
### Element Not Found
```
Error: Element not found: .my-selector
```
**Solution**: Use `waitForElement` with appropriate timeout:
```typescript
await browser.wait('.my-selector', 10000);
```
### Session Timeout
```
Error: Session not found
```
**Solution**: Session may have expired. Create a new session.
## Future Enhancements
- [ ] WebDriver auto-detection and management
- [ ] Built-in ChromeDriver bundling
- [ ] Lightpanda integration for high-performance scenarios
- [ ] WebMCP integration for Chrome 146+ features
- [ ] Screenshot diff comparison
- [ ] Network request interception
- [ ] Cookie and storage management