iven f4efc823e2 refactor(types): comprehensive TypeScript type system improvements

Major type system refactoring and error fixes across the codebase:

**Type System Improvements:**
- Extended OpenFangStreamEvent with 'connected' and 'agents_updated' event types
- Added GatewayPong interface for WebSocket pong responses
- Added index signature to MemorySearchOptions for Record compatibility
- Fixed RawApproval interface with hand_name, run_id properties

**Gateway & Protocol Fixes:**
- Fixed performHandshake nonce handling in gateway-client.ts
- Fixed onAgentStream callback type definitions
- Fixed HandRun runId mapping to handle undefined values
- Fixed Approval mapping with proper default values

**Memory System Fixes:**
- Fixed MemoryEntry creation with required properties (lastAccessedAt, accessCount)
- Replaced getByAgent with getAll method in vector-memory.ts
- Fixed MemorySearchOptions type compatibility

**Component Fixes:**
- Fixed ReflectionLog property names (filePath→file, proposedContent→suggestedContent)
- Fixed SkillMarket suggestSkills async call arguments
- Fixed message-virtualization useRef generic type
- Fixed session-persistence messageCount type conversion

**Code Cleanup:**
- Removed unused imports and variables across multiple files
- Consolidated StoredError interface (removed duplicate)
- Deleted obsolete test files (feedbackStore.test.ts, memory-index.test.ts)

**New Features:**
- Added browser automation module (Tauri backend)
- Added Active Learning Panel component
- Added Agent Onboarding Wizard
- Added Memory Graph visualization
- Added Personality Selector
- Added Skill Market store and components

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-17 08:05:07 +08:00

Browser Automation Integration Guide

Overview

ZCLAW now includes browser automation powered by Fantoccini, a Rust WebDriver client. The Browser Hand uses it to drive a real browser for testing, scraping, and other scripted tasks.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                      Frontend (React)                        │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  browser-client.ts                                   │   │
│  │  - createSession() / closeSession()                 │   │
│  │  - navigate() / click() / type()                    │   │
│  │  - screenshot() / scrapePage()                      │   │
│  └─────────────────────┬───────────────────────────────┘   │
└────────────────────────┼────────────────────────────────────┘
                         │ Tauri invoke()
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                    Tauri Backend (Rust)                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  browser/commands.rs                                 │   │
│  │  - Tauri command handlers                           │   │
│  └─────────────────────┬───────────────────────────────┘   │
│                        │                                    │
│  ┌─────────────────────▼───────────────────────────────┐   │
│  │  browser/client.rs                                   │   │
│  │  - BrowserClient (WebDriver connection)             │   │
│  │  - Session management                               │   │
│  │  - Element operations                               │   │
│  └─────────────────────┬───────────────────────────────┘   │
│                        │                                    │
│  ┌─────────────────────▼───────────────────────────────┐   │
│  │  Fantoccini (WebDriver Protocol)                     │   │
│  └─────────────────────┬───────────────────────────────┘   │
└────────────────────────┼────────────────────────────────────┘
                         │ WebDriver Protocol
                         ▼
┌─────────────────────────────────────────────────────────────┐
│              ChromeDriver / GeckoDriver                      │
│              (Requires separate installation)                │
└─────────────────────────────────────────────────────────────┘
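
To make the top layer of the diagram concrete, here is a minimal sketch of how a wrapper such as browser-client.ts could forward calls to the Rust command handlers. The command names (`browser_create_session` and so on) and the argument keys are assumptions made for illustration; the real ones are defined in browser/commands.rs. The `invoke` function is injected so the sketch runs without a Tauri runtime; in the app it would be Tauri's own `invoke`.

```typescript
// Shape of Tauri's invoke(): a command name plus a plain-object payload.
type InvokeFn = <T>(cmd: string, args?: Record<string, unknown>) => Promise<T>;

interface SessionHandle {
  session_id: string;
}

// Builds a thin, typed facade over the backend commands.
// NOTE: every command name below is hypothetical.
function makeBrowserFacade(invoke: InvokeFn) {
  return {
    createSession: (options: { headless?: boolean } = {}) =>
      invoke<SessionHandle>("browser_create_session", { options }),
    navigate: (sessionId: string, url: string) =>
      invoke<void>("browser_navigate", { sessionId, url }),
    closeSession: (sessionId: string) =>
      invoke<void>("browser_close_session", { sessionId }),
  };
}

// A fake invoke that records command names, standing in for the Tauri runtime.
const recordedCalls: string[] = [];
const fakeInvoke: InvokeFn = async <T>(cmd: string): Promise<T> => {
  recordedCalls.push(cmd);
  return { session_id: "s-1" } as unknown as T;
};
```

Injecting `invoke` keeps the facade testable in plain Node, since the fake above can replace the IPC bridge in unit tests.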

Prerequisites

1. Install WebDriver

Install a WebDriver server for your browser and start it on port 4444, the default WebDriver URL used by ZCLAW:

# Chrome (ChromeDriver)
# Download from: https://chromedriver.chromium.org/
chromedriver --port=4444

# Firefox (geckodriver)
# Download from: https://github.com/mozilla/geckodriver
geckodriver --port=4444

2. Verify WebDriver is Running

curl http://localhost:4444/status
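
The same check can be done from code before creating a session. The sketch below assumes the driver answers with the W3C WebDriver status payload (`{ value: { ready, message } }`). The function names are hypothetical and not part of browser-client.ts, and the HTTP call is injected so the parsing logic can be exercised on its own.

```typescript
// W3C WebDriver status payload: { value: { ready, message } }.
interface WebDriverStatus {
  ready: boolean;
  message: string;
}

// Extracts the status from a parsed /status body, or null if it is malformed.
function parseWebDriverStatus(body: unknown): WebDriverStatus | null {
  if (typeof body !== "object" || body === null) return null;
  const value = (body as { value?: unknown }).value;
  if (typeof value !== "object" || value === null) return null;
  const { ready, message } = value as { ready?: unknown; message?: unknown };
  if (typeof ready !== "boolean") return null;
  return { ready, message: typeof message === "string" ? message : "" };
}

// fetchJson is injected so the check works with any HTTP client.
async function isWebDriverReady(
  baseUrl: string,
  fetchJson: (url: string) => Promise<unknown>,
): Promise<boolean> {
  try {
    const status = parseWebDriverStatus(await fetchJson(`${baseUrl}/status`));
    return status !== null && status.ready;
  } catch {
    return false; // connection refused, invalid JSON, and similar failures
  }
}
```

With a runtime that provides `fetch`, the second argument could be `(url) => fetch(url).then((r) => r.json())`.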

Usage Examples

Basic Usage (Functional API)

import { createSession, navigate, click, screenshot, closeSession } from './lib/browser-client';

async function example() {
  // Create session
  const { session_id } = await createSession({
    headless: true,
    browserType: 'chrome',
  });

  try {
    // Navigate
    await navigate(session_id, 'https://example.com');

    // Click element
    await click(session_id, 'button.submit');

    // Take screenshot
    const { base64 } = await screenshot(session_id);
    console.log('Screenshot taken, size:', base64.length);

  } finally {
    // Always close session
    await closeSession(session_id);
  }
}
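
Note that `base64.length` in the example above is the length of the encoded string, not the image size in bytes. A minimal sketch for computing the decoded size without decoding, assuming the backend returns standard padded base64 (the helper name is hypothetical):

```typescript
// Number of bytes that a padded, standard base64 string decodes to.
function base64DecodedSize(encoded: string): number {
  if (encoded.length % 4 !== 0) {
    throw new Error("expected padded base64 (length divisible by 4)");
  }
  // Each 4-character group carries 3 bytes; trailing "=" marks missing bytes.
  const padding = encoded.endsWith("==") ? 2 : encoded.endsWith("=") ? 1 : 0;
  return (encoded.length / 4) * 3 - padding;
}
```

For example, `base64DecodedSize(base64)` could replace `base64.length` in the log line if the byte count is what matters.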

Using the Browser Class (Recommended)

import Browser from './lib/browser-client';

async function scrapeData() {
  const browser = new Browser();

  try {
    // Start browser
    await browser.start({ headless: true });

    // Navigate
    await browser.goto('https://example.com/products');

    // Wait for products to load
    await browser.wait('.product-list', 5000);

    // Scrape product data
    const data = await browser.scrape(
      ['.product-name', '.product-price', '.product-description'],
      '.product-list'
    );

    console.log('Products:', data);

  } finally {
    await browser.close();
  }
}

Form Filling

import Browser from './lib/browser-client';

async function fillForm() {
  const browser = new Browser();

  try {
    await browser.start();
    await browser.goto('https://example.com/login');

    // Fill login form
    await browser.fillForm([
      { selector: 'input[name="email"]', value: 'user@example.com' },
      { selector: 'input[name="password"]', value: 'password123' },
    ], 'button[type="submit"]');

    // Wait for redirect
    await browser.wait('.dashboard', 5000);

    // Take screenshot of logged-in state
    const { base64 } = await browser.screenshot();

  } finally {
    await browser.close();
  }
}
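
All three examples repeat the same try/finally so that the session is released even when a step throws. A small helper can capture that pattern once. This is a sketch, not part of browser-client.ts: the `SessionLifecycle` interface and `withBrowserSession` are hypothetical names, and the lifecycle is injected so the helper does not depend on the real module.

```typescript
// Minimal session surface the helper needs. It mirrors the
// createSession()/closeSession() pair in spirit, not in exact signature.
interface SessionLifecycle {
  create(): Promise<{ session_id: string }>;
  dispose(sessionId: string): Promise<void>;
}

// Runs `work` with a fresh session and closes it on success or failure.
async function withBrowserSession<T>(
  lifecycle: SessionLifecycle,
  work: (sessionId: string) => Promise<T>,
): Promise<T> {
  const { session_id } = await lifecycle.create();
  try {
    return await work(session_id);
  } finally {
    // Runs whether `work` resolved or threw, so the session never leaks.
    await lifecycle.dispose(session_id);
  }
}
```

With the functional API, `create` could wrap `createSession({ headless: true })` and `dispose` could wrap `closeSession`.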

Integration with Hands System

// In your Hand implementation
import Browser from '../lib/browser-client';

export class BrowserHand implements Hand {
  name = 'browser';
  description = 'Automates web browser interactions';

  async execute(task: BrowserTask): Promise<HandResult> {
    const browser = new Browser();

    try {
      await browser.start({ headless: true });

      switch (task.action) {
        case 'scrape':
          await browser.goto(task.url);
          return { success: true, data: await browser.scrape(task.selectors) };

        case 'screenshot':
          await browser.goto(task.url);
          return { success: true, data: await browser.screenshot() };

        case 'interact':
          await browser.goto(task.url);
          for (const step of task.steps) {
            if (step.type === 'click') await browser.click(step.selector);
            if (step.type === 'type') await browser.type(step.selector, step.value);
          }
          return { success: true };

        default:
          return { success: false, error: 'Unknown action' };
      }
    } finally {
      await browser.close();
    }
  }
}
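
The `BrowserTask` type is not shown in this guide. One way to model it, inferred only from how `execute` reads `task` above, is a discriminated union keyed on `action`. The shapes below are assumptions, and `describeTask` is a hypothetical helper used to show the exhaustiveness check.

```typescript
// Hypothetical task shapes inferred from how BrowserHand reads `task`.
type BrowserStep =
  | { type: "click"; selector: string }
  | { type: "type"; selector: string; value: string };

type BrowserTask =
  | { action: "scrape"; url: string; selectors: string[] }
  | { action: "screenshot"; url: string }
  | { action: "interact"; url: string; steps: BrowserStep[] };

// Summarises a task. The `never` assignment turns a forgotten case
// into a compile-time error when a new action is added to the union.
function describeTask(task: BrowserTask): string {
  switch (task.action) {
    case "scrape":
      return `scrape ${task.selectors.length} selector(s) from ${task.url}`;
    case "screenshot":
      return `screenshot ${task.url}`;
    case "interact":
      return `run ${task.steps.length} step(s) on ${task.url}`;
    default: {
      const unreachable: never = task;
      throw new Error(`Unknown action: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

With this modelling, `step.value` is only available after narrowing on `step.type === 'type'`, which matches the way the loop in `execute` uses it.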

API Reference

Session Management

Function	Description
`createSession(options)`	Create new browser session
`closeSession(sessionId)`	Close browser session
`listSessions()`	List all active sessions
`getSession(sessionId)`	Get session info

Navigation

Function	Description
`navigate(sessionId, url)`	Navigate to URL
`back(sessionId)`	Go back
`forward(sessionId)`	Go forward
`refresh(sessionId)`	Refresh page
`getCurrentUrl(sessionId)`	Get current URL
`getTitle(sessionId)`	Get page title

Element Operations

Function	Description
`findElement(sessionId, selector)`	Find single element
`findElements(sessionId, selector)`	Find multiple elements
`click(sessionId, selector)`	Click element
`typeText(sessionId, selector, text, clearFirst?)`	Type into element
`getText(sessionId, selector)`	Get element text
`getAttribute(sessionId, selector, attr)`	Get element attribute
`waitForElement(sessionId, selector, timeout?)`	Wait for element

Advanced

Function	Description
`executeScript(sessionId, script, args?)`	Execute JavaScript
`screenshot(sessionId)`	Take page screenshot
`elementScreenshot(sessionId, selector)`	Take element screenshot
`getSource(sessionId)`	Get page HTML source

High-Level

Function	Description
`scrapePage(sessionId, selectors, waitFor?, timeout?)`	Scrape multiple selectors
`fillForm(sessionId, fields, submitSelector?)`	Fill and submit form

Configuration

Environment Variables

# WebDriver URL (default: http://localhost:4444)
WEBDRIVER_URL=http://localhost:4444

Session Options

interface SessionOptions {
  webdriverUrl?: string;      // WebDriver server URL
  headless?: boolean;         // Run headless (default: true)
  browserType?: 'chrome' | 'firefox' | 'edge' | 'safari';
  windowWidth?: number;       // Window width in pixels
  windowHeight?: number;      // Window height in pixels
}
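
The sketch below shows one way the documented defaults could be applied before a session is created. The defaults for `webdriverUrl` and `headless` come from this guide. The precedence (explicit option, then `WEBDRIVER_URL`, then the default) and the `resolveSessionOptions` helper are assumptions, since the guide does not say where the environment variable is read.

```typescript
interface SessionOptions {
  webdriverUrl?: string;
  headless?: boolean;
  browserType?: "chrome" | "firefox" | "edge" | "safari";
  windowWidth?: number;
  windowHeight?: number;
}

// Same options, with the two documented defaults guaranteed to be present.
type ResolvedSessionOptions = SessionOptions & {
  webdriverUrl: string;
  headless: boolean;
};

// Assumed precedence: explicit option > WEBDRIVER_URL > documented default.
function resolveSessionOptions(
  options: SessionOptions = {},
  env: Record<string, string | undefined> = {},
): ResolvedSessionOptions {
  return {
    ...options,
    webdriverUrl: options.webdriverUrl ?? env.WEBDRIVER_URL ?? "http://localhost:4444",
    headless: options.headless ?? true,
  };
}
```

Passing the environment as a parameter keeps the function pure; a caller could hand in whatever environment object its runtime exposes.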

Troubleshooting

WebDriver Not Found

Error: WebDriver connection failed

Solution: Ensure ChromeDriver or geckodriver is running on the port that WEBDRIVER_URL points to:

chromedriver --port=4444
# or
geckodriver --port=4444

Element Not Found

Error: Element not found: .my-selector

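Solution: Wait for the element before interacting with it, using waitForElement (or browser.wait on the Browser class) with a timeout long enough for the page to render: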

await browser.wait('.my-selector', 10000);
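
When the condition to wait for is not a single selector (for example, a list reaching a certain length), a generic client-side polling loop is one option. This is a sketch of the general technique, not a description of how waitForElement is implemented in the backend. `pollUntil` and its parameters are hypothetical.

```typescript
// Polls `probe` until it returns true or `timeoutMs` elapses.
// Resolves to true on success and false on timeout.
async function pollUntil(
  probe: () => Promise<boolean>,
  timeoutMs: number,
  intervalMs = 100,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    // The probe always runs at least once, even with a zero timeout.
    if (await probe()) return true;
    // Give up if the next sleep would end after the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A probe could, for instance, call `findElements` and compare the number of matches against an expected count.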

Session Timeout

Error: Session not found

Solution: The session has probably expired. Create a new session and retry the operation.

Future Enhancements

- WebDriver auto-detection and management
- Built-in ChromeDriver bundling
- Lightpanda integration for high-performance scenarios
- WebMCP integration for Chrome 146+ features
- Screenshot diff comparison
- Network request interception
- Cookie and storage management
