# OpenFang Security Architecture This document provides a comprehensive technical reference for every security system in the OpenFang Agent Operating System. All struct names, function signatures, constant values, and algorithm descriptions are drawn directly from the source code. --- ## Table of Contents 1. [Security Overview](#1-security-overview) 2. [Capability-Based Security](#2-capability-based-security) 3. [WASM Dual Metering](#3-wasm-dual-metering) 4. [Merkle Hash Chain Audit Trail](#4-merkle-hash-chain-audit-trail) 5. [Information Flow Taint Tracking](#5-information-flow-taint-tracking) 6. [Ed25519 Manifest Signing](#6-ed25519-manifest-signing) 7. [SSRF Protection](#7-ssrf-protection) 8. [Secret Zeroization](#8-secret-zeroization) 9. [OFP Mutual Authentication](#9-ofp-mutual-authentication) 10. [Security Headers](#10-security-headers) 11. [GCRA Rate Limiter](#11-gcra-rate-limiter) 12. [Path Traversal Prevention](#12-path-traversal-prevention) 13. [Subprocess Sandbox](#13-subprocess-sandbox) 14. [Prompt Injection Scanner](#14-prompt-injection-scanner) 15. [Loop Guard](#15-loop-guard) 16. [Session Repair](#16-session-repair) 17. [Health Endpoint Redaction](#17-health-endpoint-redaction) 18. [Security Configuration](#18-security-configuration) 19. [Security Dependencies](#19-security-dependencies) --- ## 1. Security Overview OpenFang implements **defense-in-depth** security. No single mechanism is trusted to be the sole protector; instead, 16 independent systems form overlapping layers so that a failure in any one layer is caught by others. | # | System | Crate | Protects Against | |---|--------|-------|------------------| | 1 | Capability-Based Security | `openfang-types` | Unauthorized actions by agents | | 2 | WASM Dual Metering | `openfang-runtime` | Infinite loops, CPU DoS | | 3 | Merkle Audit Trail | `openfang-runtime` | Tampered audit logs | | 4 | Taint Tracking | `openfang-types` | Prompt injection, data exfiltration | | 5 | Ed25519 Manifest Signing | `openfang-types` | Supply chain attacks | | 6 | SSRF Protection | `openfang-runtime` | Server-Side Request Forgery | | 7 | Secret Zeroization | `openfang-runtime`, `openfang-channels` | Memory forensics, key leakage | | 8 | OFP Mutual Auth | `openfang-wire` | Unauthorized peer connections | | 9 | Security Headers | `openfang-api` | XSS, clickjacking, MIME sniffing | | 10 | GCRA Rate Limiter | `openfang-api` | API abuse, denial of service | | 11 | Path Traversal Prevention | `openfang-runtime` | Directory traversal attacks | | 12 | Subprocess Sandbox | `openfang-runtime` | Secret leakage via child processes | | 13 | Prompt Injection Scanner | `openfang-skills` | Malicious skill prompts | | 14 | Loop Guard | `openfang-runtime` | Stuck agent tool loops | | 15 | Session Repair | `openfang-runtime` | Corrupted LLM conversation history | | 16 | Health Endpoint Redaction | `openfang-api` | Information leakage | --- ## 2. Capability-Based Security **Source:** `openfang-types/src/capability.rs` OpenFang uses capability-based security. An agent can only perform actions it has been explicitly granted permission to do. Capabilities are immutable after agent creation and are enforced at the kernel level. ### 2.1 Capability Variants The `Capability` enum defines every permission type: ```rust pub enum Capability { // Filesystem FileRead(String), // Glob pattern, e.g. "/data/*" FileWrite(String), // Network NetConnect(String), // Host:port pattern, e.g. "*.openai.com:443" NetListen(u16), // Tools ToolInvoke(String), // Specific tool ID ToolAll, // All tools (dangerous) // LLM LlmQuery(String), LlmMaxTokens(u64), // Agent interaction AgentSpawn, AgentMessage(String), AgentKill(String), // Memory MemoryRead(String), MemoryWrite(String), // Shell ShellExec(String), EnvRead(String), // OFP Wire Protocol OfpDiscover, OfpConnect(String), OfpAdvertise, // Economic EconSpend(f64), EconEarn, EconTransfer(String), } ``` ### 2.2 Pattern Matching The `capability_matches(granted, required)` function implements glob-style matching: - **Exact match:** `"api.openai.com:443"` matches `"api.openai.com:443"` - **Full wildcard:** `"*"` matches anything - **Prefix wildcard:** `"*.openai.com:443"` matches `"api.openai.com:443"` - **Suffix wildcard:** `"api.*"` matches `"api.openai.com"` - **Middle wildcard:** `"api.*.com"` matches `"api.openai.com"` - **ToolAll special case:** `ToolAll` grants any `ToolInvoke(_)` - **Numeric bounds:** `LlmMaxTokens(10000)` grants `LlmMaxTokens(5000)` (granted >= required) ### 2.3 Enforcement Point In the WASM sandbox, every host call is checked **before** execution by `check_capability()` in `host_functions.rs`: ```rust fn check_capability( capabilities: &[Capability], required: &Capability, ) -> Result<(), serde_json::Value> { for granted in capabilities { if capability_matches(granted, required) { return Ok(()); } } Err(json!({"error": format!("Capability denied: {required:?}")})) } ``` If no granted capability matches the required one, the operation returns a JSON error immediately -- the tool is never invoked. ### 2.4 Capability Inheritance When an agent spawns a child agent, `validate_capability_inheritance()` ensures the child's capabilities are a **subset** of the parent's. This prevents privilege escalation: ```rust pub fn validate_capability_inheritance( parent_caps: &[Capability], child_caps: &[Capability], ) -> Result<(), String> { for child_cap in child_caps { let is_covered = parent_caps .iter() .any(|parent_cap| capability_matches(parent_cap, child_cap)); if !is_covered { return Err(format!( "Privilege escalation denied: child requests {:?} \ but parent does not have a matching grant", child_cap )); } } Ok(()) } ``` The `host_agent_spawn()` function in `host_functions.rs` calls `kernel.spawn_agent_checked(manifest_toml, Some(&state.agent_id), &state.capabilities)` which invokes this validation before the child is created. --- ## 3. WASM Dual Metering **Source:** `openfang-runtime/src/sandbox.rs` Untrusted WASM modules run inside a Wasmtime sandbox with **two independent** metering mechanisms running simultaneously. ### 3.1 Fuel Metering (Deterministic) Fuel metering counts WASM instructions. The engine deducts fuel for every instruction executed. When the budget is exhausted, execution traps with `Trap::OutOfFuel`. ```rust // SandboxConfig defaults pub fuel_limit: u64, // Default: 1_000_000 // Applied at execution time if config.fuel_limit > 0 { store.set_fuel(config.fuel_limit)?; } ``` After execution, fuel consumed is reported: ```rust let fuel_remaining = store.get_fuel().unwrap_or(0); let fuel_consumed = config.fuel_limit.saturating_sub(fuel_remaining); ``` ### 3.2 Epoch Interruption (Wall-Clock) A watchdog thread sleeps for the configured timeout, then increments the engine epoch. When the epoch advances past the store's deadline, execution traps with `Trap::Interrupt`. ```rust store.set_epoch_deadline(1); let engine_clone = engine.clone(); let timeout = config.timeout_secs.unwrap_or(30); let _watchdog = std::thread::spawn(move || { std::thread::sleep(std::time::Duration::from_secs(timeout)); engine_clone.increment_epoch(); }); ``` ### 3.3 Why Both? | Property | Fuel | Epoch | |----------|------|-------| | **Metric** | Instruction count | Wall-clock time | | **Precision** | Deterministic, reproducible | Non-deterministic | | **Catches** | CPU-intensive loops | Host call blocking, I/O waits | | **Evasion** | Can waste time in host calls | Can busy-loop cheaply | Together they form a complete defense: fuel catches compute-intensive loops, while epochs catch host-call abuse or environmental slowdowns. ### 3.4 SandboxConfig ```rust pub struct SandboxConfig { pub fuel_limit: u64, // Default: 1_000_000 pub max_memory_bytes: usize, // Default: 16 MB pub capabilities: Vec, pub timeout_secs: Option, // Default: 30 seconds } ``` ### 3.5 Error Types ```rust pub enum SandboxError { Compilation(String), Instantiation(String), Execution(String), FuelExhausted, // Trap::OutOfFuel AbiError(String), } ``` --- ## 4. Merkle Hash Chain Audit Trail **Source:** `openfang-runtime/src/audit.rs` Every security-critical action is appended to a tamper-evident Merkle hash chain, similar to a blockchain. Each entry contains the SHA-256 hash of its own contents concatenated with the hash of the previous entry. ### 4.1 Auditable Actions ```rust pub enum AuditAction { ToolInvoke, CapabilityCheck, AgentSpawn, AgentKill, AgentMessage, MemoryAccess, FileAccess, NetworkAccess, ShellExec, AuthAttempt, WireConnect, ConfigChange, } ``` ### 4.2 Entry Structure ```rust pub struct AuditEntry { pub seq: u64, // Monotonically increasing sequence number pub timestamp: String, // ISO-8601 pub agent_id: String, pub action: AuditAction, pub detail: String, // e.g. tool name, file path pub outcome: String, // "ok", "denied", error message pub prev_hash: String, // SHA-256 of previous entry (or 64 zeros) pub hash: String, // SHA-256 of this entry + prev_hash } ``` ### 4.3 Hash Computation Each entry's hash is computed from all of its fields concatenated with the previous entry's hash: ```rust fn compute_entry_hash( seq: u64, timestamp: &str, agent_id: &str, action: &AuditAction, detail: &str, outcome: &str, prev_hash: &str, ) -> String { let mut hasher = Sha256::new(); hasher.update(seq.to_string().as_bytes()); hasher.update(timestamp.as_bytes()); hasher.update(agent_id.as_bytes()); hasher.update(action.to_string().as_bytes()); hasher.update(detail.as_bytes()); hasher.update(outcome.as_bytes()); hasher.update(prev_hash.as_bytes()); hex::encode(hasher.finalize()) } ``` ### 4.4 Chain Integrity Verification `AuditLog::verify_integrity()` walks the entire chain and recomputes every hash. If any entry has been tampered with, the recomputed hash will not match the stored hash, or the `prev_hash` linkage will be broken: ```rust pub fn verify_integrity(&self) -> Result<(), String> { let entries = self.entries.lock().unwrap_or_else(|e| e.into_inner()); let mut expected_prev = "0".repeat(64); // Genesis sentinel for entry in entries.iter() { if entry.prev_hash != expected_prev { return Err(format!( "chain break at seq {}: expected prev_hash {} but found {}", entry.seq, expected_prev, entry.prev_hash )); } let recomputed = compute_entry_hash(/* ... */); if recomputed != entry.hash { return Err(format!( "hash mismatch at seq {}: expected {} but found {}", entry.seq, recomputed, entry.hash )); } expected_prev = entry.hash.clone(); } Ok(()) } ``` ### 4.5 Thread Safety `AuditLog` uses `Mutex>` and `Mutex` for the tip hash. Both locks use `unwrap_or_else(|e| e.into_inner())` to recover from poisoned mutexes, ensuring the audit log remains available even after a panic. ### 4.6 API | Method | Description | |--------|-------------| | `AuditLog::new()` | Creates an empty log with genesis sentinel (`"0" * 64`) | | `record(agent_id, action, detail, outcome)` | Appends an entry, returns its hash | | `verify_integrity()` | Validates the entire chain | | `tip_hash()` | Returns the hash of the most recent entry | | `len()` / `is_empty()` | Entry count | | `recent(n)` | Returns the most recent `n` entries (cloned) | --- ## 5. Information Flow Taint Tracking **Source:** `openfang-types/src/taint.rs` OpenFang implements a lattice-based taint propagation model that prevents tainted values from flowing into sensitive sinks without explicit declassification. This guards against prompt injection, data exfiltration, and confused-deputy attacks. ### 5.1 Taint Labels ```rust pub enum TaintLabel { ExternalNetwork, // Data from external network requests UserInput, // Direct user input Pii, // Personally identifiable information Secret, // API keys, tokens, passwords UntrustedAgent, // Data from sandboxed/untrusted agents } ``` ### 5.2 Tainted Values ```rust pub struct TaintedValue { pub value: String, // The payload pub labels: HashSet, // Attached taint labels pub source: String, // Human-readable origin } ``` Key methods: | Method | Description | |--------|-------------| | `TaintedValue::new(value, labels, source)` | Create with labels | | `TaintedValue::clean(value, source)` | Create with no labels (untainted) | | `merge_taint(&mut self, other)` | Union of labels (for concatenation) | | `check_sink(&self, sink)` | Check if value can flow to sink | | `declassify(&mut self, label)` | Remove a specific label (explicit security decision) | | `is_tainted(&self) -> bool` | True if any labels present | ### 5.3 Taint Sinks A `TaintSink` defines which labels are **blocked** from reaching it: | Sink | Blocked Labels | Rationale | |------|---------------|-----------| | `TaintSink::shell_exec()` | `ExternalNetwork`, `UntrustedAgent`, `UserInput` | Prevents command injection | | `TaintSink::net_fetch()` | `Secret`, `Pii` | Prevents data exfiltration | | `TaintSink::agent_message()` | `Secret` | Prevents secret leakage to other agents | ### 5.4 Violation Handling When `check_sink()` finds a blocked label, it returns a `TaintViolation`: ```rust pub struct TaintViolation { pub label: TaintLabel, // The offending label pub sink_name: String, // "shell_exec", "net_fetch", etc. pub source: String, // Where the tainted value came from } ``` Display: `taint violation: label 'Secret' from source 'env_var' is not allowed to reach sink 'net_fetch'` ### 5.5 Declassification Declassification is an **explicit security decision**. The caller asserts that the value has been sanitized: ```rust tainted.declassify(&TaintLabel::ExternalNetwork); tainted.declassify(&TaintLabel::UserInput); // After declassification, value can flow to shell_exec assert!(tainted.check_sink(&TaintSink::shell_exec()).is_ok()); ``` ### 5.6 Taint Propagation When two values are combined (concatenation, interpolation), the result must carry the union of both label sets: ```rust let mut combined = TaintedValue::new(/* ... */); combined.merge_taint(&other_value); // combined.labels is now the union of both ``` --- ## 6. Ed25519 Manifest Signing **Source:** `openfang-types/src/manifest_signing.rs` Agent manifests define an agent's capabilities, tools, and configuration. A compromised manifest can grant elevated privileges. This module provides Ed25519-based cryptographic signing. ### 6.1 Signing Scheme 1. Compute SHA-256 of the manifest content (raw TOML text). 2. Sign the hash with Ed25519 (via `ed25519-dalek`). 3. Bundle the signature, public key, and content hash into a `SignedManifest` envelope. ### 6.2 SignedManifest Structure ```rust pub struct SignedManifest { pub manifest: String, // Raw TOML content pub content_hash: String, // Hex SHA-256 of manifest pub signature: Vec, // Ed25519 signature (64 bytes) pub signer_public_key: Vec, // Ed25519 public key (32 bytes) pub signer_id: String, // Human-readable signer ID } ``` ### 6.3 Signing ```rust let signing_key = SigningKey::generate(&mut OsRng); let signed = SignedManifest::sign(manifest_toml, &signing_key, "admin@org.com"); ``` Internally: ```rust pub fn sign(manifest: impl Into, signing_key: &SigningKey, signer_id: impl Into) -> Self { let manifest = manifest.into(); let content_hash = hash_manifest(&manifest); // SHA-256 let signature = signing_key.sign(content_hash.as_bytes()); let verifying_key = signing_key.verifying_key(); Self { manifest, content_hash, signature: signature.to_bytes().to_vec(), signer_public_key: verifying_key.to_bytes().to_vec(), signer_id: signer_id.into(), } } ``` ### 6.4 Verification Two-phase verification: 1. **Hash check:** Recompute SHA-256 of `manifest` and compare to `content_hash`. 2. **Signature check:** Verify the Ed25519 signature over `content_hash` using `signer_public_key`. ```rust pub fn verify(&self) -> Result<(), String> { let recomputed = hash_manifest(&self.manifest); if recomputed != self.content_hash { return Err("content hash mismatch: ..."); } let verifying_key = VerifyingKey::from_bytes(&pk_bytes)?; let signature = Signature::from_bytes(&sig_bytes); verifying_key.verify(self.content_hash.as_bytes(), &signature) .map_err(|e| format!("signature verification failed: {}", e)) } ``` ### 6.5 Tamper Detection - Modifying the manifest content after signing causes a **content hash mismatch**. - Replacing the public key with a different key causes a **signature verification failure**. - Both attacks are caught by `verify()`. --- ## 7. SSRF Protection **Source:** `openfang-runtime/src/host_functions.rs` The `host_net_fetch` function (WASM host call for network requests) includes comprehensive Server-Side Request Forgery protection. ### 7.1 Scheme Validation Only `http://` and `https://` schemes are allowed. All others (`file://`, `gopher://`, `ftp://`) are blocked immediately: ```rust if !url.starts_with("http://") && !url.starts_with("https://") { return Err(json!({"error": "Only http:// and https:// URLs are allowed"})); } ``` ### 7.2 Hostname Blocklist Before DNS resolution, these hostnames are blocked: - `localhost` - `metadata.google.internal` - `metadata.aws.internal` - `instance-data` - `169.254.169.254` (AWS/GCP metadata endpoint) ### 7.3 DNS Resolution Check After the hostname blocklist, the function resolves the hostname to IP addresses and checks **every resolved IP** against private ranges. This defeats DNS rebinding attacks: ```rust let socket_addr = format!("{hostname}:{port}"); if let Ok(addrs) = socket_addr.to_socket_addrs() { for addr in addrs { let ip = addr.ip(); if ip.is_loopback() || ip.is_unspecified() || is_private_ip(&ip) { return Err(json!({"error": format!( "SSRF blocked: {hostname} resolves to private IP {ip}" )})); } } } ``` ### 7.4 Private IP Detection The `is_private_ip()` function covers: **IPv4:** - `10.0.0.0/8` -- RFC 1918 - `172.16.0.0/12` -- RFC 1918 - `192.168.0.0/16` -- RFC 1918 - `169.254.0.0/16` -- Link-local (AWS metadata) **IPv6:** - `fc00::/7` -- Unique Local Address - `fe80::/10` -- Link-local ```rust fn is_private_ip(ip: &std::net::IpAddr) -> bool { match ip { IpAddr::V4(v4) => { let octets = v4.octets(); matches!( octets, [10, ..] | [172, 16..=31, ..] | [192, 168, ..] | [169, 254, ..] ) } IpAddr::V6(v6) => { let segments = v6.segments(); (segments[0] & 0xfe00) == 0xfc00 || (segments[0] & 0xffc0) == 0xfe80 } } } ``` ### 7.5 Host Extraction `extract_host_from_url()` parses the URL to extract `host:port` for both SSRF checking and capability matching: ``` https://api.openai.com/v1/chat -> api.openai.com:443 http://localhost:8080/api -> localhost:8080 http://example.com -> example.com:80 ``` --- ## 8. Secret Zeroization **Source:** All LLM driver modules, channel adapters, and web search modules. OpenFang uses `Zeroizing` from the `zeroize` crate on every field that holds secret material. When the value is dropped, its memory is overwritten with zeros, preventing secrets from lingering in memory. ### 8.1 How It Works `Zeroizing` is a smart-pointer wrapper from the `zeroize` crate. It implements `Deref` for transparent usage and `Drop` for automatic zeroization: ```rust // On Drop, the inner String's buffer is overwritten with zeros let key = Zeroizing::new("sk-secret-key".to_string()); // Use key transparently via Deref client.post(url).header("authorization", format!("Bearer {}", &*key)); // When key goes out of scope, memory is zeroed ``` ### 8.2 Fields Using Zeroization **LLM Drivers** (`openfang-runtime/src/drivers/`): | Driver | Field | |--------|-------| | `AnthropicDriver` | `api_key: Zeroizing` | | `GeminiDriver` | `api_key: Zeroizing` | | `OpenAiCompatDriver` | `api_key: Zeroizing` | **Channel Adapters** (`openfang-channels/src/`): | Adapter | Field(s) | |---------|----------| | `DiscordAdapter` | `token: Zeroizing` | | `EmailAdapter` | `password: Zeroizing` | | `BlueskyAdapter` | `app_password: Zeroizing` | | `DingTalkAdapter` | `access_token: Zeroizing`, `secret: Zeroizing` | | `FeishuAdapter` | `app_secret: Zeroizing` | | `FlockAdapter` | `bot_token: Zeroizing` | | `GitterAdapter` | `token: Zeroizing` | | `GotifyAdapter` | `app_token: Zeroizing`, `client_token: Zeroizing` | **Web Search** (`openfang-runtime/src/web_search.rs`): ```rust fn resolve_api_key(env_var: &str) -> Option> { std::env::var(env_var).ok().filter(|k| !k.is_empty()).map(Zeroizing::new) } ``` **Embedding** (`openfang-runtime/src/embedding.rs`): | Struct | Field | |--------|-------| | `EmbeddingClient` | `api_key: Zeroizing` | ### 8.3 Why It Matters Without zeroization, secrets remain in memory after use until the OS reclaims the page. An attacker with access to a core dump, swap file, or memory forensics tool can recover API keys. `Zeroizing` ensures the secret is overwritten as soon as it is no longer needed. --- ## 9. OFP Mutual Authentication **Source:** `openfang-wire/src/peer.rs` The OpenFang Wire Protocol (OFP) uses HMAC-SHA256 with nonce-based mutual authentication over TCP connections. ### 9.1 Pre-Shared Key Requirement OFP refuses to start without a `shared_secret`: ```rust if config.shared_secret.is_empty() { return Err(WireError::HandshakeFailed( "OFP requires shared_secret. Set [network] shared_secret in config.toml".into(), )); } ``` ### 9.2 HMAC Functions ```rust type HmacSha256 = Hmac; fn hmac_sign(secret: &str, data: &[u8]) -> String { let mut mac = HmacSha256::new_from_slice(secret.as_bytes()) .expect("HMAC accepts any key size"); mac.update(data); hex::encode(mac.finalize().into_bytes()) } fn hmac_verify(secret: &str, data: &[u8], signature: &str) -> bool { let expected = hmac_sign(secret, data); subtle::ConstantTimeEq::ct_eq(expected.as_bytes(), signature.as_bytes()).into() } ``` **Constant-time comparison** (`subtle::ConstantTimeEq`) prevents timing side-channel attacks. ### 9.3 Handshake Protocol **Initiator (client):** 1. Generate a random UUID nonce. 2. Compute `auth_data = nonce + node_id`. 3. Compute `auth_hmac = hmac_sign(shared_secret, auth_data)`. 4. Send `Handshake { node_id, node_name, protocol_version, agents, nonce, auth_hmac }`. **Responder (server):** 1. Receive the `Handshake` message. 2. Verify the incoming HMAC: `hmac_verify(shared_secret, nonce + node_id, auth_hmac)`. 3. If verification fails, return error code 403. 4. Generate a new UUID nonce for the ack. 5. Compute `ack_auth_data = ack_nonce + self.node_id`. 6. Compute `ack_hmac = hmac_sign(shared_secret, ack_auth_data)`. 7. Send `HandshakeAck { node_id, node_name, protocol_version, agents, nonce: ack_nonce, auth_hmac: ack_hmac }`. **Initiator (verification):** 1. Receive `HandshakeAck`. 2. Verify: `hmac_verify(shared_secret, ack_nonce + node_id, ack_hmac)`. 3. If verification fails, return `WireError::HandshakeFailed`. ### 9.4 Security Properties | Property | How It Is Achieved | |----------|-------------------| | **Mutual authentication** | Both sides prove knowledge of the shared secret | | **Replay protection** | Random UUID nonces per handshake | | **Timing-attack resistance** | `subtle::ConstantTimeEq` for HMAC comparison | | **Mandatory secret** | OFP refuses to start with an empty `shared_secret` | | **Message size limit** | `MAX_MESSAGE_SIZE = 16 MB` prevents memory DoS | | **Protocol version check** | `PROTOCOL_VERSION` mismatch returns `WireError::VersionMismatch` | --- ## 10. Security Headers **Source:** `openfang-api/src/middleware.rs` The `security_headers` middleware is applied to **all** API responses: ```rust pub async fn security_headers(request: Request, next: Next) -> Response { let mut response = next.run(request).await; let headers = response.headers_mut(); headers.insert("x-content-type-options", "nosniff".parse().unwrap()); headers.insert("x-frame-options", "DENY".parse().unwrap()); headers.insert("x-xss-protection", "1; mode=block".parse().unwrap()); headers.insert("content-security-policy", /* CSP policy */); headers.insert("referrer-policy", "strict-origin-when-cross-origin".parse().unwrap()); headers.insert("cache-control", "no-store, no-cache, must-revalidate".parse().unwrap()); response } ``` | Header | Value | Protects Against | |--------|-------|------------------| | `X-Content-Type-Options` | `nosniff` | MIME type sniffing attacks | | `X-Frame-Options` | `DENY` | Clickjacking via iframes | | `X-XSS-Protection` | `1; mode=block` | Reflected XSS (legacy browsers) | | `Content-Security-Policy` | See below | XSS, code injection, data exfiltration | | `Referrer-Policy` | `strict-origin-when-cross-origin` | Referrer leakage | | `Cache-Control` | `no-store, no-cache, must-revalidate` | Sensitive data caching | ### 10.1 CSP Breakdown | Directive | Value | Purpose | |-----------|-------|---------| | `default-src` | `'self'` | Deny all external resources by default | | `script-src` | `'self' 'unsafe-inline' 'unsafe-eval' cdn.jsdelivr.net` | Allow scripts from self and CDN | | `style-src` | `'self' 'unsafe-inline' cdn.jsdelivr.net fonts.googleapis.com` | Allow styles from self, CDN, Google Fonts | | `img-src` | `'self' data:` | Allow images from self and data URIs | | `connect-src` | `'self' ws: wss:` | Allow WebSocket connections | | `font-src` | `'self' cdn.jsdelivr.net fonts.gstatic.com` | Allow fonts from CDN | | `object-src` | `'none'` | Block all plugins (Flash, Java, etc.) | | `base-uri` | `'self'` | Prevent base tag hijacking | | `form-action` | `'self'` | Restrict form submission targets | --- ## 11. GCRA Rate Limiter **Source:** `openfang-api/src/rate_limiter.rs` OpenFang uses the Generic Cell Rate Algorithm (GCRA) for cost-aware API rate limiting via the `governor` crate. ### 11.1 Algorithm GCRA is a leaky-bucket variant that tracks a single "virtual scheduling time" (TAT -- Theoretical Arrival Time) per key. Each request consumes a number of tokens proportional to its cost. The bucket refills at a constant rate. **Budget:** 500 tokens per minute per IP address. ```rust pub fn create_rate_limiter() -> Arc { Arc::new(RateLimiter::keyed(Quota::per_minute(NonZeroU32::new(500).unwrap()))) } ``` ### 11.2 Operation Costs Each API operation has a configurable token cost: ```rust pub fn operation_cost(method: &str, path: &str) -> NonZeroU32 { match (method, path) { (_, "/api/health") => 1, ("GET", "/api/status") => 1, ("GET", "/api/version") => 1, ("GET", "/api/tools") => 1, ("GET", "/api/agents") => 2, ("GET", "/api/skills") => 2, ("GET", "/api/peers") => 2, ("GET", "/api/config") => 2, ("GET", "/api/usage") => 3, ("GET", p) if p.starts_with("/api/audit") => 5, ("GET", p) if p.starts_with("/api/marketplace")=> 10, ("POST", "/api/agents") => 50, ("POST", p) if p.contains("/message") => 30, ("POST", p) if p.contains("/run") => 100, ("POST", "/api/skills/install") => 50, ("POST", "/api/skills/uninstall") => 10, ("POST", "/api/migrate") => 100, ("PUT", p) if p.contains("/update") => 10, _ => 5, } } ``` The cost hierarchy is intentional: read-only health checks cost 1 token while expensive operations like workflow runs cost 100, meaning a client can perform 500 health checks per minute but only 5 workflow runs. ### 11.3 Middleware ```rust pub async fn gcra_rate_limit( State(limiter): State>, request: Request, next: Next, ) -> Response { let ip = /* extract from ConnectInfo, default 127.0.0.1 */; let cost = operation_cost(&method, &path); if limiter.check_key_n(&ip, cost).is_err() { tracing::warn!(ip, cost, path, "GCRA rate limit exceeded"); return Response::builder() .status(StatusCode::TOO_MANY_REQUESTS) .header("retry-after", "60") .body(/* JSON error */) .unwrap_or_default(); } next.run(request).await } ``` ### 11.4 Rate Limiter Type ```rust pub type KeyedRateLimiter = RateLimiter, DefaultClock>; ``` The `DashMapStateStore` provides concurrent per-IP state with automatic stale entry cleanup. --- ## 12. Path Traversal Prevention **Source:** `openfang-runtime/src/host_functions.rs` Two functions provide defense-in-depth against directory traversal. ### 12.1 safe_resolve_path (for reads) Used for `fs_read` and `fs_list` operations where the target file must exist: ```rust fn safe_resolve_path(path: &str) -> Result { let p = Path::new(path); // Phase 1: Reject any path with ".." components for component in p.components() { if matches!(component, Component::ParentDir) { return Err(json!({"error": "Path traversal denied: '..' components forbidden"})); } } // Phase 2: Canonicalize to resolve symlinks and normalize std::fs::canonicalize(p) .map_err(|e| json!({"error": format!("Cannot resolve path: {e}")})) } ``` ### 12.2 safe_resolve_parent (for writes) Used for `fs_write` operations where the target file may not exist yet: ```rust fn safe_resolve_parent(path: &str) -> Result { let p = Path::new(path); // Phase 1: Reject ".." in any component for component in p.components() { if matches!(component, Component::ParentDir) { return Err(json!({"error": "Path traversal denied: '..' components forbidden"})); } } // Phase 2: Canonicalize the parent directory let parent = p.parent().filter(|par| !par.as_os_str().is_empty()) .ok_or_else(|| json!({"error": "Invalid path: no parent directory"}))?; let canonical_parent = std::fs::canonicalize(parent)?; // Phase 3: Belt-and-suspenders check on filename let file_name = p.file_name() .ok_or_else(|| json!({"error": "Invalid path: no file name"}))?; if file_name.to_string_lossy().contains("..") { return Err(json!({"error": "Path traversal denied in file name"})); } Ok(canonical_parent.join(file_name)) } ``` ### 12.3 Enforcement Order 1. **Capability check** runs first with the raw path. 2. **Path traversal check** runs second. 3. **Operation** runs only if both pass. This ordering ensures that even if a capability is misconfigured with a broad pattern like `"*"`, path traversal is still blocked. --- ## 13. Subprocess Sandbox **Source:** `openfang-runtime/src/subprocess_sandbox.rs` When the runtime spawns child processes (e.g., for the shell tool or skill execution), the inherited environment must be stripped to prevent accidental leakage of secrets. ### 13.1 Environment Clearing ```rust pub fn sandbox_command(cmd: &mut tokio::process::Command, allowed_env_vars: &[String]) { cmd.env_clear(); // Remove ALL inherited env vars // Re-add platform-independent safe vars for var in SAFE_ENV_VARS { if let Ok(val) = std::env::var(var) { cmd.env(var, val); } } // Re-add Windows-specific safe vars (on Windows) #[cfg(windows)] for var in SAFE_ENV_VARS_WINDOWS { /* ... */ } // Re-add caller-specified allowed vars for var in allowed_env_vars { /* ... */ } } ``` ### 13.2 Safe Environment Variables **All platforms:** ```rust pub const SAFE_ENV_VARS: &[&str] = &[ "PATH", "HOME", "TMPDIR", "TMP", "TEMP", "LANG", "LC_ALL", "TERM", ]; ``` **Windows-only:** ```rust pub const SAFE_ENV_VARS_WINDOWS: &[&str] = &[ "USERPROFILE", "SYSTEMROOT", "APPDATA", "LOCALAPPDATA", "COMSPEC", "WINDIR", "PATHEXT", ]; ``` Variables not in these lists and not in `allowed_env_vars` are **never** passed to the child process. This means `OPENAI_API_KEY`, `GEMINI_API_KEY`, database credentials, and all other secrets are stripped. ### 13.3 Executable Path Validation ```rust pub fn validate_executable_path(path: &str) -> Result<(), String> { let p = Path::new(path); for component in p.components() { if let std::path::Component::ParentDir = component { return Err(format!( "executable path '{}' contains '..' component which is not allowed", path )); } } Ok(()) } ``` This prevents an agent from escaping its working directory via crafted paths like `../../bin/dangerous`. ### 13.4 Shell Injection Prevention The `host_shell_exec` function uses `Command::new(command).args(&args)` which does **not** invoke a shell. Each argument is passed directly to the process, preventing shell injection via metacharacters like `;`, `|`, `&&`. --- ## 14. Prompt Injection Scanner **Source:** `openfang-skills/src/verify.rs` The `SkillVerifier` provides two scanning functions: `security_scan()` for skill manifests and `scan_prompt_content()` for skill prompt text (SKILL.md body). ### 14.1 Manifest Security Scan `SkillVerifier::security_scan(manifest)` inspects a skill's declared requirements: | Check | Severity | Trigger | |-------|----------|---------| | Node.js runtime | Warning | `runtime_type == SkillRuntime::Node` | | Shell execution capability | Critical | Capability contains `shellexec` or `shell_exec` | | Unrestricted network | Warning | Capability contains `netconnect(*)` | | Shell tool | Critical | Tool is `shell_exec` or `bash` | | Filesystem write tool | Warning | Tool is `file_write` or `file_delete` | | Too many tools | Info | More than 10 tools required | ### 14.2 Prompt Injection Scan `SkillVerifier::scan_prompt_content(content)` detects common attack patterns in skill prompt text: **Critical -- Prompt override attempts:** ``` "ignore previous instructions", "ignore all previous", "disregard previous", "forget your instructions", "you are now", "new instructions:", "system prompt override", "ignore the above", "do not follow", "override system" ``` **Warning -- Data exfiltration patterns:** ``` "send to http", "send to https", "post to http", "post to https", "exfiltrate", "forward all", "send all data", "base64 encode and send", "upload to" ``` **Warning -- Shell command references:** ``` "rm -rf", "chmod ", "sudo " ``` **Info -- Excessive length:** Content over 50,000 bytes triggers an info-level warning about potential LLM performance degradation. ### 14.3 SHA256 Checksum Verification ```rust pub fn verify_checksum(data: &[u8], expected_sha256: &str) -> bool { let actual = Self::sha256_hex(data); actual == expected_sha256.to_lowercase() } ``` Skills installed from ClawHub have their content verified against a known SHA256 hash to detect tampering during download. ### 14.4 Warning Structure ```rust pub struct SkillWarning { pub severity: WarningSeverity, // Info, Warning, Critical pub message: String, } ``` --- ## 15. Loop Guard **Source:** `openfang-runtime/src/loop_guard.rs` The `LoopGuard` tracks tool calls within a single agent loop execution to detect when the agent is stuck calling the same tool repeatedly. ### 15.1 Configuration ```rust pub struct LoopGuardConfig { pub warn_threshold: u32, // Default: 3 pub block_threshold: u32, // Default: 5 pub global_circuit_breaker: u32, // Default: 30 } ``` ### 15.2 Detection Algorithm 1. For each tool call, compute SHA-256 of `tool_name + "|" + serialized_params`. 2. Increment the count for that hash in a `HashMap`. 3. Increment `total_calls`. 4. Return a graduated verdict: ```rust pub fn check(&mut self, tool_name: &str, params: &serde_json::Value) -> LoopGuardVerdict { self.total_calls += 1; // Global circuit breaker if self.total_calls > self.config.global_circuit_breaker { return LoopGuardVerdict::CircuitBreak(/* ... */); } let hash = Self::compute_hash(tool_name, params); let count = self.call_counts.entry(hash).or_insert(0); *count += 1; if *count >= self.config.block_threshold { LoopGuardVerdict::Block(/* ... */) } else if *count >= self.config.warn_threshold { LoopGuardVerdict::Warn(/* ... */) } else { LoopGuardVerdict::Allow } } ``` ### 15.3 Verdict Types | Verdict | Meaning | Action | |---------|---------|--------| | `Allow` | Normal operation | Run the tool | | `Warn(msg)` | Same call repeated >= 3 times | Run, append warning to result | | `Block(msg)` | Same call repeated >= 5 times | Skip execution, return error | | `CircuitBreak(msg)` | > 30 total tool calls | Terminate the entire agent loop | ### 15.4 Hash Computation ```rust fn compute_hash(tool_name: &str, params: &serde_json::Value) -> String { let mut hasher = Sha256::new(); hasher.update(tool_name.as_bytes()); hasher.update(b"|"); let params_str = serde_json::to_string(params).unwrap_or_default(); hasher.update(params_str.as_bytes()); hex::encode(hasher.finalize()) } ``` Note: `serde_json::to_string` produces deterministic output (object keys are sorted), ensuring that semantically identical parameters produce the same hash. ### 15.5 Key Property Calls with **different parameters** are tracked separately. An agent that calls `web_search` with 10 different queries will not trigger the guard, but an agent that calls `web_search({"query": "test"})` 5 times will be blocked. --- ## 16. Session Repair **Source:** `openfang-runtime/src/session_repair.rs` Before sending message history to the LLM, this module validates and repairs common structural issues that would cause API errors. ### 16.1 Three-Phase Repair ```rust pub fn validate_and_repair(messages: &[Message]) -> Vec ``` **Phase 1 -- Collect ToolUse IDs:** Scan all messages for `ContentBlock::ToolUse { id, .. }` blocks and collect their IDs into a `HashSet`. **Phase 2 -- Filter orphans and empties:** - **Orphaned ToolResults:** `ContentBlock::ToolResult { tool_use_id, .. }` blocks where `tool_use_id` is not in the ToolUse ID set are dropped. - **Empty messages:** Messages with empty text or no content blocks are dropped. **Phase 3 -- Merge consecutive same-role messages:** The Anthropic API requires strict role alternation (user, assistant, user, assistant...). If two consecutive messages have the same role, they are merged into a single message with combined content blocks. ### 16.2 Why Each Repair Is Needed | Issue | Cause | Effect Without Repair | |-------|-------|----------------------| | Orphaned ToolResult | Compaction or truncation removed the ToolUse | API error: "tool_use_id not found" | | Empty messages | Cancelled generation, empty user submission | API error: empty content | | Consecutive same-role | Manual history editing, session repair itself | API error: role alternation violation | ### 16.3 Content Merging When merging consecutive same-role messages, both are converted to block format and concatenated: ```rust fn merge_content(dst: &mut MessageContent, src: MessageContent) { let dst_blocks = content_to_blocks(std::mem::replace(dst, MessageContent::Text(String::new()))); let src_blocks = content_to_blocks(src); let mut combined = dst_blocks; combined.extend(src_blocks); *dst = MessageContent::Blocks(combined); } ``` --- ## 17. Health Endpoint Redaction **Source:** `openfang-api/src/routes.rs` OpenFang provides two health endpoints with different information levels. ### 17.1 Public Endpoint: `GET /api/health` **No authentication required.** Returns only liveness information: ```json { "status": "ok", "version": "0.1.0" } ``` This endpoint does not expose agent count, database details, configuration warnings, uptime, or any internal system information. It is suitable for load balancer health checks. ### 17.2 Detail Endpoint: `GET /api/health/detail` **Requires authentication.** Returns full diagnostics: ```json { "status": "ok", "version": "0.1.0", "uptime_seconds": 3600, "panic_count": 0, "restart_count": 2, "agent_count": 15, "database": "connected", "config_warnings": [] } ``` ### 17.3 Localhost Fallback When no API key is configured, the `auth` middleware restricts all non-health endpoints to loopback addresses only: ```rust if api_key.is_empty() { let is_loopback = request.extensions() .get::>() .map(|ci| ci.0.ip().is_loopback()) .unwrap_or(false); if !is_loopback { return Response::builder() .status(StatusCode::FORBIDDEN) .body(/* "No API key configured. Remote access denied." */) ...; } } ``` --- ## 18. Security Configuration ### 18.1 config.toml Reference ```toml # API Authentication api_key = "your-secret-api-key" # Empty = localhost-only mode # OFP Wire Protocol [network] shared_secret = "your-pre-shared-key" # Required for OFP # WASM Sandbox [sandbox] fuel_limit = 1000000 # CPU instruction budget per execution timeout_secs = 30 # Wall-clock timeout per execution max_memory_bytes = 16777216 # 16 MB max WASM memory # Rate Limiting # 500 tokens/minute/IP (not currently configurable via config.toml) # Web Search SSRF Protection [web] # SSRF protection is always on and cannot be disabled ``` ### 18.2 Environment Variables for Secrets | Variable | Used By | |----------|---------| | `OPENAI_API_KEY` | OpenAI-compat driver | | `ANTHROPIC_API_KEY` | Anthropic driver | | `GEMINI_API_KEY` or `GOOGLE_API_KEY` | Gemini driver | | `DEEPSEEK_API_KEY` | DeepSeek provider | | `GROQ_API_KEY` | Groq provider | | `BRAVE_API_KEY` | Brave web search | | `TAVILY_API_KEY` | Tavily web search | | `PERPLEXITY_API_KEY` | Perplexity web search | All environment variable API keys are wrapped in `Zeroizing` when loaded into driver structs. ### 18.3 Capability Declaration (Agent Manifest) Capabilities are declared in the agent's TOML manifest: ```toml [agent] name = "my-agent" [[capabilities]] type = "FileRead" value = "/data/*" [[capabilities]] type = "NetConnect" value = "*.openai.com:443" [[capabilities]] type = "ToolInvoke" value = "web_search" [[capabilities]] type = "LlmMaxTokens" value = 4096 ``` ### 18.4 Loop Guard Tuning The default `LoopGuardConfig` values are: | Parameter | Default | Description | |-----------|---------|-------------| | `warn_threshold` | 3 | Identical calls before warning | | `block_threshold` | 5 | Identical calls before blocking | | `global_circuit_breaker` | 30 | Total calls before circuit break | ### 18.5 Subprocess Sandbox Allowlists To pass specific environment variables to subprocesses: ```rust sandbox_command(&mut cmd, &["MY_CUSTOM_VAR".to_string()]); ``` Only variables explicitly listed in `allowed_env_vars` (plus the safe defaults) will be inherited. --- ## 19. Security Dependencies | Crate | Purpose | |-------|---------| | `sha2` | SHA-256 hashing (audit trail, loop guard, SSRF, checksums) | | `hmac` | HMAC-SHA256 for OFP authentication | | `hex` | Hex encoding/decoding of hashes and signatures | | `subtle` | Constant-time comparison (`ConstantTimeEq`) for HMAC verification | | `ed25519-dalek` | Ed25519 signing/verification for manifest signing | | `rand` | Cryptographic RNG for key generation (`OsRng`) | | `zeroize` | `Zeroizing` wrapper for automatic secret memory wiping | | `governor` | GCRA rate limiting algorithm | | `wasmtime` | WASM sandbox with fuel + epoch metering | | `uuid` | Nonce generation for OFP handshakes | | `chrono` | ISO-8601 timestamps for audit entries | | `reqwest` | HTTP client (used inside SSRF-protected `host_net_fetch`) | ### 19.1 Why These Specific Crates - **sha2/hmac:** Part of the RustCrypto project, audited, widely used in production Rust. - **ed25519-dalek:** De facto standard Ed25519 library in Rust, extensively audited. - **subtle:** Provides constant-time operations to prevent timing side-channels. - **zeroize:** Official RustCrypto approach to zeroing secrets; integrates with `Drop`. - **governor:** Battle-tested GCRA implementation with `DashMap`-backed concurrent state. --- ## Threat Model Summary | Threat | Mitigated By | |--------|-------------| | Agent requests unauthorized file access | Capability-based security (Section 2) | | Agent spawns child with elevated privileges | Capability inheritance validation (Section 2.4) | | WASM skill runs infinite loop | Dual metering: fuel + epoch (Section 3) | | Attacker tampers with audit log | Merkle hash chain (Section 4) | | Prompt injection via external data | Taint tracking (Section 5) | | Data exfiltration via LLM | Taint sinks block Secret/PII to net_fetch (Section 5.3) | | Tampered agent manifest | Ed25519 signing (Section 6) | | SSRF to cloud metadata | Private IP + hostname blocking + DNS check (Section 7) | | API key recovery from memory dump | Zeroizing (Section 8) | | Unauthorized peer-to-peer connections | HMAC-SHA256 mutual auth (Section 9) | | XSS / clickjacking on API | Security headers (Section 10) | | API brute force / DoS | GCRA rate limiter (Section 11) | | Path traversal via `../` | safe_resolve_path / safe_resolve_parent (Section 12) | | Secret leakage to child processes | env_clear() + allowlist (Section 13) | | Malicious skills from ClawHub | Prompt injection scanner + SHA256 checksum (Section 14) | | Agent stuck in tool loop | LoopGuard with graduated response (Section 15) | | Corrupted LLM session history | Session repair (Section 16) | | Information leakage from health endpoint | Redacted public endpoint (Section 17) | | Timing attacks on HMAC verification | subtle::ConstantTimeEq (Section 9.2) | | Shell injection via metacharacters | Command::new (no shell) + env_clear (Section 13.4) | | DNS rebinding for SSRF bypass | Resolved IP check, not hostname check (Section 7.3) |