# OpenFang Security Architecture

This document provides a comprehensive technical reference for every security
system in the OpenFang Agent Operating System. All struct names, function
signatures, constant values, and algorithm descriptions are drawn directly from
the source code.

---

## Table of Contents

1. [Security Overview](#1-security-overview)
2. [Capability-Based Security](#2-capability-based-security)
3. [WASM Dual Metering](#3-wasm-dual-metering)
4. [Merkle Hash Chain Audit Trail](#4-merkle-hash-chain-audit-trail)
5. [Information Flow Taint Tracking](#5-information-flow-taint-tracking)
6. [Ed25519 Manifest Signing](#6-ed25519-manifest-signing)
7. [SSRF Protection](#7-ssrf-protection)
8. [Secret Zeroization](#8-secret-zeroization)
9. [OFP Mutual Authentication](#9-ofp-mutual-authentication)
10. [Security Headers](#10-security-headers)
11. [GCRA Rate Limiter](#11-gcra-rate-limiter)
12. [Path Traversal Prevention](#12-path-traversal-prevention)
13. [Subprocess Sandbox](#13-subprocess-sandbox)
14. [Prompt Injection Scanner](#14-prompt-injection-scanner)
15. [Loop Guard](#15-loop-guard)
16. [Session Repair](#16-session-repair)
17. [Health Endpoint Redaction](#17-health-endpoint-redaction)
18. [Security Configuration](#18-security-configuration)
19. [Security Dependencies](#19-security-dependencies)

---

## 1. Security Overview

OpenFang implements **defense-in-depth** security. No single mechanism is
trusted to be the sole protector; instead, 16 independent systems form
overlapping layers so that a failure in any one layer is caught by others.

| # | System | Crate | Protects Against |
|---|--------|-------|------------------|
| 1 | Capability-Based Security | `openfang-types` | Unauthorized actions by agents |
| 2 | WASM Dual Metering | `openfang-runtime` | Infinite loops, CPU DoS |
| 3 | Merkle Audit Trail | `openfang-runtime` | Tampered audit logs |
| 4 | Taint Tracking | `openfang-types` | Prompt injection, data exfiltration |
| 5 | Ed25519 Manifest Signing | `openfang-types` | Supply chain attacks |
| 6 | SSRF Protection | `openfang-runtime` | Server-Side Request Forgery |
| 7 | Secret Zeroization | `openfang-runtime`, `openfang-channels` | Memory forensics, key leakage |
| 8 | OFP Mutual Auth | `openfang-wire` | Unauthorized peer connections |
| 9 | Security Headers | `openfang-api` | XSS, clickjacking, MIME sniffing |
| 10 | GCRA Rate Limiter | `openfang-api` | API abuse, denial of service |
| 11 | Path Traversal Prevention | `openfang-runtime` | Directory traversal attacks |
| 12 | Subprocess Sandbox | `openfang-runtime` | Secret leakage via child processes |
| 13 | Prompt Injection Scanner | `openfang-skills` | Malicious skill prompts |
| 14 | Loop Guard | `openfang-runtime` | Stuck agent tool loops |
| 15 | Session Repair | `openfang-runtime` | Corrupted LLM conversation history |
| 16 | Health Endpoint Redaction | `openfang-api` | Information leakage |

---

## 2. Capability-Based Security

**Source:** `openfang-types/src/capability.rs`

OpenFang uses capability-based security. An agent can perform only the
actions it has been explicitly granted. Capabilities are immutable
after agent creation and are enforced at the kernel level.

### 2.1 Capability Variants

The `Capability` enum defines every permission type:

```rust
pub enum Capability {
    // Filesystem
    FileRead(String),      // Glob pattern, e.g. "/data/*"
    FileWrite(String),

    // Network
    NetConnect(String),    // Host:port pattern, e.g. "*.openai.com:443"
    NetListen(u16),

    // Tools
    ToolInvoke(String),    // Specific tool ID
    ToolAll,               // All tools (dangerous)

    // LLM
    LlmQuery(String),
    LlmMaxTokens(u64),

    // Agent interaction
    AgentSpawn,
    AgentMessage(String),
    AgentKill(String),

    // Memory
    MemoryRead(String),
    MemoryWrite(String),

    // Shell
    ShellExec(String),
    EnvRead(String),

    // OFP Wire Protocol
    OfpDiscover,
    OfpConnect(String),
    OfpAdvertise,

    // Economic
    EconSpend(f64),
    EconEarn,
    EconTransfer(String),
}
```

### 2.2 Pattern Matching

The `capability_matches(granted, required)` function implements glob-style
matching:

- **Exact match:** `"api.openai.com:443"` matches `"api.openai.com:443"`
- **Full wildcard:** `"*"` matches anything
- **Prefix wildcard:** `"*.openai.com:443"` matches `"api.openai.com:443"`
- **Suffix wildcard:** `"api.*"` matches `"api.openai.com"`
- **Middle wildcard:** `"api.*.com"` matches `"api.openai.com"`
- **ToolAll special case:** `ToolAll` grants any `ToolInvoke(_)`
- **Numeric bounds:** `LlmMaxTokens(10000)` grants `LlmMaxTokens(5000)` (granted >= required)

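
The string wildcard rules above can be illustrated with a small matcher. This is a minimal sketch, not the actual `capability_matches` implementation: the helper name `glob_match` is hypothetical, it operates on plain strings rather than `Capability` values, and it assumes a pattern contains at most one `*`.

```rust
/// Hypothetical helper: matches `value` against a pattern with at most one `*`.
/// Any further `*` characters would be treated as literal text.
pub fn glob_match(pattern: &str, value: &str) -> bool {
    match pattern.split_once('*') {
        // No wildcard: only an exact match is accepted.
        None => pattern == value,
        // One wildcard: the text before it must be a prefix of `value`,
        // the text after it a suffix, and the two must not overlap.
        Some((prefix, suffix)) => {
            value.len() >= prefix.len() + suffix.len()
                && value.starts_with(prefix)
                && value.ends_with(suffix)
        }
    }
}
```

With this sketch, `"*"` splits into an empty prefix and an empty suffix, so it accepts any value, and `"api.*.com"` rejects `"api.com"` because the prefix and suffix would have to overlap.
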
### 2.3 Enforcement Point

In the WASM sandbox, every host call is checked **before** execution by
`check_capability()` in `host_functions.rs`:

```rust
fn check_capability(
    capabilities: &[Capability],
    required: &Capability,
) -> Result<(), serde_json::Value> {
    for granted in capabilities {
        if capability_matches(granted, required) {
            return Ok(());
        }
    }
    Err(json!({"error": format!("Capability denied: {required:?}")}))
}
```

If no granted capability matches the required one, the operation returns a
JSON error immediately -- the tool is never invoked.

### 2.4 Capability Inheritance

When an agent spawns a child agent, `validate_capability_inheritance()` ensures
the child's capabilities are a **subset** of the parent's. This prevents
privilege escalation:

```rust
pub fn validate_capability_inheritance(
    parent_caps: &[Capability],
    child_caps: &[Capability],
) -> Result<(), String> {
    for child_cap in child_caps {
        let is_covered = parent_caps
            .iter()
            .any(|parent_cap| capability_matches(parent_cap, child_cap));
        if !is_covered {
            return Err(format!(
                "Privilege escalation denied: child requests {:?} \
                 but parent does not have a matching grant",
                child_cap
            ));
        }
    }
    Ok(())
}
```

The `host_agent_spawn()` function in `host_functions.rs` calls
`kernel.spawn_agent_checked(manifest_toml, Some(&state.agent_id), &state.capabilities)`
which invokes this validation before the child is created.

---

## 3. WASM Dual Metering

**Source:** `openfang-runtime/src/sandbox.rs`

Untrusted WASM modules run inside a Wasmtime sandbox with **two
independent** metering mechanisms running simultaneously.

### 3.1 Fuel Metering (Deterministic)

Fuel metering counts WASM instructions. The engine deducts fuel for every
instruction executed. When the budget is exhausted, execution traps with
`Trap::OutOfFuel`.

```rust
// SandboxConfig defaults
pub fuel_limit: u64,  // Default: 1_000_000

// Applied at execution time
if config.fuel_limit > 0 {
    store.set_fuel(config.fuel_limit)?;
}
```

After execution, fuel consumed is reported:

```rust
let fuel_remaining = store.get_fuel().unwrap_or(0);
let fuel_consumed = config.fuel_limit.saturating_sub(fuel_remaining);
```

### 3.2 Epoch Interruption (Wall-Clock)

A watchdog thread sleeps for the configured timeout, then increments the
engine epoch. When the epoch advances past the store's deadline, execution
traps with `Trap::Interrupt`.

```rust
store.set_epoch_deadline(1);
let engine_clone = engine.clone();
let timeout = config.timeout_secs.unwrap_or(30);
let _watchdog = std::thread::spawn(move || {
    std::thread::sleep(std::time::Duration::from_secs(timeout));
    engine_clone.increment_epoch();
});
```

### 3.3 Why Both?

| Property | Fuel | Epoch |
|----------|------|-------|
| **Metric** | Instruction count | Wall-clock time |
| **Precision** | Deterministic, reproducible | Non-deterministic |
| **Catches** | CPU-intensive loops | Host call blocking, I/O waits |
| **Evasion** | Can waste time in host calls | Can busy-loop cheaply |

The two mechanisms are complementary: fuel catches compute-intensive loops,
while epochs catch host-call abuse or environmental slowdowns.

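
The interaction of the two budgets can be simulated without Wasmtime. The sketch below is an illustration only: `run_metered` and `Stop` are hypothetical names, the "guest" is a plain spin loop standing in for WASM instructions, and an `AtomicU64` plays the role of the engine epoch. Whichever budget runs out first ends the loop.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Why the simulated guest stopped (hypothetical stand-in for Wasmtime traps).
#[derive(Debug, PartialEq)]
pub enum Stop {
    OutOfFuel,
    Interrupt,
}

/// Runs an endless "guest" loop under both a fuel budget and a wall-clock budget.
pub fn run_metered(fuel_limit: u64, timeout: Duration) -> Stop {
    let epoch = Arc::new(AtomicU64::new(0));
    let deadline = 1; // same idea as store.set_epoch_deadline(1)

    // Watchdog: sleep for the timeout, then advance the epoch once.
    let watchdog_epoch = Arc::clone(&epoch);
    thread::spawn(move || {
        thread::sleep(timeout);
        watchdog_epoch.fetch_add(1, Ordering::SeqCst);
    });

    let mut fuel = fuel_limit;
    loop {
        // Deterministic budget: one unit of fuel per simulated instruction.
        if fuel == 0 {
            return Stop::OutOfFuel;
        }
        fuel -= 1;
        // Wall-clock budget: stop once the epoch reaches the deadline.
        if epoch.load(Ordering::SeqCst) >= deadline {
            return Stop::Interrupt;
        }
        std::hint::spin_loop(); // stand-in for one guest instruction
    }
}
```

A small fuel budget with a long timeout stops on fuel, while an effectively unlimited fuel budget with a short timeout stops on the epoch check.
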
### 3.4 SandboxConfig

```rust
pub struct SandboxConfig {
    pub fuel_limit: u64,              // Default: 1_000_000
    pub max_memory_bytes: usize,      // Default: 16 MB
    pub capabilities: Vec<Capability>,
    pub timeout_secs: Option<u64>,    // Default: 30 seconds
}
```

### 3.5 Error Types

```rust
pub enum SandboxError {
    Compilation(String),
    Instantiation(String),
    Execution(String),
    FuelExhausted,          // Trap::OutOfFuel
    AbiError(String),
}
```

---

## 4. Merkle Hash Chain Audit Trail

**Source:** `openfang-runtime/src/audit.rs`

Every security-critical action is appended to a tamper-evident Merkle hash
chain, similar to a blockchain. Each entry contains the SHA-256 hash of its
own contents concatenated with the hash of the previous entry.

### 4.1 Auditable Actions

```rust
pub enum AuditAction {
    ToolInvoke,
    CapabilityCheck,
    AgentSpawn,
    AgentKill,
    AgentMessage,
    MemoryAccess,
    FileAccess,
    NetworkAccess,
    ShellExec,
    AuthAttempt,
    WireConnect,
    ConfigChange,
}
```

### 4.2 Entry Structure

```rust
pub struct AuditEntry {
    pub seq: u64,              // Monotonically increasing sequence number
    pub timestamp: String,     // ISO-8601
    pub agent_id: String,
    pub action: AuditAction,
    pub detail: String,        // e.g. tool name, file path
    pub outcome: String,       // "ok", "denied", error message
    pub prev_hash: String,     // SHA-256 of previous entry (or 64 zeros)
    pub hash: String,          // SHA-256 of this entry + prev_hash
}
```

### 4.3 Hash Computation

Each entry's hash is computed from all of its fields concatenated with the
previous entry's hash:

```rust
fn compute_entry_hash(
    seq: u64, timestamp: &str, agent_id: &str,
    action: &AuditAction, detail: &str,
    outcome: &str, prev_hash: &str,
) -> String {
    let mut hasher = Sha256::new();
    hasher.update(seq.to_string().as_bytes());
    hasher.update(timestamp.as_bytes());
    hasher.update(agent_id.as_bytes());
    hasher.update(action.to_string().as_bytes());
    hasher.update(detail.as_bytes());
    hasher.update(outcome.as_bytes());
    hasher.update(prev_hash.as_bytes());
    hex::encode(hasher.finalize())
}
```

### 4.4 Chain Integrity Verification

`AuditLog::verify_integrity()` walks the entire chain and recomputes every
hash. If any entry has been tampered with, the recomputed hash will not match
the stored hash, or the `prev_hash` linkage will be broken:

```rust
pub fn verify_integrity(&self) -> Result<(), String> {
    let entries = self.entries.lock().unwrap_or_else(|e| e.into_inner());
    let mut expected_prev = "0".repeat(64);  // Genesis sentinel

    for entry in entries.iter() {
        if entry.prev_hash != expected_prev {
            return Err(format!(
                "chain break at seq {}: expected prev_hash {} but found {}",
                entry.seq, expected_prev, entry.prev_hash
            ));
        }
        let recomputed = compute_entry_hash(/* ... */);
        if recomputed != entry.hash {
            return Err(format!(
                "hash mismatch at seq {}: expected {} but found {}",
                entry.seq, recomputed, entry.hash
            ));
        }
        expected_prev = entry.hash.clone();
    }
    Ok(())
}
```

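
To show how tampering surfaces during verification, here is a reduced, standard-library-only sketch of the same chaining idea. It is an illustration under loud assumptions: `Entry`, `toy_hash`, `append`, and `verify` are hypothetical names, the entry carries only `seq` and `detail`, and `DefaultHasher` stands in for SHA-256. `DefaultHasher` is not a cryptographic hash and must not be used for a real audit log.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Reduced audit entry (hypothetical; the real AuditEntry has more fields).
#[derive(Clone, Debug)]
pub struct Entry {
    pub seq: u64,
    pub detail: String,
    pub prev_hash: String,
    pub hash: String,
}

/// Stand-in for SHA-256. NOT cryptographic; for illustration only.
fn toy_hash(seq: u64, detail: &str, prev_hash: &str) -> String {
    let mut hasher = DefaultHasher::new();
    hasher.write(seq.to_string().as_bytes());
    hasher.write(detail.as_bytes());
    hasher.write(prev_hash.as_bytes());
    format!("{:016x}", hasher.finish())
}

/// Genesis sentinel: 64 zero characters, as in the document.
pub fn genesis() -> String {
    "0".repeat(64)
}

/// Appends an entry that links to the current tip of the chain.
pub fn append(chain: &mut Vec<Entry>, detail: &str) {
    let prev_hash = chain.last().map(|e| e.hash.clone()).unwrap_or_else(genesis);
    let seq = chain.len() as u64;
    let hash = toy_hash(seq, detail, &prev_hash);
    chain.push(Entry { seq, detail: detail.to_string(), prev_hash, hash });
}

/// Walks the chain, checking both the linkage and each recomputed hash.
pub fn verify(chain: &[Entry]) -> Result<(), String> {
    let mut expected_prev = genesis();
    for entry in chain {
        if entry.prev_hash != expected_prev {
            return Err(format!("chain break at seq {}", entry.seq));
        }
        if toy_hash(entry.seq, &entry.detail, &entry.prev_hash) != entry.hash {
            return Err(format!("hash mismatch at seq {}", entry.seq));
        }
        expected_prev = entry.hash.clone();
    }
    Ok(())
}
```

Editing an entry's `detail` in place is reported as a hash mismatch at that entry, while deleting an entry is reported as a chain break at the entry that followed it.
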
### 4.5 Thread Safety

`AuditLog` uses `Mutex<Vec<AuditEntry>>` and `Mutex<String>` for the tip hash.
Both locks use `unwrap_or_else(|e| e.into_inner())` to recover from poisoned
mutexes, ensuring the audit log remains available even after a panic.

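
The recovery idiom can be seen in isolation with a plain `Mutex<Vec<String>>`. This is a minimal sketch rather than `AuditLog` code: the function name `append_after_poison` and the string entries are made up for the example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons a mutex by panicking while holding it, then shows that
/// `unwrap_or_else(|e| e.into_inner())` still yields a usable guard.
pub fn append_after_poison() -> Vec<String> {
    let log = Arc::new(Mutex::new(vec!["entry-0".to_string()]));

    // A thread panics while holding the lock, which poisons the mutex.
    let poisoner = Arc::clone(&log);
    let _ = thread::spawn(move || {
        let _guard = poisoner.lock().unwrap();
        panic!("simulated panic while holding the audit lock");
    })
    .join();
    assert!(log.is_poisoned());

    // A plain `.lock().unwrap()` would panic here; into_inner() recovers the guard.
    let mut entries = log.lock().unwrap_or_else(|e| e.into_inner());
    entries.push("entry-1".to_string());
    entries.clone()
}
```

The trade-off of this idiom is that the data may be in whatever state the panicking thread left it, which is acceptable for an append-only log whose integrity is checked separately.
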
### 4.6 API

| Method | Description |
|--------|-------------|
| `AuditLog::new()` | Creates an empty log with genesis sentinel (`"0" * 64`) |
| `record(agent_id, action, detail, outcome)` | Appends an entry, returns its hash |
| `verify_integrity()` | Validates the entire chain |
| `tip_hash()` | Returns the hash of the most recent entry |
| `len()` / `is_empty()` | Entry count |
| `recent(n)` | Returns the most recent `n` entries (cloned) |

---

## 5. Information Flow Taint Tracking

**Source:** `openfang-types/src/taint.rs`

OpenFang implements a lattice-based taint propagation model that prevents
tainted values from flowing into sensitive sinks without explicit
declassification. This guards against prompt injection, data exfiltration,
and confused-deputy attacks.

### 5.1 Taint Labels

```rust
pub enum TaintLabel {
    ExternalNetwork,   // Data from external network requests
    UserInput,         // Direct user input
    Pii,               // Personally identifiable information
    Secret,            // API keys, tokens, passwords
    UntrustedAgent,    // Data from sandboxed/untrusted agents
}
```

### 5.2 Tainted Values

```rust
pub struct TaintedValue {
    pub value: String,                // The payload
    pub labels: HashSet<TaintLabel>,  // Attached taint labels
    pub source: String,               // Human-readable origin
}
```

Key methods:

| Method | Description |
|--------|-------------|
| `TaintedValue::new(value, labels, source)` | Create with labels |
| `TaintedValue::clean(value, source)` | Create with no labels (untainted) |
| `merge_taint(&mut self, other)` | Union of labels (for concatenation) |
| `check_sink(&self, sink)` | Check if value can flow to sink |
| `declassify(&mut self, label)` | Remove a specific label (explicit security decision) |
| `is_tainted(&self) -> bool` | True if any labels present |

### 5.3 Taint Sinks

A `TaintSink` defines which labels are **blocked** from reaching it:

| Sink | Blocked Labels | Rationale |
|------|---------------|-----------|
| `TaintSink::shell_exec()` | `ExternalNetwork`, `UntrustedAgent`, `UserInput` | Prevents command injection |
| `TaintSink::net_fetch()` | `Secret`, `Pii` | Prevents data exfiltration |
| `TaintSink::agent_message()` | `Secret` | Prevents secret leakage to other agents |

### 5.4 Violation Handling

When `check_sink()` finds a blocked label, it returns a `TaintViolation`:

```rust
pub struct TaintViolation {
    pub label: TaintLabel,      // The offending label
    pub sink_name: String,      // "shell_exec", "net_fetch", etc.
    pub source: String,         // Where the tainted value came from
}
```

Display: `taint violation: label 'Secret' from source 'env_var' is not allowed to reach sink 'net_fetch'`

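
The sink check itself amounts to a set intersection between a value's labels and the sink's blocked labels. The sketch below is a simplified illustration, not the code in `taint.rs`: the field names `name` and `blocked` on `TaintSink` are assumptions, and the violation is returned as a formatted `String` instead of a `TaintViolation` to keep the example short.

```rust
use std::collections::HashSet;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum TaintLabel {
    ExternalNetwork,
    UserInput,
    Pii,
    Secret,
    UntrustedAgent,
}

/// Simplified sink: a name plus the labels it refuses (field names assumed).
pub struct TaintSink {
    pub name: &'static str,
    pub blocked: HashSet<TaintLabel>,
}

impl TaintSink {
    pub fn net_fetch() -> Self {
        Self {
            name: "net_fetch",
            blocked: HashSet::from([TaintLabel::Secret, TaintLabel::Pii]),
        }
    }

    pub fn shell_exec() -> Self {
        Self {
            name: "shell_exec",
            blocked: HashSet::from([
                TaintLabel::ExternalNetwork,
                TaintLabel::UntrustedAgent,
                TaintLabel::UserInput,
            ]),
        }
    }
}

pub struct TaintedValue {
    pub value: String,
    pub labels: HashSet<TaintLabel>,
    pub source: String,
}

impl TaintedValue {
    /// Fails if any label on the value is blocked by the sink.
    pub fn check_sink(&self, sink: &TaintSink) -> Result<(), String> {
        match self.labels.iter().find(|label| sink.blocked.contains(*label)) {
            Some(label) => Err(format!(
                "taint violation: label '{:?}' from source '{}' is not allowed to reach sink '{}'",
                label, self.source, sink.name
            )),
            None => Ok(()),
        }
    }
}
```

In this sketch a value labeled only `Secret` is rejected by `net_fetch` but accepted by `shell_exec`, since the two sinks block disjoint label sets.
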
### 5.5 Declassification

Declassification is an **explicit security decision**. The caller asserts
that the value has been sanitized:

```rust
tainted.declassify(&TaintLabel::ExternalNetwork);
tainted.declassify(&TaintLabel::UserInput);
// After declassification, value can flow to shell_exec
assert!(tainted.check_sink(&TaintSink::shell_exec()).is_ok());
```

### 5.6 Taint Propagation

When two values are combined (concatenation, interpolation), the result must
carry the union of both label sets:

```rust
let mut combined = TaintedValue::new(/* ... */);
combined.merge_taint(&other_value);
// combined.labels is now the union of both
```

---

## 6. Ed25519 Manifest Signing

**Source:** `openfang-types/src/manifest_signing.rs`

Agent manifests define an agent's capabilities, tools, and configuration.
A compromised manifest can grant elevated privileges. This module provides
Ed25519-based cryptographic signing.

### 6.1 Signing Scheme

1. Compute SHA-256 of the manifest content (raw TOML text).
2. Sign the hash with Ed25519 (via `ed25519-dalek`).
3. Bundle the signature, public key, and content hash into a `SignedManifest` envelope.

### 6.2 SignedManifest Structure

```rust
pub struct SignedManifest {
    pub manifest: String,              // Raw TOML content
    pub content_hash: String,          // Hex SHA-256 of manifest
    pub signature: Vec<u8>,            // Ed25519 signature (64 bytes)
    pub signer_public_key: Vec<u8>,    // Ed25519 public key (32 bytes)
    pub signer_id: String,             // Human-readable signer ID
}
```

### 6.3 Signing

```rust
let signing_key = SigningKey::generate(&mut OsRng);
let signed = SignedManifest::sign(manifest_toml, &signing_key, "admin@org.com");
```

Internally:

```rust
pub fn sign(manifest: impl Into<String>, signing_key: &SigningKey, signer_id: impl Into<String>) -> Self {
    let manifest = manifest.into();
    let content_hash = hash_manifest(&manifest);  // SHA-256
    let signature = signing_key.sign(content_hash.as_bytes());
    let verifying_key = signing_key.verifying_key();
    Self {
        manifest,
        content_hash,
        signature: signature.to_bytes().to_vec(),
        signer_public_key: verifying_key.to_bytes().to_vec(),
        signer_id: signer_id.into(),
    }
}
```

### 6.4 Verification

Two-phase verification:

1. **Hash check:** Recompute SHA-256 of `manifest` and compare to `content_hash`.
2. **Signature check:** Verify the Ed25519 signature over `content_hash` using `signer_public_key`.

```rust
pub fn verify(&self) -> Result<(), String> {
    let recomputed = hash_manifest(&self.manifest);
    if recomputed != self.content_hash {
        return Err("content hash mismatch: ...");
    }
    let verifying_key = VerifyingKey::from_bytes(&pk_bytes)?;
    let signature = Signature::from_bytes(&sig_bytes);
    verifying_key.verify(self.content_hash.as_bytes(), &signature)
        .map_err(|e| format!("signature verification failed: {}", e))
}
```

### 6.5 Tamper Detection

- Modifying the manifest content after signing causes a **content hash mismatch**.
- Replacing the public key with a different key causes a **signature verification failure**.
- Both attacks are caught by `verify()`.

---

## 7. SSRF Protection

**Source:** `openfang-runtime/src/host_functions.rs`

The `host_net_fetch` function (WASM host call for network requests) applies
several layers of Server-Side Request Forgery (SSRF) protection.

### 7.1 Scheme Validation

Only `http://` and `https://` schemes are allowed. All others (`file://`,
`gopher://`, `ftp://`) are blocked immediately:

```rust
if !url.starts_with("http://") && !url.starts_with("https://") {
    return Err(json!({"error": "Only http:// and https:// URLs are allowed"}));
}
```

### 7.2 Hostname Blocklist

Before DNS resolution, these hostnames are blocked:

- `localhost`
- `metadata.google.internal`
- `metadata.aws.internal`
- `instance-data`
- `169.254.169.254` (AWS/GCP metadata endpoint)

### 7.3 DNS Resolution Check

After the hostname blocklist, the function resolves the hostname to IP
addresses and checks **every resolved IP** against private ranges. This is
meant to stop a harmless-looking hostname from resolving to an internal
address, as in DNS rebinding attacks:

```rust
let socket_addr = format!("{hostname}:{port}");
if let Ok(addrs) = socket_addr.to_socket_addrs() {
    for addr in addrs {
        let ip = addr.ip();
        if ip.is_loopback() || ip.is_unspecified() || is_private_ip(&ip) {
            return Err(json!({"error": format!(
                "SSRF blocked: {hostname} resolves to private IP {ip}"
            )}));
        }
    }
}
```

### 7.4 Private IP Detection

The `is_private_ip()` function covers:

**IPv4:**
- `10.0.0.0/8` -- RFC 1918
- `172.16.0.0/12` -- RFC 1918
- `192.168.0.0/16` -- RFC 1918
- `169.254.0.0/16` -- Link-local (AWS metadata)

**IPv6:**
- `fc00::/7` -- Unique Local Address
- `fe80::/10` -- Link-local

```rust
fn is_private_ip(ip: &std::net::IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            let octets = v4.octets();
            matches!(
                octets,
                [10, ..] | [172, 16..=31, ..] | [192, 168, ..] | [169, 254, ..]
            )
        }
        IpAddr::V6(v6) => {
            let segments = v6.segments();
            (segments[0] & 0xfe00) == 0xfc00 || (segments[0] & 0xffc0) == 0xfe80
        }
    }
}
```

### 7.5 Host Extraction

`extract_host_from_url()` parses the URL to extract `host:port` for both
SSRF checking and capability matching:

```
https://api.openai.com/v1/chat  ->  api.openai.com:443
http://localhost:8080/api       ->  localhost:8080
http://example.com              ->  example.com:80
```

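
The mappings above can be reproduced with a few lines of string handling. This is a sketch under assumptions, not the body of `extract_host_from_url()`: the name `extract_host_port` and the `Option<String>` return type are invented for the example, and userinfo (`user@host`) and bracketed IPv6 literals are deliberately not handled. A URL with any other scheme yields `None`, mirroring the scheme check in 7.1.

```rust
/// Hypothetical helper: returns "host:port", filling in the scheme's
/// default port (443 for https, 80 for http) when the URL has none.
pub fn extract_host_port(url: &str) -> Option<String> {
    let (rest, default_port) = if let Some(r) = url.strip_prefix("https://") {
        (r, 443)
    } else if let Some(r) = url.strip_prefix("http://") {
        (r, 80)
    } else {
        return None; // unsupported scheme
    };

    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest
        .split(|c| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");
    if authority.is_empty() {
        return None;
    }

    // Keep an explicit port; otherwise append the default one.
    if authority.contains(':') {
        Some(authority.to_string())
    } else {
        Some(format!("{authority}:{default_port}"))
    }
}
```
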
---

## 8. Secret Zeroization

**Source:** All LLM driver modules, channel adapters, and web search modules.

OpenFang uses `Zeroizing<String>` from the `zeroize` crate on every field
that holds secret material. When the value is dropped, its memory is
overwritten with zeros, preventing secrets from lingering in memory.

### 8.1 How It Works

`Zeroizing<T>` is a smart-pointer wrapper from the `zeroize` crate. It
implements `Deref<Target=T>` for transparent usage and `Drop` for automatic
zeroization:

```rust
// On Drop, the inner String's buffer is overwritten with zeros
let key = Zeroizing::new("sk-secret-key".to_string());
// Use key transparently via Deref
client.post(url).header("authorization", format!("Bearer {}", &*key));
// When key goes out of scope, memory is zeroed
```

### 8.2 Fields Using Zeroization

**LLM Drivers** (`openfang-runtime/src/drivers/`):

| Driver | Field |
|--------|-------|
| `AnthropicDriver` | `api_key: Zeroizing<String>` |
| `GeminiDriver` | `api_key: Zeroizing<String>` |
| `OpenAiCompatDriver` | `api_key: Zeroizing<String>` |

**Channel Adapters** (`openfang-channels/src/`):

| Adapter | Field(s) |
|---------|----------|
| `DiscordAdapter` | `token: Zeroizing<String>` |
| `EmailAdapter` | `password: Zeroizing<String>` |
| `BlueskyAdapter` | `app_password: Zeroizing<String>` |
| `DingTalkAdapter` | `access_token: Zeroizing<String>`, `secret: Zeroizing<String>` |
| `FeishuAdapter` | `app_secret: Zeroizing<String>` |
| `FlockAdapter` | `bot_token: Zeroizing<String>` |
| `GitterAdapter` | `token: Zeroizing<String>` |
| `GotifyAdapter` | `app_token: Zeroizing<String>`, `client_token: Zeroizing<String>` |

**Web Search** (`openfang-runtime/src/web_search.rs`):

```rust
fn resolve_api_key(env_var: &str) -> Option<Zeroizing<String>> {
    std::env::var(env_var).ok().filter(|k| !k.is_empty()).map(Zeroizing::new)
}
```

**Embedding** (`openfang-runtime/src/embedding.rs`):

| Struct | Field |
|--------|-------|
| `EmbeddingClient` | `api_key: Zeroizing<String>` |

### 8.3 Why It Matters

Without zeroization, secrets remain in memory after use until the OS
reclaims the page. An attacker with access to a core dump, swap file, or
memory forensics tool can recover API keys. `Zeroizing<String>` ensures
the secret is overwritten as soon as it is no longer needed.

---

## 9. OFP Mutual Authentication

**Source:** `openfang-wire/src/peer.rs`

The OpenFang Wire Protocol (OFP) uses HMAC-SHA256 with nonce-based mutual
authentication over TCP connections.

### 9.1 Pre-Shared Key Requirement

OFP refuses to start without a `shared_secret`:

```rust
if config.shared_secret.is_empty() {
    return Err(WireError::HandshakeFailed(
        "OFP requires shared_secret. Set [network] shared_secret in config.toml".into(),
    ));
}
```

### 9.2 HMAC Functions

```rust
type HmacSha256 = Hmac<Sha256>;

fn hmac_sign(secret: &str, data: &[u8]) -> String {
    let mut mac = HmacSha256::new_from_slice(secret.as_bytes())
        .expect("HMAC accepts any key size");
    mac.update(data);
    hex::encode(mac.finalize().into_bytes())
}

fn hmac_verify(secret: &str, data: &[u8], signature: &str) -> bool {
    let expected = hmac_sign(secret, data);
    subtle::ConstantTimeEq::ct_eq(expected.as_bytes(), signature.as_bytes()).into()
}
```

**Constant-time comparison** (`subtle::ConstantTimeEq`) prevents
timing side-channel attacks.

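
To make the idea of a constant-time comparison concrete, the sketch below compares two byte slices without returning at the first mismatch. It is an illustration of the principle only, and the function name `ct_eq_sketch` is made up. It is not a substitute for `subtle`, which also takes measures against compiler optimizations that could reintroduce an early exit. The length check does return early, which is acceptable here because hex-encoded HMAC-SHA256 values have a fixed, public length.

```rust
/// Illustrative comparison that inspects every byte regardless of where
/// the first difference occurs. Use `subtle::ConstantTimeEq` in real code.
pub fn ct_eq_sketch(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // lengths are assumed to be public information
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate differences instead of branching on them.
        diff |= x ^ y;
    }
    diff == 0
}
```

A naive `==` on slices may stop at the first differing byte, so the time it takes can reveal how many leading bytes of a forged signature were correct.
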
### 9.3 Handshake Protocol

**Initiator (client):**

1. Generate a random UUID nonce.
2. Compute `auth_data = nonce + node_id`.
3. Compute `auth_hmac = hmac_sign(shared_secret, auth_data)`.
4. Send `Handshake { node_id, node_name, protocol_version, agents, nonce, auth_hmac }`.

**Responder (server):**

1. Receive the `Handshake` message.
2. Verify the incoming HMAC: `hmac_verify(shared_secret, nonce + node_id, auth_hmac)`.
3. If verification fails, return error code 403.
4. Generate a new UUID nonce for the ack.
5. Compute `ack_auth_data = ack_nonce + self.node_id`.
6. Compute `ack_hmac = hmac_sign(shared_secret, ack_auth_data)`.
7. Send `HandshakeAck { node_id, node_name, protocol_version, agents, nonce: ack_nonce, auth_hmac: ack_hmac }`.

**Initiator (verification):**

1. Receive `HandshakeAck`.
2. Verify: `hmac_verify(shared_secret, ack_nonce + node_id, ack_hmac)`, where `node_id` is the responder's ID from the `HandshakeAck`.
3. If verification fails, return `WireError::HandshakeFailed`.

### 9.4 Security Properties

| Property | How It Is Achieved |
|----------|-------------------|
| **Mutual authentication** | Both sides prove knowledge of the shared secret |
| **Replay protection** | Random UUID nonces per handshake |
| **Timing-attack resistance** | `subtle::ConstantTimeEq` for HMAC comparison |
| **Mandatory secret** | OFP refuses to start with an empty `shared_secret` |
| **Message size limit** | `MAX_MESSAGE_SIZE = 16 MB` prevents memory DoS |
| **Protocol version check** | `PROTOCOL_VERSION` mismatch returns `WireError::VersionMismatch` |

---

## 10. Security Headers

**Source:** `openfang-api/src/middleware.rs`

The `security_headers` middleware is applied to **all** API responses:

```rust
pub async fn security_headers(request: Request<Body>, next: Next) -> Response<Body> {
    let mut response = next.run(request).await;
    let headers = response.headers_mut();
    headers.insert("x-content-type-options", "nosniff".parse().unwrap());
    headers.insert("x-frame-options", "DENY".parse().unwrap());
    headers.insert("x-xss-protection", "1; mode=block".parse().unwrap());
    headers.insert("content-security-policy", /* CSP policy */);
    headers.insert("referrer-policy", "strict-origin-when-cross-origin".parse().unwrap());
    headers.insert("cache-control", "no-store, no-cache, must-revalidate".parse().unwrap());
    response
}
```

| Header | Value | Protects Against |
|--------|-------|------------------|
| `X-Content-Type-Options` | `nosniff` | MIME type sniffing attacks |
| `X-Frame-Options` | `DENY` | Clickjacking via iframes |
| `X-XSS-Protection` | `1; mode=block` | Reflected XSS (legacy browsers) |
| `Content-Security-Policy` | See below | XSS, code injection, data exfiltration |
| `Referrer-Policy` | `strict-origin-when-cross-origin` | Referrer leakage |
| `Cache-Control` | `no-store, no-cache, must-revalidate` | Sensitive data caching |

### 10.1 CSP Breakdown

| Directive | Value | Purpose |
|-----------|-------|---------|
| `default-src` | `'self'` | Deny all external resources by default |
| `script-src` | `'self' 'unsafe-inline' 'unsafe-eval' cdn.jsdelivr.net` | Allow scripts from self and CDN, plus inline scripts and `eval` |
| `style-src` | `'self' 'unsafe-inline' cdn.jsdelivr.net fonts.googleapis.com` | Allow styles from self, CDN, Google Fonts, plus inline styles |
| `img-src` | `'self' data:` | Allow images from self and data URIs |
| `connect-src` | `'self' ws: wss:` | Allow WebSocket connections |
| `font-src` | `'self' cdn.jsdelivr.net fonts.gstatic.com` | Allow fonts from self, CDN, Google Fonts |
| `object-src` | `'none'` | Block all plugins (Flash, Java, etc.) |
| `base-uri` | `'self'` | Prevent base tag hijacking |
| `form-action` | `'self'` | Restrict form submission targets |

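
A CSP header value is a `; `-separated list of `directive value` pairs, so the table above maps directly onto a single string. The sketch below shows one way to assemble it. The names `CSP_DIRECTIVES` and `build_csp` are hypothetical, and the exact string that `middleware.rs` passes to `headers.insert` is not shown in this document, so the ordering and separators here are assumptions.

```rust
/// Directive/value pairs copied from the CSP breakdown table.
pub const CSP_DIRECTIVES: &[(&str, &str)] = &[
    ("default-src", "'self'"),
    ("script-src", "'self' 'unsafe-inline' 'unsafe-eval' cdn.jsdelivr.net"),
    ("style-src", "'self' 'unsafe-inline' cdn.jsdelivr.net fonts.googleapis.com"),
    ("img-src", "'self' data:"),
    ("connect-src", "'self' ws: wss:"),
    ("font-src", "'self' cdn.jsdelivr.net fonts.gstatic.com"),
    ("object-src", "'none'"),
    ("base-uri", "'self'"),
    ("form-action", "'self'"),
];

/// Joins the pairs into one Content-Security-Policy header value.
pub fn build_csp() -> String {
    CSP_DIRECTIVES
        .iter()
        .map(|(name, value)| format!("{name} {value}"))
        .collect::<Vec<_>>()
        .join("; ")
}
```
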
---

## 11. GCRA Rate Limiter

**Source:** `openfang-api/src/rate_limiter.rs`

OpenFang uses the Generic Cell Rate Algorithm (GCRA) for cost-aware API
rate limiting via the `governor` crate.

### 11.1 Algorithm

GCRA is a leaky-bucket variant that tracks a single "virtual scheduling time"
(TAT -- Theoretical Arrival Time) per key. Each request consumes a number of
tokens proportional to its cost. The bucket refills at a constant rate.

**Budget:** 500 tokens per minute per IP address.

```rust
pub fn create_rate_limiter() -> Arc<KeyedRateLimiter> {
    Arc::new(RateLimiter::keyed(Quota::per_minute(NonZeroU32::new(500).unwrap())))
}
```

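
The TAT bookkeeping can be made concrete with a single-key, standard-library-only sketch. This is a textbook-style GCRA written for illustration, not the internals of the `governor` crate: the `Gcra` type and its fields are hypothetical, time is passed in explicitly as nanoseconds so the behaviour is deterministic, and the full budget is assumed to be available as an initial burst.

```rust
use std::num::NonZeroU32;
use std::time::Duration;

/// Single-key GCRA state (hypothetical type for illustration).
pub struct Gcra {
    emission_ns: u64, // time "paid" per token: period / tokens
    capacity_ns: u64, // how far the TAT may run ahead of `now`
    tat_ns: u64,      // theoretical arrival time
}

impl Gcra {
    pub fn new(tokens: NonZeroU32, period: Duration) -> Self {
        let tokens = u64::from(tokens.get());
        let emission_ns = period.as_nanos() as u64 / tokens;
        Self { emission_ns, capacity_ns: emission_ns * tokens, tat_ns: 0 }
    }

    /// Tries to spend `cost` tokens at time `now_ns`; returns true if allowed.
    pub fn check_n(&mut self, now_ns: u64, cost: NonZeroU32) -> bool {
        // Each token pushes the TAT forward by one emission interval.
        let new_tat = self.tat_ns.max(now_ns) + self.emission_ns * u64::from(cost.get());
        if new_tat - now_ns > self.capacity_ns {
            return false; // over budget: state is left unchanged
        }
        self.tat_ns = new_tat;
        true
    }
}
```

With 500 tokens per minute, one token is worth 120 ms. Five requests costing 100 tokens each exhaust the budget at once, and a sixth is admitted only after 12 seconds have passed, when 100 tokens' worth of time has drained.
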
### 11.2 Operation Costs

Each API operation is assigned a token cost:

```rust
pub fn operation_cost(method: &str, path: &str) -> NonZeroU32 {
    match (method, path) {
        (_, "/api/health")                               => 1,
        ("GET", "/api/status")                           => 1,
        ("GET", "/api/version")                          => 1,
        ("GET", "/api/tools")                            => 1,
        ("GET", "/api/agents")                           => 2,
        ("GET", "/api/skills")                           => 2,
        ("GET", "/api/peers")                            => 2,
        ("GET", "/api/config")                           => 2,
        ("GET", "/api/usage")                            => 3,
        ("GET", p) if p.starts_with("/api/audit")        => 5,
        ("GET", p) if p.starts_with("/api/marketplace")  => 10,
        ("POST", "/api/agents")                          => 50,
        ("POST", p) if p.contains("/message")            => 30,
        ("POST", p) if p.contains("/run")                => 100,
        ("POST", "/api/skills/install")                  => 50,
        ("POST", "/api/skills/uninstall")                => 10,
        ("POST", "/api/migrate")                         => 100,
        ("PUT", p) if p.contains("/update")              => 10,
        _                                                => 5,
    }
}
```

The cost hierarchy is intentional: read-only health checks cost 1 token while
expensive operations like workflow runs cost 100, meaning a client can perform
500 health checks per minute but only 5 workflow runs.

### 11.3 Middleware

```rust
pub async fn gcra_rate_limit(
    State(limiter): State<Arc<KeyedRateLimiter>>,
    request: Request<Body>,
    next: Next,
) -> Response<Body> {
    let ip = /* extract from ConnectInfo, default 127.0.0.1 */;
    let cost = operation_cost(&method, &path);

    if limiter.check_key_n(&ip, cost).is_err() {
        tracing::warn!(ip, cost, path, "GCRA rate limit exceeded");
        return Response::builder()
            .status(StatusCode::TOO_MANY_REQUESTS)
            .header("retry-after", "60")
            .body(/* JSON error */)
            .unwrap_or_default();
    }
    next.run(request).await
}
```

### 11.4 Rate Limiter Type

```rust
pub type KeyedRateLimiter = RateLimiter<IpAddr, DashMapStateStore<IpAddr>, DefaultClock>;
```

The `DashMapStateStore` provides concurrent per-IP state with automatic stale
entry cleanup.

---

## 12. Path Traversal Prevention

**Source:** `openfang-runtime/src/host_functions.rs`

Two functions provide defense-in-depth against directory traversal.

### 12.1 safe_resolve_path (for reads)

Used for `fs_read` and `fs_list` operations where the target file must exist:

```rust
fn safe_resolve_path(path: &str) -> Result<std::path::PathBuf, serde_json::Value> {
    let p = Path::new(path);

    // Phase 1: Reject any path with ".." components
    for component in p.components() {
        if matches!(component, Component::ParentDir) {
            return Err(json!({"error": "Path traversal denied: '..' components forbidden"}));
        }
    }

    // Phase 2: Canonicalize to resolve symlinks and normalize
    std::fs::canonicalize(p)
        .map_err(|e| json!({"error": format!("Cannot resolve path: {e}")}))
}
```

### 12.2 safe_resolve_parent (for writes)

Used for `fs_write` operations where the target file may not exist yet:

```rust
fn safe_resolve_parent(path: &str) -> Result<std::path::PathBuf, serde_json::Value> {
    let p = Path::new(path);

    // Phase 1: Reject ".." in any component
    for component in p.components() {
        if matches!(component, Component::ParentDir) {
            return Err(json!({"error": "Path traversal denied: '..' components forbidden"}));
        }
    }

    // Phase 2: Canonicalize the parent directory
    let parent = p.parent().filter(|par| !par.as_os_str().is_empty())
        .ok_or_else(|| json!({"error": "Invalid path: no parent directory"}))?;
    // A bare `?` would not compile here: io::Error has no conversion into
    // serde_json::Value, so the error is mapped explicitly.
    let canonical_parent = std::fs::canonicalize(parent)
        .map_err(|e| json!({"error": format!("Cannot resolve parent directory: {e}")}))?;

    // Phase 3: Belt-and-suspenders check on filename
    let file_name = p.file_name()
        .ok_or_else(|| json!({"error": "Invalid path: no file name"}))?;
    if file_name.to_string_lossy().contains("..") {
        return Err(json!({"error": "Path traversal denied in file name"}));
    }

    Ok(canonical_parent.join(file_name))
}
```

### 12.3 Enforcement Order

1. **Capability check** runs first with the raw path.
2. **Path traversal check** runs second.
3. **Operation** runs only if both pass.

Because the traversal check runs independently of the capability check, a
capability misconfigured with a broad pattern like `"*"` still cannot be used
for path traversal.

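The ordering above can be sketched with the standard library alone. The names
`capability_allows`, `has_parent_dir`, `check_access`, and `Denied` are
hypothetical, and the pattern matcher is a deliberately simplified stand-in
(exact match, or prefix match on a trailing `*`), not OpenFang's capability
matcher.

```rust
use std::path::{Component, Path};

#[derive(Debug, PartialEq)]
enum Denied {
    Capability,
    Traversal,
}

// Hypothetical stand-in for the real capability matcher:
// a trailing '*' is a prefix match, anything else must match exactly.
fn capability_allows(pattern: &str, raw_path: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => raw_path.starts_with(prefix),
        None => pattern == raw_path,
    }
}

// Same test as Phase 1 of the resolvers: any ".." component is rejected.
fn has_parent_dir(raw_path: &str) -> bool {
    Path::new(raw_path)
        .components()
        .any(|c| matches!(c, Component::ParentDir))
}

// Step 1: capability check on the raw path. Step 2: traversal check.
// The caller performs the operation only when this returns Ok(()).
fn check_access(pattern: &str, raw_path: &str) -> Result<(), Denied> {
    if !capability_allows(pattern, raw_path) {
        return Err(Denied::Capability);
    }
    if has_parent_dir(raw_path) {
        return Err(Denied::Traversal);
    }
    Ok(())
}
```

In this sketch, a wildcard capability still fails on `/data/../etc/passwd`
because the second check rejects the `..` component regardless of what the
first check allowed.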
---

## 13. Subprocess Sandbox

**Source:** `openfang-runtime/src/subprocess_sandbox.rs`

When the runtime spawns child processes (e.g., for the shell tool or skill
execution), the inherited environment must be stripped to prevent accidental
leakage of secrets.

### 13.1 Environment Clearing

```rust
pub fn sandbox_command(cmd: &mut tokio::process::Command, allowed_env_vars: &[String]) {
    cmd.env_clear();  // Remove ALL inherited env vars

    // Re-add platform-independent safe vars
    for var in SAFE_ENV_VARS {
        if let Ok(val) = std::env::var(var) {
            cmd.env(var, val);
        }
    }

    // Re-add Windows-specific safe vars (on Windows)
    #[cfg(windows)]
    for var in SAFE_ENV_VARS_WINDOWS { /* ... */ }

    // Re-add caller-specified allowed vars
    for var in allowed_env_vars { /* ... */ }
}
```

### 13.2 Safe Environment Variables

**All platforms:**

```rust
pub const SAFE_ENV_VARS: &[&str] = &[
    "PATH", "HOME", "TMPDIR", "TMP", "TEMP", "LANG", "LC_ALL", "TERM",
];
```

**Windows-only:**

```rust
pub const SAFE_ENV_VARS_WINDOWS: &[&str] = &[
    "USERPROFILE", "SYSTEMROOT", "APPDATA", "LOCALAPPDATA",
    "COMSPEC", "WINDIR", "PATHEXT",
];
```

Variables not in these lists and not in `allowed_env_vars` are **never**
passed to the child process. This means `OPENAI_API_KEY`, `GEMINI_API_KEY`,
database credentials, and all other secrets are stripped.

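The allowlist behaviour can be checked without spawning anything. The sketch
below is an assumption-laden illustration: it uses `std::process::Command`
rather than the `tokio::process::Command` used by the runtime, and
`sandboxed_command` and `passes_through` are hypothetical helper names.

```rust
use std::ffi::OsStr;
use std::process::Command;

const SAFE_ENV_VARS: &[&str] = &[
    "PATH", "HOME", "TMPDIR", "TMP", "TEMP", "LANG", "LC_ALL", "TERM",
];

// Build a command whose environment holds only allowlisted variables.
fn sandboxed_command(program: &str, allowed_env_vars: &[&str]) -> Command {
    let mut cmd = Command::new(program);
    cmd.env_clear(); // drop everything inherited from the parent
    for var in SAFE_ENV_VARS.iter().chain(allowed_env_vars) {
        // Only copy variables that are actually set in the parent.
        if let Ok(val) = std::env::var(var) {
            cmd.env(var, val);
        }
    }
    cmd
}

// True if `name` was explicitly set on the command and will reach the child.
fn passes_through(cmd: &Command, name: &str) -> bool {
    cmd.get_envs()
        .any(|(key, val)| key == OsStr::new(name) && val.is_some())
}
```

`Command::get_envs` lists only the variables set explicitly on the command, so
after `env_clear()` it reflects exactly what the child would receive.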
### 13.3 Executable Path Validation

```rust
pub fn validate_executable_path(path: &str) -> Result<(), String> {
    let p = Path::new(path);
    for component in p.components() {
        if let std::path::Component::ParentDir = component {
            return Err(format!(
                "executable path '{}' contains '..' component which is not allowed",
                path
            ));
        }
    }
    Ok(())
}
```

This prevents an agent from escaping its working directory via crafted paths
like `../../bin/dangerous`. The check only rejects `..` components; it does
not by itself restrict absolute paths.

### 13.4 Shell Injection Prevention

The `host_shell_exec` function uses `Command::new(command).args(&args)` which
does **not** invoke a shell. Each argument is passed directly to the
process, preventing shell injection via metacharacters like `;`, `|`, `&&`.

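A small std-only sketch makes the point concrete. It uses
`std::process::Command` as an assumption (the runtime's own command type may
differ), and `build_command` and `arg_list` are hypothetical helpers. Nothing
is executed; the sketch only inspects the argument vector that would be handed
to the process.

```rust
use std::ffi::OsStr;
use std::process::Command;

// Build a command without going through a shell.
fn build_command(program: &str, args: &[&str]) -> Command {
    let mut cmd = Command::new(program);
    cmd.args(args); // each element becomes exactly one argv entry
    cmd
}

// Return the arguments exactly as the child process would receive them.
fn arg_list(cmd: &Command) -> Vec<&OsStr> {
    cmd.get_args().collect()
}
```

An argument such as `hello; rm -rf /tmp/x` stays a single literal argv entry,
because no shell ever parses the `;`.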
---

## 14. Prompt Injection Scanner

**Source:** `openfang-skills/src/verify.rs`

The `SkillVerifier` provides two scanning functions: `security_scan()` for
skill manifests and `scan_prompt_content()` for skill prompt text (SKILL.md
body).

### 14.1 Manifest Security Scan

`SkillVerifier::security_scan(manifest)` inspects a skill's declared
requirements:

| Check | Severity | Trigger |
|-------|----------|---------|
| Node.js runtime | Warning | `runtime_type == SkillRuntime::Node` |
| Shell execution capability | Critical | Capability contains `shellexec` or `shell_exec` |
| Unrestricted network | Warning | Capability contains `netconnect(*)` |
| Shell tool | Critical | Tool is `shell_exec` or `bash` |
| Filesystem write tool | Warning | Tool is `file_write` or `file_delete` |
| Too many tools | Info | More than 10 tools required |

### 14.2 Prompt Injection Scan

`SkillVerifier::scan_prompt_content(content)` detects common attack patterns
in skill prompt text:

**Critical -- Prompt override attempts:**

```
"ignore previous instructions", "ignore all previous",
"disregard previous", "forget your instructions",
"you are now", "new instructions:", "system prompt override",
"ignore the above", "do not follow", "override system"
```

**Warning -- Data exfiltration patterns:**

```
"send to http", "send to https", "post to http", "post to https",
"exfiltrate", "forward all", "send all data",
"base64 encode and send", "upload to"
```

**Warning -- Shell command references:**

```
"rm -rf", "chmod ", "sudo "
```

**Info -- Excessive length:**

Content over 50,000 bytes triggers an info-level warning about potential LLM
performance degradation.

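A pattern scan of this kind can be sketched in a few lines. The sketch below
is illustrative only: it assumes case-insensitive substring matching, carries
just a subset of the patterns listed above, and `scan_prompt`, `Severity`, and
the constant names are hypothetical rather than the `SkillVerifier` API.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Severity {
    Info,
    Warning,
    Critical,
}

// Subsets of the pattern lists above, kept short for illustration.
const OVERRIDE_PATTERNS: &[&str] =
    &["ignore previous instructions", "you are now", "override system"];
const EXFIL_PATTERNS: &[&str] = &["send to http", "exfiltrate", "upload to"];
const SHELL_PATTERNS: &[&str] = &["rm -rf", "chmod ", "sudo "];
const MAX_PROMPT_BYTES: usize = 50_000;

// Returns one finding per matched pattern, in pattern-list order.
fn scan_prompt(content: &str) -> Vec<(Severity, &'static str)> {
    // Assumption: matching is case-insensitive.
    let lower = content.to_lowercase();
    let mut findings = Vec::new();
    let groups = [
        (Severity::Critical, OVERRIDE_PATTERNS),
        (Severity::Warning, EXFIL_PATTERNS),
        (Severity::Warning, SHELL_PATTERNS),
    ];
    for (severity, patterns) in groups {
        for &pattern in patterns {
            if lower.contains(pattern) {
                findings.push((severity, pattern));
            }
        }
    }
    if content.len() > MAX_PROMPT_BYTES {
        findings.push((Severity::Info, "content exceeds 50,000 bytes"));
    }
    findings
}
```

Substring matching is cheap and easy to audit, but it is a heuristic: benign
text that happens to contain a pattern is flagged too, which is one reason the
findings carry a severity instead of acting as a hard block.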
### 14.3 SHA256 Checksum Verification

```rust
pub fn verify_checksum(data: &[u8], expected_sha256: &str) -> bool {
    let actual = Self::sha256_hex(data);
    actual == expected_sha256.to_lowercase()
}
```

Skills installed from ClawHub have their content verified against a known
SHA256 hash to detect tampering during download.

### 14.4 Warning Structure

```rust
pub struct SkillWarning {
    pub severity: WarningSeverity,  // Info, Warning, Critical
    pub message: String,
}
```

---

## 15. Loop Guard

**Source:** `openfang-runtime/src/loop_guard.rs`

The `LoopGuard` tracks tool calls within a single agent loop execution to
detect when the agent is stuck calling the same tool repeatedly.

### 15.1 Configuration

```rust
pub struct LoopGuardConfig {
    pub warn_threshold: u32,          // Default: 3
    pub block_threshold: u32,         // Default: 5
    pub global_circuit_breaker: u32,  // Default: 30
}
```

### 15.2 Detection Algorithm

1. Increment `total_calls`; if it exceeds `global_circuit_breaker`, return
   `CircuitBreak` immediately.
2. Compute SHA-256 of `tool_name + "|" + serialized_params`.
3. Increment the count for that hash in a `HashMap<String, u32>`.
4. Return a graduated verdict based on that count:

```rust
pub fn check(&mut self, tool_name: &str, params: &serde_json::Value) -> LoopGuardVerdict {
    self.total_calls += 1;

    // Global circuit breaker
    if self.total_calls > self.config.global_circuit_breaker {
        return LoopGuardVerdict::CircuitBreak(/* ... */);
    }

    let hash = Self::compute_hash(tool_name, params);
    let count = self.call_counts.entry(hash).or_insert(0);
    *count += 1;

    if *count >= self.config.block_threshold {
        LoopGuardVerdict::Block(/* ... */)
    } else if *count >= self.config.warn_threshold {
        LoopGuardVerdict::Warn(/* ... */)
    } else {
        LoopGuardVerdict::Allow
    }
}
```

### 15.3 Verdict Types

| Verdict | Meaning | Action |
|---------|---------|--------|
| `Allow` | Normal operation | Run the tool |
| `Warn(msg)` | Same call repeated >= 3 times | Run, append warning to result |
| `Block(msg)` | Same call repeated >= 5 times | Skip execution, return error |
| `CircuitBreak(msg)` | > 30 total tool calls | Terminate the entire agent loop |

### 15.4 Hash Computation

```rust
fn compute_hash(tool_name: &str, params: &serde_json::Value) -> String {
    let mut hasher = Sha256::new();
    hasher.update(tool_name.as_bytes());
    hasher.update(b"|");
    let params_str = serde_json::to_string(params).unwrap_or_default();
    hasher.update(params_str.as_bytes());
    hex::encode(hasher.finalize())
}
```

Note: with serde_json's default `BTreeMap`-backed `Map`, `serde_json::to_string`
emits object keys in sorted order, so semantically identical parameters produce
the same hash. If the `preserve_order` feature is enabled, keys are emitted in
insertion order instead and this guarantee no longer holds.

### 15.5 Key Property

Calls with **different parameters** are tracked separately. An agent that
calls `web_search` with 10 different queries will not trigger the guard, but
an agent that calls `web_search({"query": "test"})` 5 times will be blocked.

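The graduated response can be traced with a simplified, std-only sketch. Two
assumptions are made to keep it self-contained: parameters are passed as an
already-serialized string, and the `(tool, params)` pair is used directly as
the map key instead of its SHA-256 digest. `Guard` and `Verdict` are
hypothetical names, and the verdicts carry no message.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Verdict {
    Allow,
    Warn,
    Block,
    CircuitBreak,
}

struct Guard {
    warn_threshold: u32,
    block_threshold: u32,
    global_circuit_breaker: u32,
    total_calls: u32,
    // Keyed by (tool, serialized params) rather than a SHA-256 digest.
    call_counts: HashMap<(String, String), u32>,
}

impl Guard {
    // Uses the documented defaults: warn at 3, block at 5, break after 30.
    fn new() -> Self {
        Self {
            warn_threshold: 3,
            block_threshold: 5,
            global_circuit_breaker: 30,
            total_calls: 0,
            call_counts: HashMap::new(),
        }
    }

    fn check(&mut self, tool: &str, params: &str) -> Verdict {
        self.total_calls += 1;
        if self.total_calls > self.global_circuit_breaker {
            return Verdict::CircuitBreak;
        }
        let key = (tool.to_owned(), params.to_owned());
        let count = self.call_counts.entry(key).or_insert(0);
        *count += 1;
        if *count >= self.block_threshold {
            Verdict::Block
        } else if *count >= self.warn_threshold {
            Verdict::Warn
        } else {
            Verdict::Allow
        }
    }
}
```

With the defaults, identical calls one and two are allowed, calls three and
four are allowed with a warning, and call five is blocked. A call to the same
tool with different parameters starts its own count.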
---

## 16. Session Repair

**Source:** `openfang-runtime/src/session_repair.rs`

Before sending message history to the LLM, this module validates and repairs
common structural issues that would cause API errors.

### 16.1 Three-Phase Repair

```rust
pub fn validate_and_repair(messages: &[Message]) -> Vec<Message>
```

**Phase 1 -- Collect ToolUse IDs:**

Scan all messages for `ContentBlock::ToolUse { id, .. }` blocks and collect
their IDs into a `HashSet<String>`.

**Phase 2 -- Filter orphans and empties:**

- **Orphaned ToolResults:** `ContentBlock::ToolResult { tool_use_id, .. }`
  blocks where `tool_use_id` is not in the ToolUse ID set are dropped.
- **Empty messages:** Messages with empty text or no content blocks are
  dropped.

**Phase 3 -- Merge consecutive same-role messages:**

The Anthropic API requires strict role alternation (user, assistant, user,
assistant...). If two consecutive messages have the same role, they are
merged into a single message with combined content blocks.

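The three phases can be sketched over simplified types. Everything below is an
illustration under stated assumptions: `Role`, `Block`, `Message`, and
`repair` are hypothetical stand-ins for the runtime's `Message`,
`ContentBlock`, and `validate_and_repair`, and messages always hold a list of
blocks rather than either text or blocks.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
enum Block {
    Text(String),
    ToolUse { id: String },
    ToolResult { tool_use_id: String },
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    blocks: Vec<Block>,
}

fn repair(messages: &[Message]) -> Vec<Message> {
    // Phase 1: collect every ToolUse id in the history.
    let ids: HashSet<&str> = messages
        .iter()
        .flat_map(|m| &m.blocks)
        .filter_map(|b| match b {
            Block::ToolUse { id } => Some(id.as_str()),
            _ => None,
        })
        .collect();

    let mut out: Vec<Message> = Vec::new();
    for m in messages {
        // Phase 2: drop orphaned ToolResults and empty text blocks.
        let blocks: Vec<Block> = m
            .blocks
            .iter()
            .filter(|b| match b {
                Block::ToolResult { tool_use_id } => ids.contains(tool_use_id.as_str()),
                Block::Text(t) => !t.is_empty(),
                Block::ToolUse { .. } => true,
            })
            .cloned()
            .collect();
        if blocks.is_empty() {
            continue; // nothing left: drop the whole message
        }
        // Phase 3: merge into the previous message when the role repeats.
        let same_role = out.last().map_or(false, |prev| prev.role == m.role);
        if same_role {
            if let Some(prev) = out.last_mut() {
                prev.blocks.extend(blocks);
            }
        } else {
            out.push(Message { role: m.role.clone(), blocks });
        }
    }
    out
}
```

Doing the merge after the filter matters: dropping an orphaned or empty
message can leave two same-role messages adjacent, and the final phase is what
restores alternation in that case.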
### 16.2 Why Each Repair Is Needed

| Issue | Cause | Effect Without Repair |
|-------|-------|----------------------|
| Orphaned ToolResult | Compaction or truncation removed the ToolUse | API error: "tool_use_id not found" |
| Empty messages | Cancelled generation, empty user submission | API error: empty content |
| Consecutive same-role | Manual history editing, session repair itself | API error: role alternation violation |

### 16.3 Content Merging

When merging consecutive same-role messages, both are converted to block
format and concatenated:

```rust
fn merge_content(dst: &mut MessageContent, src: MessageContent) {
    let dst_blocks = content_to_blocks(std::mem::replace(dst, MessageContent::Text(String::new())));
    let src_blocks = content_to_blocks(src);
    let mut combined = dst_blocks;
    combined.extend(src_blocks);
    *dst = MessageContent::Blocks(combined);
}
```

---

## 17. Health Endpoint Redaction

**Source:** `openfang-api/src/routes.rs`

OpenFang provides two health endpoints with different information levels.

### 17.1 Public Endpoint: `GET /api/health`

**No authentication required.** Returns only liveness information:

```json
{
  "status": "ok",
  "version": "0.1.0"
}
```

This endpoint does not expose agent count, database details, configuration
warnings, uptime, or any internal system information. It is suitable for
load balancer health checks.

### 17.2 Detail Endpoint: `GET /api/health/detail`

**Requires authentication.** Returns full diagnostics:

```json
{
  "status": "ok",
  "version": "0.1.0",
  "uptime_seconds": 3600,
  "panic_count": 0,
  "restart_count": 2,
  "agent_count": 15,
  "database": "connected",
  "config_warnings": []
}
```

### 17.3 Localhost Fallback

When no API key is configured, the `auth` middleware restricts all
non-health endpoints to loopback addresses only:

```rust
if api_key.is_empty() {
    let is_loopback = request.extensions()
        .get::<ConnectInfo<SocketAddr>>()
        .map(|ci| ci.0.ip().is_loopback())
        .unwrap_or(false);
    if !is_loopback {
        return Response::builder()
            .status(StatusCode::FORBIDDEN)
            .body(/* "No API key configured. Remote access denied." */)
            ...;
    }
}
```

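The decision itself reduces to a small pure function, sketched here with the
standard library only. `passes_localhost_fallback` is a hypothetical name, and
the sketch covers only the no-key branch: when a key is configured it returns
`true` and leaves key validation (not shown) to the rest of the middleware.

```rust
use std::net::SocketAddr;

// Returns false only when no API key is set and the peer is not loopback.
fn passes_localhost_fallback(api_key: &str, peer: Option<SocketAddr>) -> bool {
    if !api_key.is_empty() {
        // A key is configured, so the fallback does not apply.
        return true;
    }
    // Missing connection info is treated as non-loopback, as in the
    // `unwrap_or(false)` above.
    peer.map(|addr| addr.ip().is_loopback()).unwrap_or(false)
}
```

Treating a missing peer address as remote is the fail-closed choice: a request
whose origin cannot be determined is denied rather than trusted.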
---

## 18. Security Configuration

### 18.1 config.toml Reference

```toml
# API Authentication
api_key = "your-secret-api-key"  # Empty = localhost-only mode

# OFP Wire Protocol
[network]
shared_secret = "your-pre-shared-key"  # Required for OFP

# WASM Sandbox
[sandbox]
fuel_limit = 1000000          # CPU instruction budget per execution
timeout_secs = 30             # Wall-clock timeout per execution
max_memory_bytes = 16777216   # 16 MB max WASM memory

# Rate Limiting
# 500 tokens/minute/IP (not currently configurable via config.toml)

# Web Search SSRF Protection
[web]
# SSRF protection is always on and cannot be disabled
```

### 18.2 Environment Variables for Secrets

| Variable | Used By |
|----------|---------|
| `OPENAI_API_KEY` | OpenAI-compat driver |
| `ANTHROPIC_API_KEY` | Anthropic driver |
| `GEMINI_API_KEY` or `GOOGLE_API_KEY` | Gemini driver |
| `DEEPSEEK_API_KEY` | DeepSeek provider |
| `GROQ_API_KEY` | Groq provider |
| `BRAVE_API_KEY` | Brave web search |
| `TAVILY_API_KEY` | Tavily web search |
| `PERPLEXITY_API_KEY` | Perplexity web search |

All environment variable API keys are wrapped in `Zeroizing<String>` when
loaded into driver structs.

### 18.3 Capability Declaration (Agent Manifest)

Capabilities are declared in the agent's TOML manifest:

```toml
[agent]
name = "my-agent"

[[capabilities]]
type = "FileRead"
value = "/data/*"

[[capabilities]]
type = "NetConnect"
value = "*.openai.com:443"

[[capabilities]]
type = "ToolInvoke"
value = "web_search"

[[capabilities]]
type = "LlmMaxTokens"
value = 4096
```

### 18.4 Loop Guard Tuning

The default `LoopGuardConfig` values are:

| Parameter | Default | Description |
|-----------|---------|-------------|
| `warn_threshold` | 3 | Identical calls before warning |
| `block_threshold` | 5 | Identical calls before blocking |
| `global_circuit_breaker` | 30 | Total calls before circuit break |

### 18.5 Subprocess Sandbox Allowlists

To pass specific environment variables to subprocesses:

```rust
sandbox_command(&mut cmd, &["MY_CUSTOM_VAR".to_string()]);
```

Only variables explicitly listed in `allowed_env_vars` (plus the safe
defaults) will be inherited.

---

## 19. Security Dependencies

| Crate | Purpose |
|-------|---------|
| `sha2` | SHA-256 hashing (audit trail, loop guard, SSRF, checksums) |
| `hmac` | HMAC-SHA256 for OFP authentication |
| `hex` | Hex encoding/decoding of hashes and signatures |
| `subtle` | Constant-time comparison (`ConstantTimeEq`) for HMAC verification |
| `ed25519-dalek` | Ed25519 signing/verification for manifest signing |
| `rand` | Cryptographic RNG for key generation (`OsRng`) |
| `zeroize` | `Zeroizing<T>` wrapper for automatic secret memory wiping |
| `governor` | GCRA rate limiting algorithm |
| `wasmtime` | WASM sandbox with fuel + epoch metering |
| `uuid` | Nonce generation for OFP handshakes |
| `chrono` | ISO-8601 timestamps for audit entries |
| `reqwest` | HTTP client (used inside SSRF-protected `host_net_fetch`) |

### 19.1 Why These Specific Crates

- **sha2/hmac:** Part of the RustCrypto project, audited, widely used in production Rust.
- **ed25519-dalek:** De facto standard Ed25519 library in Rust, extensively audited.
- **subtle:** Provides constant-time operations to prevent timing side-channels.
- **zeroize:** Official RustCrypto approach to zeroing secrets; integrates with `Drop`.
- **governor:** Battle-tested GCRA implementation with `DashMap`-backed concurrent state.

---

## Threat Model Summary

| Threat | Mitigated By |
|--------|-------------|
| Agent requests unauthorized file access | Capability-based security (Section 2) |
| Agent spawns child with elevated privileges | Capability inheritance validation (Section 2.4) |
| WASM skill runs infinite loop | Dual metering: fuel + epoch (Section 3) |
| Attacker tampers with audit log | Merkle hash chain (Section 4) |
| Prompt injection via external data | Taint tracking (Section 5) |
| Data exfiltration via LLM | Taint sinks block Secret/PII to net_fetch (Section 5.3) |
| Tampered agent manifest | Ed25519 signing (Section 6) |
| SSRF to cloud metadata | Private IP + hostname blocking + DNS check (Section 7) |
| API key recovery from memory dump | `Zeroizing<String>` (Section 8) |
| Unauthorized peer-to-peer connections | HMAC-SHA256 mutual auth (Section 9) |
| XSS / clickjacking on API | Security headers (Section 10) |
| API brute force / DoS | GCRA rate limiter (Section 11) |
| Path traversal via `../` | safe_resolve_path / safe_resolve_parent (Section 12) |
| Secret leakage to child processes | env_clear() + allowlist (Section 13) |
| Malicious skills from ClawHub | Prompt injection scanner + SHA256 checksum (Section 14) |
| Agent stuck in tool loop | LoopGuard with graduated response (Section 15) |
| Corrupted LLM session history | Session repair (Section 16) |
| Information leakage from health endpoint | Redacted public endpoint (Section 17) |
| Timing attacks on HMAC verification | subtle::ConstantTimeEq (Section 9.2) |
| Shell injection via metacharacters | Command::new (no shell) + env_clear (Section 13.4) |
| DNS rebinding for SSRF bypass | Resolved IP check, not hostname check (Section 7.3) |