fix(saas-relay): eliminate DATABASE_ERROR by removing DB queries from critical path

Root cause: each relay request executes 13-17 serial DB queries, exhausting
the 50-connection pool under concurrency. When the pool is exhausted, sqlx
returns PoolTimedOut, which maps to a 500 DATABASE_ERROR.
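
As a rough illustration of the root cause, Little's law puts the average
number of pool connections in use at request rate x queries per request x
average query time. The sketch below assumes each serial query holds one
pooled connection for its duration. The function name and the traffic figures
(600 requests/s, 5 ms per query) are hypothetical, not measurements from this
service.

```rust
/// Little's-law estimate of the average number of pool connections in use.
/// Assumption: every serial query holds one pooled connection for
/// `avg_query_micros` microseconds. All inputs are illustrative.
fn estimated_connections_in_use(
    requests_per_sec: u64,
    queries_per_request: u64,
    avg_query_micros: u64,
) -> u64 {
    // Connection-microseconds consumed per second of wall-clock time.
    let busy_micros_per_sec = requests_per_sec * queries_per_request * avg_query_micros;
    // Round up to whole connections.
    (busy_micros_per_sec + 999_999) / 1_000_000
}

fn main() {
    // Hypothetical load: 600 requests/s, 5 ms per query.
    for queries in [13u64, 17] {
        let in_use = estimated_connections_in_use(600, queries, 5_000);
        println!("{} queries/request -> ~{} connections in use", queries, in_use);
    }
}
```

Under these assumed numbers, 17 queries per request needs about 51
connections on average, just over the old limit of 50, before any burst.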

Fixes:
1. log_operation → dispatch_log_operation (async Worker dispatch, non-blocking)
2. record_usage → tokio::spawn (3 DB queries moved off critical path)
3. DB pool: max_connections 50→100 (env-configurable), acquire_timeout 5s→8s
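
The env-configurable pool size in fix 3 boils down to "read the variable,
parse it, fall back to a default". A minimal std-only sketch of that pattern
follows. `parse_or_default` is a hypothetical helper introduced here for
illustration. The commit inlines the same chain directly in init_db.

```rust
use std::env;

/// Parse an optional raw setting, falling back to `default` when the value
/// is missing or is not a valid u32. Hypothetical helper, not in the commit.
fn parse_or_default(raw: Option<&str>, default: u32) -> u32 {
    raw.and_then(|v| v.parse().ok()).unwrap_or(default)
}

fn main() {
    // Same variable name and default as the commit uses for the pool maximum.
    let raw = env::var("ZCLAW_DB_MAX_CONNECTIONS").ok();
    let max_connections = parse_or_default(raw.as_deref(), 100);
    println!("max_connections = {}", max_connections);
}
```

With this pattern an unparsable value (for example "abc" or "-1") silently
falls back to the default, so logging the resolved values at startup, as the
commit does with tracing::info!, is what makes a misconfiguration visible.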

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
iven
2026-03-31 14:08:21 +08:00
parent 2ff696289f
commit 9905a8d0d5
2 changed files with 78 additions and 31 deletions

@@ -8,10 +8,22 @@ const SCHEMA_VERSION: i32 = 11;
 /// Initialize the database
 pub async fn init_db(database_url: &str) -> SaasResult<PgPool> {
+    // Pool size is configurable via environment variables; default is 100 (each relay request runs 10+ serial queries, so 50 was tight)
+    let max_connections: u32 = std::env::var("ZCLAW_DB_MAX_CONNECTIONS")
+        .ok()
+        .and_then(|v| v.parse().ok())
+        .unwrap_or(100);
+    let min_connections: u32 = std::env::var("ZCLAW_DB_MIN_CONNECTIONS")
+        .ok()
+        .and_then(|v| v.parse().ok())
+        .unwrap_or(5);
+    tracing::info!("Database pool: max={}, min={}", max_connections, min_connections);
     let pool = PgPoolOptions::new()
-        .max_connections(50)
-        .min_connections(3)
-        .acquire_timeout(std::time::Duration::from_secs(5))
+        .max_connections(max_connections)
+        .min_connections(min_connections)
+        .acquire_timeout(std::time::Duration::from_secs(8))
         .idle_timeout(std::time::Duration::from_secs(180))
         .max_lifetime(std::time::Duration::from_secs(900))
         .connect(database_url)