feat(billing): activate real-time quota enforcement pipeline

- Wire relay handler to increment_usage() for JSON responses (tokens + relay_requests)
- Wire relay handler to increment_dimension("relay_requests") for SSE streams
- Add increment_dimension() function for hand_executions/pipeline_runs dimensions
- Schedule AggregateUsageWorker hourly for reconciliation (run_on_start=true)
- Mount mock payment routes in dev mode (ZCLAW_SAAS_DEV=true)

Previously the quota middleware always allowed requests because usage
counters were never incremented. Relay requests now update
billing_usage_quotas in real time, and the aggregator reconciles the
counters hourly.
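
The failure mode described above can be illustrated with a minimal in-memory sketch. It is not the actual implementation: `UsageQuotas`, `set_limit`, and `is_allowed` are hypothetical names, the real counters live in billing_usage_quotas, and the signature of `increment_dimension` is assumed for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the billing_usage_quotas table.
#[derive(Default)]
struct UsageQuotas {
    used: HashMap<String, u64>,
    limits: HashMap<String, u64>,
}

impl UsageQuotas {
    /// Configure the limit for one dimension (hypothetical helper).
    fn set_limit(&mut self, dimension: &str, limit: u64) {
        self.limits.insert(dimension.to_string(), limit);
    }

    /// Assumed shape of increment_dimension(): add `amount` to one counter.
    fn increment_dimension(&mut self, dimension: &str, amount: u64) {
        *self.used.entry(dimension.to_string()).or_insert(0) += amount;
    }

    /// Quota check as a middleware might perform it: allow while usage
    /// is strictly below the configured limit.
    fn is_allowed(&self, dimension: &str) -> bool {
        let used = self.used.get(dimension).copied().unwrap_or(0);
        match self.limits.get(dimension) {
            Some(limit) => used < *limit,
            None => true, // assumption: no limit configured means unlimited
        }
    }
}
```

If `increment_dimension` is never called, `used` stays at zero and `is_allowed` returns true for any positive limit, which matches the behavior this commit fixes.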
iven
2026-04-02 01:52:01 +08:00
parent 8263b236fd
commit 11e3d37468
4 changed files with 131 additions and 54 deletions


@@ -386,9 +386,22 @@ async fn build_router(state: AppState) -> axum::Router {
zclaw_saas::auth::auth_middleware,
));
-axum::Router::new()
+let mut router = axum::Router::new()
.merge(non_streaming_routes)
-.merge(relay_routes)
+.merge(relay_routes);
+// Mount mock payment pages in dev mode
+{
+    let is_dev = std::env::var("ZCLAW_SAAS_DEV")
+        .map(|v| v == "true" || v == "1")
+        .unwrap_or(false);
+    if is_dev {
+        router = router.merge(zclaw_saas::billing::mock_routes());
+        info!("Mock payment routes mounted (dev mode)");
+    }
+}
+router
.layer(TraceLayer::new_for_http())
.layer(cors)
.with_state(state)
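
The dev-mode check in the hunk above could be factored into a small helper so the flag parsing can be tested without touching the process environment. This is only a sketch: `is_dev_flag` and `is_dev_mode` are hypothetical names that do not appear in the commit.

```rust
/// Interpret an optional flag value the way the diff does:
/// only the exact strings "true" and "1" enable dev mode.
fn is_dev_flag(value: Option<&str>) -> bool {
    matches!(value, Some("true") | Some("1"))
}

/// Read ZCLAW_SAAS_DEV from the environment; an unset or
/// non-UTF-8 variable counts as disabled.
fn is_dev_mode() -> bool {
    is_dev_flag(std::env::var("ZCLAW_SAAS_DEV").ok().as_deref())
}
```

The comparison is case-sensitive, so values such as `TRUE` or `yes` leave the mock payment routes unmounted, as in the original code.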