Commit Graph

112 Commits

Author SHA1 Message Date
iven
b69dc6115d fix(relay): API Key 解密失败自愈 — 启动迁移 + 容错跳过
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
根因: select_best_key 遇到解密失败时直接 500 返回,
不会尝试下一个 key。如果 DB 中有旧的加密格式 key,
整个 relay 请求被阻断。

修复:
- key_pool: 解密失败时 warn + skip 到下一个 key,不再 500
- key_pool: 新增 heal_provider_keys() 启动自愈迁移
  - 逐个尝试解密所有加密 key
  - 解密成功 → 用当前密钥重新加密(幂等)
  - 解密失败 → 标记 is_active=false + warn
- main.rs: 启动时调用自愈迁移(在 TOTP 迁移之后)
2026-04-16 02:40:44 +08:00
iven
be2a136392 fix(saas): relay_tasks 超时自动清理 — 每5分钟扫描 processing >10min 标记 failed
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- scheduler.rs: 新增 start_db_cleanup_tasks 中的 relay 超时清理定时任务
- status=processing 且 updated_at 超过 10 分钟的 relay_task 自动标记为 failed
- 避免 Provider key 禁用后 relay_task 永久停留在 processing 状态
2026-04-15 01:41:50 +08:00
iven
76cdfd0c00 fix(saas): SSE 用量统计一致性修复 — 回写 usage_records + 消除 relay_requests 双重计数
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- service.rs: SSE 流结束后回写 usage_records 真实 token (status=success)
- service.rs: spawned task 中调用 increment_usage 统一递增 tokens + relay_requests
- handlers.rs: 移除 SSE 路径的 increment_dimension("relay_requests") 消除双重计数
- 从 request_body 提取 model_id 用于 usage_records 精准归因
2026-04-15 01:40:27 +08:00
iven
9c59e6e82a fix(saas): SSE relay token capture 修复 — stream_done 标志 + 前缀兼容
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- SseUsageCapture 增加 stream_done 标志,[DONE] 和 stream 结束时设置
- parse_sse_line 兼容 "data:" 和 "data: " 两种前缀
- 增加 total_tokens 兜底解析(某些 provider 不返回 prompt_tokens)
- 轮询逻辑优先检测 stream_done,而非依赖 total > 0 条件
- 超时时增加 warn 日志记录实际 token 值

根因: 上游 provider 不在 SSE chunk 中返回 usage 时,轮询稳定逻辑
(total > 0 条件) 永远不满足,导致 token 始终为 0。
2026-04-15 00:15:03 +08:00
iven
e0eb7173c5 fix: 三端联调 P1 修复 — API密钥页崩溃 + 桌面端401恢复 + 用量统计全零
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
P1-03: vite.config.ts proxy '/api' → '/api/' 加尾部斜杠,
  防止前缀匹配 /api-keys 导致 SPA 路由崩溃

P1-01: kernel_init 增加 api_key 变更检测(token 刷新后自动重连),
  streamStore 增加 401 自动恢复(refresh token → kernel reconnect),
  KernelClient 新增 getConfig() 方法

P1-02: /api/v1/usage 总计改从 billing_usage_quotas 读取
  (authoritative source,SSE 和 JSON 均写入),
  by_model/by_day 仍从 usage_records 读取
2026-04-14 22:02:02 +08:00
iven
6721a1cc6e fix(admin): 行业选择500修复 + 管理员切换订阅计划
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- fix(industry): list_industries SQL参数编号错位 — count查询和items查询
  共用WHERE子句但参数从$3开始,sqlx bind按$1/$2顺序绑定导致500
- feat(billing): 新增 PUT /admin/accounts/:id/subscription 端点 (super_admin)
  验证目标计划 → 取消当前订阅 → 创建新订阅(30天) → 同步配额
- feat(admin-v2): Accounts.tsx 编辑弹窗新增「订阅计划」选择区
  显示所有活跃计划,保存时调用admin switch plan API
2026-04-14 19:06:58 +08:00
iven
d2a0c8efc0 fix(saas): 启动崩溃修复 — config_items 约束 + industry 类型匹配
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- db.rs: config_items INSERT ON CONFLICT (id) → (category, key_path) 匹配实际唯一约束
- db.rs: fix_seed_data category 重命名前先删除冲突行,避免唯一约束冲突
- migration/service.rs: seed_default_config_items + sync push INSERT 同步修复 ON CONFLICT
- industry/types.rs: keywords_count i64→i32 匹配 PostgreSQL INT4 列类型

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 18:35:24 +08:00
iven
dd854479eb fix: 三端联调测试 2 P1 + 2 P2 + 4 P3 修复
P1-07: billing get_or_create_usage 同步 max_* 列到当前计划限额
P1-08: relay handler 增加直接配额检查 (relay_requests/input/output_tokens)
P2-09: relay failover 成功后记录 tokens 并标记 completed
P2-10: Tauri agentStore saas-relay 模式下从 SaaS API 获取真实用量
P2-14: super_admin 合成 subscription + check_quota 放行
P3-19: 新建 ApiKeys.tsx 页面替代 ModelServices 路由
P3-15: antd destroyOnClose → destroyOnHidden (3处)
P3-16: ProTable onSearch → onSubmit (2处)
2026-04-14 17:48:22 +08:00
iven
4c3136890b fix: 三端联调测试 2 P0 + 6 P1 + 2 P2 修复
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
P0-1: SaaS relay 模型别名解析 — "glm-4-flash" → "glm-4-flash-250414" (resolve_model)
P0-2: config.rs interpolate_env_vars UTF-8 修复 (chars 迭代器替代 bytes as char)
      + DB 启动编码检查 + docker-compose UTF-8 编码参数

P1-3: UI 模型选择器覆盖 Agent 默认模型 (model_override 全链路: TS→Tauri→Rust kernel)
P1-6: 知识搜索管道修复 — seed_knowledge 创建 chunks + 默认分类 (seed/uploaded/distillation)
P1-7: 用量限额从当前 Plan 读取 (非 stale usage 表)
P1-8: relay 双维度配额检查 (relay_requests + input_tokens)

P2-9: SSE 路径 token 计数修复 — 流结束检测替代固定 500ms sleep + billing increment
2026-04-14 00:17:08 +08:00
iven
c167ea4ea5 fix(v13): V13 审计 6 项修复 — TrajectoryRecorder注册 + industryStore接入 + 知识搜索 + webhook标注 + structured UI + persistent注释
FIX-01: TrajectoryRecorderMiddleware 注册到 create_middleware_chain() (@650优先级)
FIX-02: industryStore 接入 ButlerPanel 行业专长展示 + 自动拉取
FIX-03: 桌面端知识库搜索 saas-knowledge mixin + VikingPanel SaaS KB UI
FIX-04: webhook 迁移标注 deprecated + 添加 down migration 注释
FIX-05: Admin Knowledge 添加结构化数据 Tab (CRUD + 行浏览)
FIX-06: PersistentMemoryStore 精化 dead_code 标注 (完整迁移留后续)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-13 01:34:08 +08:00
iven
0b512a3d85 fix(industry): 三轮审计修复 — 3 HIGH + 4 MEDIUM 清零
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
H1: status 值不匹配 disabled→inactive + source 补 admin 映射 + valueEnum
H2: experience.rs format_for_injection 添加 xml_escape
H3: TriggerContext industry_keywords 接通全局缓存
M2: ID 自动生成移除中文字符保留 + 无 ASCII 时提示手动输入
M3: TS CreateIndustryRequest 添加 id? 字段
M4: ListIndustriesQuery 添加 deny_unknown_fields
2026-04-12 21:04:00 +08:00
iven
640df9937f feat(knowledge): Phase D 统一搜索 + 种子知识冷启动
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- search/recommend API 返回 UnifiedSearchResult (文档+结构化双通道)
- POST /api/v1/knowledge/seed 种子知识冷启动 (幂等, admin权限)
- seed_knowledge service: 按标题+行业查重, source=distillation
- SearchRequest 扩展: search_structured/search_documents/industry_id
2026-04-12 20:46:43 +08:00
iven
f8c5a76ce6 fix(industry): 审计收尾 — MEDIUM + LOW 全部清零
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
M-1: Industries 创建弹窗添加 cold_start_template + pain_seed_categories
M-3: industryStore console.warn → createLogger 结构化日志
B2: classify_with_industries 平局打破 + 归一化因子 3.0 文档化
S3: set_account_industries 验证移入事务内消除 TOCTOU
T1: 4 个 SaaS 请求类型添加 deny_unknown_fields
I3: store_trigger_experience Debug 格式 → signal_name 描述名
L-1: 删除 Accounts.tsx 死代码 editingIndustries
L-3: Industries.tsx filters 类型补全 source 字段
2026-04-12 20:37:48 +08:00
iven
76f6011e0f fix(industry): 二次审计修复 — 2 CRITICAL + 4 HIGH + 2 MEDIUM
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
C-1: Industries.tsx 创建弹窗缺少 id 字段 → 添加 id 输入框 + 自动生成
C-2: Accounts.tsx handleSave 无 try/catch → 包装 + handleClose 统一关闭
V1: viking_commands Mutex 跨 await → 先 clone Arc 再释放 Mutex
I1: intelligence_hooks 误导性"相关度" → 移除 access_count 伪分数
I2: pain point 摘要未 XML 转义 → xml_escape() 处理
S1: industry status 无枚举验证 → active/inactive 白名单
S2: create_industry id 无格式验证 → 正则 + 长度检查
H-3: Industries.tsx 编辑模态数据竞争 → data.id === industryId 守卫
H-4: Accounts.tsx useEffect 覆盖用户编辑 → editingId 守卫
2026-04-12 20:13:41 +08:00
iven
60062a8097 feat(knowledge): Phase B+C 文档提取器 + multipart 文件上传
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- PDF 提取 (pdf-extract) + DOCX 提取 (zip+quick-xml) + Excel 解析 (calamine)
- 统一格式路由 detect_format() → RAG 通道或结构化通道
- POST /api/v1/knowledge/upload multipart 文件上传
- PDF/DOCX/Markdown → RAG 管线,Excel → structured_rows JSONB
- 结构化数据源 CRUD API (GET/DELETE /api/v1/structured/sources)
- POST /api/v1/structured/query JSONB 关键词查询
- 修复 industry/service.rs SaasError::Database 类型不匹配
2026-04-12 19:25:24 +08:00
iven
fbc8c9fdde fix(industry): 审计修复 — 4 CRITICAL + 5 HIGH 全部解决
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
C1: SaaS industry/service.rs SQL 注入风险 → 参数化查询 ($N 绑定)
C2: INDUSTRY_CONFIGS 死链 → Kernel 共享 Arc 接通 ButlerRouter
C3: IndustryListItem 缺 keywords_count → SQL 查询 + 类型补全
C4: set_account_industries 非事务性 → batch 验证 + 事务 DELETE+INSERT
H8: Accounts.tsx mutate 竞态 → mutateAsync 顺序等待
H9: XML 注入未转义 → xml_escape() 辅助函数
H10: update_industry 覆盖 source → 保留原始值
H11: 面包屑缺少 /industries → 添加行业配置映射
2026-04-12 19:06:19 +08:00
iven
c3593d3438 feat(knowledge): Phase A 知识库可见性隔离 + 结构化数据源 + 蒸馏Worker
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- knowledge_items 增加 visibility(public/private) + account_id 字段
- 新建 structured_sources + structured_rows 表 (Excel JSONB 行级存储)
- 结构化数据源 CRUD API (5 路由: list/get/rows/delete/query)
- 安全查询: JSONB GIN 索引 + 可见性过滤 + 行数限制
- 蒸馏 Worker: 复用 Provider Key Pool 调 DeepSeek/Qwen API
- L0 质量过滤: 长度/隐私检测
- create_item 增加 is_admin 参数控制可见性默认值
- generate_embedding: extract_keywords_from_text 改为 pub 复用

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-12 18:36:05 +08:00
iven
5d1050bf6f feat(industry): Phase 1 行业配置基础 — 数据模型 + 四行业内置配置 + ButlerRouter 动态关键词
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- 新增 SaaS industry 模块 (types/service/handlers/mod/builtin)
- 4 行业内置配置: healthcare/education/garment/ecommerce
- 数据库迁移: industries + account_industries 表
- 8 个 API 端点 (CRUD + 用户行业关联)
- ButlerRouter 改造: 支持 IndustryKeywordConfig 动态注入
- 12 个测试全通过 (含动态行业分类测试)
2026-04-12 15:42:35 +08:00
iven
5599cefc41 feat(saas): 接通 embedding 模型管理全栈
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
数据库 migration 已有 is_embedding/model_type 列但全栈未使用。
打通 4 层: ModelRow → ModelInfo/CRUD → CachedModel → Admin 前端。
relay/models 端点也返回 is_embedding 字段,前端可按类型过滤。
2026-04-12 08:10:50 +08:00
iven
25a4d4e9d5 fix(saas): 新用户 llm_routing 默认改为 relay 使 SaaS token pool 成为主路径
Some checks failed
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
- handlers.rs: SQL INSERT 和 LoginResponse 中 'local' → 'relay'
- 新增 migration: ALTER llm_routing SET DEFAULT 'relay'
- 符合管家式服务理念:用户无需配置 API Key,SaaS 自动中转
2026-04-11 02:05:27 +08:00
iven
88cac9557b fix(saas): P0-2/P0-3 — usage endpoint + refresh token type mismatch
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
P0-2: GET /usage 500 "text >= timestamptz" — usage_records.created_at
is TEXT in actual DB despite migration declaring TIMESTAMPTZ. Fixed by
using dynamic SQL with ::timestamptz explicit casts for all date
comparisons, avoiding sqlx NULL-without-type-OID binding issues.

P0-3: POST /auth/refresh 500 — refresh_tokens.expires_at/used_at are
TEXT columns. Added ::timestamptz cast to SQL queries in auth handlers
and cleanup worker.
2026-04-10 16:25:52 +08:00
iven
b0e6654944 fix: P0-01/P1-01/P1-03 — account lockout, token revocation, optional display_name
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- P0-01: Account lockout now enforced via SQL-level comparison
  (locked_until > NOW()) instead of broken RFC3339 text parsing
- P1-01: Logout handler accepts JSON body with optional refresh_token,
  revokes ALL refresh tokens for the account (not just current)
- P1-03: Provider display_name is now optional, falls back to name

All 6 smoke tests pass (S1-S6).
2026-04-10 12:13:53 +08:00
iven
99262efca4 test: execute 30 smoke tests + fix P0 CSS break + BREAKS.md report
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
Layer 1 break detection results (21/30 pass, 63%):
- SaaS API: 5/5 pass (S3 skip no LLM key)
- Admin V2: 5/6 pass (A6 flaky auth guard)
- Desktop Chat: 3/6 pass (D1 no chat response in browser; D2/D3 skip non-Tauri)
- Desktop Feature: 6/6 pass
- Cross-System: 2/6 pass (4 blocked by login rate limit 429)

Bugs found:
- P0-01: Account lockout not enforced (locked_until set but not checked)
- P1-01: Refresh token still valid after logout
- P1-02: Desktop browser chat no response (stores not exposed)
- P1-03: Provider API requires display_name (undocumented)

Fixes applied:
- desktop/src/index.css: @import -> @plugin for Tailwind v4 compatibility
- Admin tests: correct credentials admin/admin123 from .env
- Cross tests: correct dashboard endpoint /stats/dashboard
2026-04-10 11:26:13 +08:00
iven
2e70e1a3f8 test: add 30 smoke tests for break detection across SaaS/Admin/Desktop
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
Layer 1 断裂探测矩阵:
- S1-S6: SaaS API 端到端 (auth/lockout/relay/permissions/billing/knowledge)
- A1-A6: Admin V2 连通性 (login/dashboard/CRUD/knowledge/roles/models)
- D1-D6: Desktop 聊天流 (gateway/kernel/relay/cancel/offline/error)
- F1-F6: Desktop 功能闭环 (agent/hands/pipeline/memory/butler/skills)
- X1-X6: 跨系统闭环 (provider→desktop/disabled user/knowledge/stats/totp/billing)

Also adds: admin-v2 Playwright config, updated spec doc with cross-reference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 09:47:35 +08:00
iven
ffa137eff6 test(saas): add 8 model config extended tests — encryption, groups, quota
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- API Key encryption at rest: verify enc: prefix in DB for provider keys
  and main provider api_key
- Key pool: toggle active/inactive + delete with DB state verification
- Model Groups: full CRUD lifecycle + cascade delete + user permission
- Quota enforcement: relay_requests exhaustion verified at DB level
  (middleware test infra issue noted — DB state confirmed correct)
- Provider disable: model hidden from relay/models list after disable
2026-04-10 09:20:06 +08:00
iven
c37c7218c2 test(saas): add 36 security/validation/permission tests (184 total, 0 failures)
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
New test files:
- auth_security_test.rs (12): account lockout DB state, lockout reset,
  password version invalidation, disabled account, refresh token
  revocation, boundary validation (username/password), role enforcement,
  TOTP 2FA flow
- account_security_test.rs (9): role management, privilege escalation
  prevention, account disable/enable, cross-account access control,
  operation logs
- relay_validation_test.rs (8): input validation (missing fields, empty
  messages, invalid roles), disabled provider, model listing, task
  isolation
- permission_matrix_test.rs (7): super_admin full access, user allowed/
  forbidden endpoints, public endpoints, unauthenticated rejection,
  API token lifecycle

Discovered: account lockout runtime check broken — handlers.rs:213
parse_from_rfc3339 fails on PostgreSQL TIMESTAMPTZ::TEXT format,
silently skipping lockout. DB state is correct but login not rejected.
2026-04-10 08:11:02 +08:00
iven
ba586e5aa7 fix: BUG-009/010/011 — DataMasking, cancel button, SQL casts
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
BUG-009 (P1): Add frontend DataMasking in saas-relay-client.ts
- Masks ID cards, phones, emails, money, company names before relay
- Unmasks tokens in AI response so user sees original data
- Mirrors Rust DataMasking middleware patterns

BUG-010 (P3): Send button transforms to Stop during streaming
- Shows square icon when isStreaming, calls cancelStream()
- Normal arrow icon when idle, calls handleSend()

BUG-011 (P2): Add ::timestamptz casts for old TEXT timestamp columns
- account/handlers.rs: dashboard stats query
- telemetry/service.rs: reported_at comparisons
- workers/aggregate_usage.rs: usage aggregation query
2026-04-09 23:45:19 +08:00
iven
bf728c34f3 fix: saasStore require() bug + health check pool formula + DEV error details
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- saasStore.ts: replace require('./chat/conversationStore') with await import()
  to fix ReferenceError in Vite ESM environment (P1)
- main.rs: fix health check pool usage formula from max_connections - num_idle
  to pool.size() - num_idle, preventing false "degraded" status (P1)
- error.rs: show detailed error messages in ZCLAW_SAAS_DEV=true mode
- Update bug tracker with BUG-003 through BUG-007
2026-04-09 22:23:05 +08:00
iven
bd6cf8e05f fix(saas): add ::bigint cast to all SUM() aggregates for PG NUMERIC compat
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
PostgreSQL SUM() on bigint returns NUMERIC, causing sqlx decode errors
when Rust expects i64/Option<i64>. Root cause: key_pool.rs
select_best_key() token_count SUM was missing ::bigint, causing
DATABASE_ERROR on every relay request.

Fixed in 4 files:
- relay/key_pool.rs: SUM(token_count) — root cause of relay failure
- relay/service.rs: SUM(remaining_rpm) in sort_candidates_by_quota
- account/handlers.rs: SUM(input/output_tokens) in dashboard stats
- workers/aggregate_usage.rs: SUM(input/output_tokens) in aggregation
2026-04-09 22:16:27 +08:00
iven
a081a97678 fix(relay): audit fixes — abort signal, model selector guard, SSE CRLF, SQL format
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
Addresses findings from deep code audit:

H-1: Pass abortController.signal to saasClient.chatCompletion() so
     user-cancelled streams actually abort the HTTP connection (was only
     stopping the read loop, leaving server-side SSE connection open).

H-2: ModelSelector now shows only when (!isTauriRuntime() || isLoggedIn).
     Prevents decorative model list in Tauri local kernel mode where model
     selection has no effect (violates CLAUDE.md §5.2).

M-1: Normalize CRLF to LF before SSE event boundary parsing (\n\n).
     Prevents buffer overflow when behind nginx/CDN with CRLF line endings.

M-2: SQL window_minute comparison uses to_char(NOW()-interval, format)
     instead of (NOW()-interval)::TEXT, matching the stored format exactly.

M-3: sort_candidates_by_quota uses same sliding 60s window as select_best_key.

LOW: Fix misleading invalidate_cache doc comment.
2026-04-09 19:51:34 +08:00
iven
e6eb97dcaa perf(relay): full-chain optimization — key pool, model sync, SSE stream
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
Phase 1 (Key Pool correctness):
- RPM: fixed-minute window → sliding 60s aggregation (prevents 2x burst)
- Remove fallback-to-provider-key bypass when all keys rate-limited
- SSE semaphore: 16→64 permits, cleanup delay 60s→5s
- Default 429 cooldown: 5min→60s (better for Coding Plan quotas)
- Expire old key_usage_window rows on record

Phase 2 (Frontend model sync):
- currentModel empty-string fallback to glm-4-flash-250414 in relay client
- Merge duplicate listModels() calls in connectionStore SaaS path
- Show ModelSelector in Tauri mode when models available
- Clear currentModel on SaaS logout

Phase 3 (Relay performance):
- Key Pool: DashMap in-memory cache (TTL 5s) for select_best_key
- Cache invalidation on 429 marking

Phase 4 (SSE stream):
- AbortController integration for user-cancelled streams
- SSE parsing: split by event boundaries (\n\n) instead of per-line
- streamStore cancelStream adapts to 0-arg and 1-arg cancel fns
2026-04-09 19:34:02 +08:00
iven
0883bb28ff fix: validation hardening — agent import prompt limit, relay retry tracking, heartbeat validation
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- agent_import: add system_prompt length validation (max 50K chars)
  to prevent excessive token consumption from imported configs
- relay retry_task: wrap JoinHandle to log abort on server shutdown
- device_heartbeat: validate device_id length (1-64 chars) matching
  register endpoint constraints
2026-04-09 17:24:36 +08:00
iven
3f2acb49fb fix: pre-release audit fixes — Twitter OAuth, DataMasking perf, Prompt versioning
- Twitter like/retweet: return explicit unavailable error instead of
  sending doomed Bearer token requests (would 403 on Twitter API v2)
- DataMasking: pre-compile regex patterns with LazyLock (was compiling
  6 patterns on every mask() call)
- Prompt version: fix get_version handler ignoring version path param,
  add service::get_version_by_number for correct per-version retrieval
2026-04-09 16:43:24 +08:00
iven
ade534d1ce feat: 添加MCP调试插件并优化流式超时处理
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
refactor(relay): 将Provider Key管理路由移至model_config模块
fix(saas): 修复demo_keys与provider_keys的匹配逻辑
perf(runtime): 将流式响应超时从60秒延长至180秒以适配思考型模型
docs: 新增模块化审计和上线前功能测试方案文档
chore: 添加tauri-plugin-mcp依赖及相关配置
2026-04-08 13:39:06 +08:00
iven
eab9b5fdcc fix(saas): WorkerDispatcher registration race — consumer starts after all workers registered
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
Root cause: start_consumer() was called in new() before any register() calls,
so the consumer's cloned HashMap was always empty. Workers like log_operation
and record_usage were never found, causing "Unknown worker" errors.

- Add WorkerDispatcher::start() method to be called after all register()s
- Update main.rs to call dispatcher.start() after 7 workers registered
2026-04-08 08:33:54 +08:00
iven
f9303ae0c3 fix(saas): SQL type cast fixes for E2E relay flow
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- key_pool.rs: cast cooldown_until to timestamptz for comparison with NOW()
- key_pool.rs: cast request_count to bigint (INT4→INT8) for sqlx decoding
- service.rs: cast cooldown_until to timestamptz in quota sort query
- scheduler.rs: cast last_seen_at to timestamptz in device cleanup
- totp.rs: use DateTime<Utc> instead of rfc3339 string for updated_at
2026-04-07 22:24:19 +08:00
iven
ab0e11a719 fix(saas): Phase 5 regression fixes — SQL type casts + test data corrections
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- Fix usage_stats SQL: add ::timestamptz cast for Option<String> params
- Fix usage_stats SQL: add ::bigint cast for COALESCE(SUM(...))
- Fix telemetry INSERT: add ::timestamptz cast for reported_at column
- Fix config_analysis_empty test: seed data makes total_items > 0
- Fix key_pool_crud test: key_value must be >= 20 chars
- Fix SkillManifest test helpers: add missing tools field

All 1048 tests pass: 580 Rust + 138 SaaS + 330 Desktop Vitest
2026-04-07 19:21:45 +08:00
iven
7de486bfca test(saas): Phase 1 integration tests — billing + scheduled_task + knowledge (68 tests)
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- Fix TIMESTAMPTZ decode errors: add ::TEXT cast to all SELECT queries
  where Row structs use String for TIMESTAMPTZ columns (~22 locations)
- Fix Axum 0.7 route params: {id} → :id in billing/knowledge/scheduled_task routes
- Fix JSONB bind: scheduled_task INSERT uses ::jsonb cast for input_payload
- Add billing_test.rs (14 tests): plans, subscription, usage, payments, invoices
- Add scheduled_task_test.rs (12 tests): CRUD, validation, isolation
- Add knowledge_test.rs (20 tests): categories, items, versions, search, analytics, permissions
- Fix auth test regression: 6 tests were failing due to TIMESTAMPTZ type mismatch

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 14:25:34 +08:00
iven
2fd6d08899 fix: SaaS Admin + Tauri 一致性审查修复
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- 删除 webhook 死代码模块 (4 文件 + worker,未注册未挂载)
- 删除孤立组件 StatusTag.tsx (从未被导入)
- authStore 权限模型补全 (scheduler/knowledge/billing 6+ permission key)
- authStore 硬编码 logout URL 改为 env 变量
- 清理未使用 service 方法 (agent-templates/billing/roles)
- Logs.tsx 代码重复消除 (本地常量 → @/constants/status)
- TRUTH.md 数字校准 (Tauri 177→183, SaaS API 131→130)
2026-04-07 01:53:54 +08:00
iven
828be3cc9e fix: resolve 6 remaining defects (P2-18, P2-21, P3-04, P3-05, P3-06, P3-02)
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- P2-18: TOTP QR code local generation via qrcode lib (no external service)
- P2-21: Suspend foreign LLM providers (OpenAI/Anthropic/Gemini) for early stage
- P3-04: get_progress() now calculates actual percentage from completed/total steps
- P3-05: saveSaaSSession calls now have .catch() error logging
- P3-06: SaaS relay chatStream passes session_key/agent_id to backend
- P3-02: Whiteboard unification plan document created

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-06 09:52:28 +08:00
iven
26a833d1c8 fix: resolve 17 P2 defects and 5 P3 defects from pre-launch audit
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
Batch fix covering multiple modules:
- P2-01: HandRegistry Semaphore-based max_concurrent enforcement
- P2-03: Populate toolCount/metricCount from Hand trait methods
- P2-06: heartbeat_update_config minimum interval validation
- P2-07: ReflectionResult used_fallback marker for rule-based fallback
- P2-08/09: identity_propose_change parameter naming consistency
- P2-10: ClassroomMetadata is_placeholder flag for LLM failure
- P2-11: classroomStore userDidCloseDuringGeneration intent tracking
- P2-12: workflowStore pipeline_create sends actionType
- P2-13/14: PipelineInfo step_count + PipelineStepInfo for proper step mapping
- P2-15: Pipe transform support in context.resolve (8 transforms)
- P2-16: Mustache {{...}} → \${...} auto-normalization
- P2-17: SaaSLogin password placeholder 6→8
- P2-19: serialize_skill_md + update_skill preserve tools field
- P2-22: ToolOutputGuard sensitive patterns from warn→block
- P2-23: Mutex::unwrap() → unwrap_or_else in relay/service.rs
- P3-01/03/07/08/09: Various P3 fixes
- DEFECT_LIST.md: comprehensive status sync (43/51 fixed, 8 remaining)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-06 00:49:16 +08:00
iven
90855dc83e fix(desktop): resolve 2 release-blocking P1 defects
P1-04: GenerationPipeline hardcoded model="default" causing classroom
generation 404. Added model field to GenerationPipeline struct, passed
from kernel config via with_driver(driver, model). Static scene
generation now receives model parameter.

P1-03: LLM API concurrent 500 DATABASE_ERROR. Added transient DB error
retry (PoolTimedOut/Io) in create_relay_task with 200ms backoff.
Recommend setting ZCLAW_DB_MIN_CONNECTIONS=10 for burst resilience.
2026-04-05 19:18:41 +08:00
iven
de36bb0724 fix(saas): migration idempotency fixes + SCHEMA_VERSION bump to 14
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- Add IF NOT EXISTS to accounts_template_assignment ALTER COLUMN
- Add IF NOT EXISTS to webhooks CREATE INDEX statements
- Add created_at/updated_at columns + ON CONFLICT DO NOTHING to industry templates
- Bump SCHEMA_VERSION 13→14 to force migration re-run on existing DB
2026-04-05 08:19:10 +08:00
iven
d6b1f44119 feat(admin): add ConfigSync page + close ADMIN-01/02 (AUDIT_TRACKER)
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- ADMIN-01 FIXED: ConfigSync.tsx page with ProTable + pagination
  - config-sync service calling GET /config/sync-logs
  - route + nav item + breadcrumb
  - backend @reserved → @connected
- ADMIN-02 FALSE_POSITIVE: Logs.tsx + logs service already exist
2026-04-05 01:40:38 +08:00
iven
745c2fd754 feat(saas): add down migrations for all incremental schema changes (AUD3-DB-01)
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- 16 down SQL files in migrations/down/ for each incremental migration
- db::run_down_migrations() executes rollback files in reverse order
- migrate_down CLI task: task=migrate_down timestamp=20260402
- Initial schema and seed data excluded (would be destructive)
2026-04-05 01:35:33 +08:00
iven
e90eb5df60 feat: Sprint 3 — benchmark + conversion funnel + invoice PDF
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- 3.1: Add criterion benchmark for zclaw-growth TF-IDF retrieval
  (indexing throughput, query scoring latency, top-K retrieval)
- 3.2: Extend admin-v2 Usage page with recharts funnel chart
  (registration → trial → paid conversion) and daily trend bar chart
- 3.3: Add invoice PDF export via genpdf (Arial font, Windows)
  with GET /api/v1/billing/invoices/{id}/pdf handler

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-04 14:42:29 +08:00
iven
5c48d62f7e fix(saas): harden model group failover + relay reliability
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- cache: insert-then-retain pattern avoids empty-window race during refresh
- relay: manage_task_status flag for proper failover state transitions
- relay: retry_task re-resolves model groups instead of blind provider reuse
- relay: filter empty-member groups from available models list
- relay: quota cache stale entry cleanup (TTL 5x expiry)
- error: from_sqlx_unique helper for 409 vs 500 distinction
- model_config: unique constraint handling, duplicate member check
- model_config: failover_strategy whitelist, model_id vs group name conflict check
- model_config: group-scoped member removal with group_id validation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-04 12:26:55 +08:00
iven
894c0d7b15 feat(desktop): pipeline result preview + industry templates + onboarding auto-trigger
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
Sprint 2: 产品体验打磨 + 行业模板

- Create PipelineResultPreview component with tab-based output switching
- Connect workflow/hand messages to PresentationContainer in ChatArea
- Add auto-trigger first Hand after onboarding (industry-specific queries)
- Seed 3 industry agent templates (education, healthcare, design-shantou)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-04 10:48:47 +08:00
iven
be0a78a523 feat(saas): add model groups for cross-provider failover
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
Model Groups provide logical model names that map to multiple physical
models across providers, with automatic failover when one provider's
key pool is exhausted.

Backend:
- New model_groups + model_group_members tables with FK constraints
- Full CRUD API (7 endpoints) with admin-only write permissions
- Cache layer: DashMap-backed CachedModelGroup with load_from_db
- Relay integration: ModelResolution enum for Direct/Group routing
- Cross-provider failover: sort_candidates_by_quota + OnceLock cache
- Relay failure path: record failure usage + relay_dequeue (fixes
  queue counter leak that caused connection pool exhaustion)
- add_group_member: validate model_id exists before insert

Frontend:
- saas-relay-client: accept getModel() callback for dynamic model selection
- connectionStore: prefer conversationStore.currentModel over first available

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-04 09:56:21 +08:00
iven
8faefd6a61 fix(tests): resolve workspace compilation + CJK search failures
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled
- saas test harness: align WorkerDispatcher::new and AppState::new
  signatures with SpawnLimiter addition and init_db(&DatabaseConfig)
- growth sqlite: add CJK fallback (LIKE-based) when FTS5 unicode61
  tokenizer fails on Chinese queries (unicode61 doesn't index CJK)
2026-04-04 00:34:34 +08:00