feat(knowledge): Phase B+C 文档提取器 + multipart 文件上传
Some checks failed
CI / Lint & TypeCheck (push) Has been cancelled
CI / Unit Tests (push) Has been cancelled
CI / Build Frontend (push) Has been cancelled
CI / Rust Check (push) Has been cancelled
CI / Security Scan (push) Has been cancelled
CI / E2E Tests (push) Has been cancelled

- PDF 提取 (pdf-extract) + DOCX 提取 (zip+quick-xml) + Excel 解析 (calamine)
- 统一格式路由 detect_format() → RAG 通道或结构化通道
- POST /api/v1/knowledge/upload multipart 文件上传
- PDF/DOCX/Markdown → RAG 管线,Excel → structured_rows JSONB
- 结构化数据源 CRUD API (GET/DELETE /api/v1/structured/sources)
- POST /api/v1/structured/query JSONB 关键词查询
- 修复 industry/service.rs SaasError::Database 类型不匹配
This commit is contained in:
iven
2026-04-12 19:25:24 +08:00
parent 4800f89467
commit 60062a8097
7 changed files with 849 additions and 8 deletions

View File

@@ -103,7 +103,7 @@ wasmtime-wasi = { version = "43" }
tempfile = "3"
# SaaS dependencies
axum = { version = "0.7", features = ["macros"] }
axum = { version = "0.7", features = ["macros", "multipart"] }
axum-extra = { version = "0.9", features = ["typed-header", "cookie"] }
tower = { version = "0.4", features = ["util"] }
tower-http = { version = "0.5", features = ["cors", "trace", "limit", "timeout"] }
@@ -112,6 +112,12 @@ argon2 = "0.5"
totp-rs = "5"
hex = "0.4"
# Document processing
pdf-extract = "0.7"
calamine = "0.26"
quick-xml = "0.37"
zip = "2"
# TCP socket configuration
socket2 = { version = "0.5", features = ["all"] }