feat(knowledge): Phase B+C document extractors + multipart file upload

- PDF extraction (pdf-extract) + DOCX extraction (zip + quick-xml) + Excel parsing (calamine)
- Unified format routing via detect_format() → RAG channel or structured channel
- POST /api/v1/knowledge/upload for multipart file upload
- PDF/DOCX/Markdown → RAG pipeline; Excel → structured_rows JSONB
- Structured data source CRUD API (GET/DELETE /api/v1/structured/sources)
- POST /api/v1/structured/query for JSONB keyword queries
- Fix SaasError::Database type mismatch in industry/service.rs
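
The routing described in the list above could look roughly like the sketch below. It is an illustration only, not the code from this commit: the `Channel` enum, the signature of `detect_format`, and the choice to route purely on file extension are all assumptions.

```rust
use std::path::Path;

/// Which pipeline an uploaded file is handed to.
/// Hypothetical type; the real names in the commit may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Channel {
    /// Text is extracted and sent to the RAG pipeline.
    Rag,
    /// Rows are parsed and stored as structured JSONB.
    Structured,
}

/// Pick a channel from the file extension (case-insensitive).
/// Returns `None` when the extension is missing or unsupported.
/// Assumed signature; the commit only names the function.
pub fn detect_format(file_name: &str) -> Option<Channel> {
    let ext = Path::new(file_name)
        .extension()?
        .to_str()?
        .to_ascii_lowercase();

    match ext.as_str() {
        "pdf" | "docx" | "md" | "markdown" => Some(Channel::Rag),
        "xlsx" | "xls" => Some(Channel::Structured),
        _ => None,
    }
}
```

Returning `Option` lets the upload handler reject unsupported files before any extractor runs. A production version might also check the MIME type or magic bytes, since an extension alone is easy to spoof.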
iven
2026-04-12 19:25:24 +08:00
parent 4800f89467
commit 60062a8097
7 changed files with 849 additions and 8 deletions

@@ -53,5 +53,11 @@ bytes = { workspace = true }
async-stream = { workspace = true }
genpdf = "0.2"
# Document processing
pdf-extract = { workspace = true }
calamine = { workspace = true }
quick-xml = { workspace = true }
zip = { workspace = true }
[dev-dependencies]
tempfile = { workspace = true }