feat(saas): add GenerateEmbedding worker for knowledge chunking

- Markdown-aware content splitting (512-token chunks with a 64-token overlap)
- CJK keyword extraction from chunk content with stop-word filtering
- Full refresh strategy (delete old chunks → re-insert on update)
- Phase 2 placeholder for vector embedding API integration
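
The chunking step in the first bullet can be illustrated roughly as follows. This is a minimal sketch of fixed-size windows with overlap, not the worker's actual code: the function name `chunk_with_overlap` is hypothetical, the input is assumed to be already tokenized, and the Markdown-aware boundary handling is left out.

```rust
/// Split a token sequence into windows of at most `size` tokens, where each
/// window starts `size - overlap` tokens after the previous one, so that
/// consecutive windows share `overlap` tokens.
///
/// Hypothetical helper for illustration; the real worker may split differently.
fn chunk_with_overlap<T: Clone>(tokens: &[T], size: usize, overlap: usize) -> Vec<Vec<T>> {
    // A stride of zero would never advance, so require overlap < size.
    assert!(size > 0 && overlap < size, "overlap must be smaller than size");
    let step = size - overlap;

    let mut chunks = Vec::new();
    let mut start = 0;
    while start < tokens.len() {
        let end = (start + size).min(tokens.len());
        chunks.push(tokens[start..end].to_vec());
        // Stop once a window reaches the end of the input; otherwise the
        // next window would only repeat tokens already covered.
        if end == tokens.len() {
            break;
        }
        start += step;
    }
    chunks
}
```

With `size = 512` and `overlap = 64`, the stride is 448 tokens, so a 1000-token document would produce three chunks starting at tokens 0, 448, and 896.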

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
iven
2026-04-02 00:23:38 +08:00
parent ef60f9a183
commit 830e9fa301
3 changed files with 157 additions and 1 deletions


@@ -241,6 +241,7 @@ pub mod cleanup_refresh_tokens;
pub mod update_last_used;
pub mod record_usage;
pub mod aggregate_usage;
pub mod generate_embedding;
// Convenience re-exports
pub use log_operation::LogOperationWorker;