Step 4.1: Compaction debounce
- 30s cooldown between consecutive compactions
- Minimum 3 rounds (6 messages) since last compaction before re-triggering
- AtomicU64 lock-free state tracking
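
A minimal sketch of how the debounce check in Step 4.1 could look. The type and method names (`CompactionDebounce`, `should_compact`, `record`) and the millisecond clock passed in by the caller are hypothetical, and the sketch assumes that both conditions (cooldown elapsed and at least 6 new messages) must hold before re-triggering:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

const COOLDOWN_MS: u64 = 30_000; // 30s cooldown between compactions
const MIN_NEW_MESSAGES: u64 = 6; // 3 rounds = 6 messages
const NEVER: u64 = u64::MAX; // sentinel: no compaction has run yet

/// Hypothetical debounce state, tracked without locks.
pub struct CompactionDebounce {
    last_compaction_ms: AtomicU64,
    messages_at_last: AtomicU64,
}

impl CompactionDebounce {
    pub fn new() -> Self {
        Self {
            last_compaction_ms: AtomicU64::new(NEVER),
            messages_at_last: AtomicU64::new(0),
        }
    }

    /// Returns true when a new compaction is allowed.
    /// Assumption: cooldown AND message threshold must both be satisfied.
    pub fn should_compact(&self, now_ms: u64, message_count: u64) -> bool {
        let last = self.last_compaction_ms.load(Ordering::Acquire);
        if last == NEVER {
            return true; // first compaction is never debounced
        }
        let elapsed = now_ms.saturating_sub(last);
        let seen = self.messages_at_last.load(Ordering::Acquire);
        let new_messages = message_count.saturating_sub(seen);
        elapsed >= COOLDOWN_MS && new_messages >= MIN_NEW_MESSAGES
    }

    /// Records a successful compaction.
    /// Note: the two fields are stored separately, not as one atomic unit.
    pub fn record(&self, now_ms: u64, message_count: u64) {
        self.messages_at_last.store(message_count, Ordering::Release);
        self.last_compaction_ms.store(now_ms, Ordering::Release);
    }
}
```

Passing the clock value in as an argument keeps the check deterministic in tests; a real caller would likely derive it from a monotonic clock.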
Step 4.2: Async compaction with cached fallback
- During cooldown, reuse the cached result of the previous compaction
- RwLock<Option<Vec<Message>>> for thread-safe cache access
- Cache updated after each successful compaction
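
The cached fallback in Step 4.2 might be sketched as follows. This is a synchronous simplification using only the standard library; `Message`, `CompactionCache`, `resolve_context`, and the `in_cooldown` flag are hypothetical stand-ins, and the actual async wiring is not shown:

```rust
use std::sync::RwLock;

/// Hypothetical message type; the real one likely carries more fields.
#[derive(Clone, Debug, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

/// Thread-safe cache holding the result of the last successful compaction.
pub struct CompactionCache {
    cached: RwLock<Option<Vec<Message>>>,
}

impl CompactionCache {
    pub fn new() -> Self {
        Self { cached: RwLock::new(None) }
    }

    /// Replaces the cached result after a successful compaction.
    pub fn store(&self, compacted: Vec<Message>) {
        *self.cached.write().unwrap() = Some(compacted);
    }

    /// Returns a copy of the cached result, if any.
    pub fn get(&self) -> Option<Vec<Message>> {
        self.cached.read().unwrap().clone()
    }
}

/// During cooldown, reuse the cache; otherwise compact and refresh the cache.
/// `compact` returns None on failure. If no cached result exists, the
/// uncompacted history is returned unchanged.
pub fn resolve_context<F>(
    cache: &CompactionCache,
    history: &[Message],
    in_cooldown: bool,
    compact: F,
) -> Vec<Message>
where
    F: FnOnce(&[Message]) -> Option<Vec<Message>>,
{
    if in_cooldown {
        return cache.get().unwrap_or_else(|| history.to_vec());
    }
    match compact(history) {
        Some(compacted) => {
            cache.store(compacted.clone());
            compacted
        }
        None => cache.get().unwrap_or_else(|| history.to_vec()),
    }
}
```

An `RwLock` suits this access pattern because reads during cooldown are expected to outnumber writes, which happen only once per successful compaction.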
Step 4.3: Iterative summary
- generate_summary and generate_llm_summary accept a previous_summary parameter
- LLM prompt includes previous summary for cumulative context preservation
- Rule-based summary carries forward a [上轮摘要保留] ("previous-round summary retained") section
- previous_summary extracted from leading System messages in message history
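
A rough sketch of the rule-based path in Step 4.3. The signatures are assumptions, not the actual `generate_summary` API: `extract_previous_summary` and `generate_summary_sketch` are hypothetical names, the `Role`/`Message` types are simplified, and every leading System message is assumed to belong to the earlier summary:

```rust
#[derive(Clone, Debug, PartialEq)]
pub enum Role {
    System,
    User,
    Assistant,
}

/// Hypothetical, simplified message type.
#[derive(Clone, Debug, PartialEq)]
pub struct Message {
    pub role: Role,
    pub content: String,
}

/// Section label carried into each new summary ("previous-round summary retained").
const CARRY_OVER_HEADER: &str = "[上轮摘要保留]";

/// Joins the contents of the leading System messages into one previous summary.
/// Returns None when the history does not start with a System message.
pub fn extract_previous_summary(history: &[Message]) -> Option<String> {
    let parts: Vec<&str> = history
        .iter()
        .take_while(|m| m.role == Role::System)
        .map(|m| m.content.as_str())
        .collect();
    if parts.is_empty() {
        None
    } else {
        Some(parts.join("\n"))
    }
}

/// Rule-based summary: carries the previous summary forward under the header,
/// then lists the first line of each non-System message.
pub fn generate_summary_sketch(messages: &[Message], previous_summary: Option<&str>) -> String {
    let mut out = String::new();
    if let Some(prev) = previous_summary {
        out.push_str(CARRY_OVER_HEADER);
        out.push('\n');
        out.push_str(prev);
        out.push('\n');
    }
    for m in messages.iter().filter(|m| m.role != Role::System) {
        let first_line = m.content.lines().next().unwrap_or("");
        out.push_str(&format!("- {:?}: {}\n", m.role, first_line));
    }
    out
}
```

The LLM path would presumably embed the same `previous_summary` text in its prompt instead of prepending it verbatim; that part is not sketched here.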