18 Commits

Author SHA1 Message Date
User
e145f1d97e feat(s2s-text): dedicated text-mode prompt + Markdown rendering
Architecture fix: voice and text mode now have completely separate prompts.

Backend:
- VoiceAssistantProfileSupport.buildTextSystemRole: dedicated text-mode system
  role that inherits all business rules (identity, KB-first, sensitive topics,
  sales guidance, personal info) but removes voice-specific constraints (short
  sentences, colloquial, single-line conclusion).
- DEFAULT_TEXT_SPEAKING_STYLE: text-specific style demanding detailed,
  structured, Markdown-formatted answers with complete information.
- VoiceGatewayService.handleStart: switch between voice/text system role and
  speaking style based on state.textMode.
- VoiceGatewayService.buildStartSessionPayload: preserve Markdown in text mode
  (voice mode still strips asterisks/backticks via normalizeTextForSpeech to
  avoid TTS pronouncing format chars).

Frontend:
- Added react-markdown@9 + remark-gfm@4 dependencies.
- ChatPanel renders assistant messages (non-voice) with ReactMarkdown:
  headings, lists (ul/ol), bold, italic, inline/block code, tables, blockquote,
  links, horizontal rules — all styled with Tailwind classes matching the dark
  theme.
- User messages and voice-handoff messages remain plain text.

Verification: mvn test VoiceGatewaySmokeTest 20/20 pass, vite build succeeds.
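
The voice/text split in buildStartSessionPayload can be illustrated with a minimal TypeScript sketch. This is not the Java implementation: `preparePayloadText` and `stripFormatCharsForSpeech` are hypothetical names, and the sketch assumes speech normalization removes only asterisks and backticks, as the message above states.

```typescript
// Hypothetical sketch; the real logic lives in the Java VoiceGatewayService.
// Assumption: speech normalization removes only asterisks and backticks.
function stripFormatCharsForSpeech(text: string): string {
  return text.replace(/[*`]/g, "");
}

// Text mode keeps Markdown intact so the client can render it.
// Voice mode strips format characters so TTS does not read them aloud.
function preparePayloadText(text: string, textMode: boolean): string {
  return textMode ? text : stripFormatCharsForSpeech(text);
}
```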
2026-04-17 10:10:20 +08:00
User
4b78f81cbc fix(s2s-text): 9 review bugs - text stream, loading, history, unmount safety
Backend (VoiceGatewayService):
- [P0 Bug-1] handleAssistantChunk/Final: text mode must never apply blockUpstreamAudio
  gate to text events (blockUpstreamAudio is for audio frames only). Non-KB text
  queries now correctly stream subtitle back to client.
- [P0 Bug-3] sendUpstreamChatTextQuery: when upstream not ready, send
  assistant_pending:false before error so client loading spinner can clear.
- [P1 Bug-6] handleUserPartial/handleUserFinal: early-return if textMode, guard
  against spurious ASR echoes from S2S.

Frontend (ChatPanel S2S effect):
- [P0 Bug-2] Connection success now clears error; added cancelled flag to all
  async setState paths to prevent state reversal on unmount.
- [P1 Bug-4] onAssistantPending: (false) always clears isLoading; (true) only
  sets isLoading if not already streaming (streamingId drives UI, pending is
  advisory).
- [P1 Bug-5] S2S mode loads session history via getSessionHistory (was Coze-only).
- [P2 Bug-8] When s2sService ref is null, also remove the placeholder assistant
  bubble to avoid stale empty bubble in chat.
- [P2 Bug-9] All callbacks guard on cancelled flag to prevent React setState
  warnings after unmount (cleanup triggers svc.disconnect which emits
  'disconnected' state).

Verification: mvn test VoiceGatewaySmokeTest 20/20 pass, no voice regression.
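
The Bug-4 loading rule and the Bug-2/Bug-9 cancelled-flag guard might look roughly like the sketch below. `ChatUiState`, `applyAssistantPending`, and `createGuardedSetter` are hypothetical names for illustration; the actual ChatPanel code may be organized differently.

```typescript
interface ChatUiState {
  isLoading: boolean;
  streamingId: string | null; // id of the assistant message currently streaming
}

// pending=false always clears loading; pending=true sets it only when
// nothing is streaming yet (streamingId drives the UI, pending is advisory).
function applyAssistantPending(state: ChatUiState, pending: boolean): ChatUiState {
  if (!pending) return { isLoading: false, streamingId: state.streamingId };
  if (state.streamingId !== null) return state;
  return { isLoading: true, streamingId: state.streamingId };
}

// Wraps a state setter so calls made after cancel() are ignored,
// which is what an effect cleanup would trigger on unmount.
function createGuardedSetter<T>(apply: (value: T) => void): {
  set: (value: T) => void;
  cancel: () => void;
} {
  let cancelled = false;
  return {
    set: (value: T) => {
      if (!cancelled) apply(value);
    },
    cancel: () => {
      cancelled = true;
    },
  };
}
```

In a React effect, `cancel` would be called from the cleanup function before the service disconnects, so the resulting 'disconnected' callback no longer touches state.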
2026-04-17 09:44:36 +08:00
User
3e72cd54d3 feat(app): add textEngine toggle for chat mode (Coze ↔ S2S)
- localStorage-persistent textEngine state ('coze' | 's2s')
- Header button toggles between the two engines when in chat mode
- ChatPanel remounts on engine switch via key=sessionId-textEngine
- Voice mode completely unaffected
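
A minimal sketch of the persisted toggle and the remount key, under stated assumptions: the storage key name `textEngine`, the default of `'coze'`, and the helper names are all hypothetical. `KeyValueStore` stands in for `window.localStorage` so the sketch runs outside a browser.

```typescript
type TextEngine = "coze" | "s2s";

// Subset of the Web Storage interface that the sketch needs.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const TEXT_ENGINE_KEY = "textEngine"; // assumed storage key

// Any missing or unknown stored value falls back to 'coze' (assumed default).
function loadTextEngine(store: KeyValueStore): TextEngine {
  return store.getItem(TEXT_ENGINE_KEY) === "s2s" ? "s2s" : "coze";
}

// Flips the engine and persists the new value.
function toggleTextEngine(store: KeyValueStore, current: TextEngine): TextEngine {
  const next: TextEngine = current === "coze" ? "s2s" : "coze";
  store.setItem(TEXT_ENGINE_KEY, next);
  return next;
}

// A key that changes with the engine makes React unmount and remount ChatPanel.
function chatPanelKey(sessionId: string, engine: TextEngine): string {
  return `${sessionId}-${engine}`;
}
```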
2026-04-17 09:36:13 +08:00
User
af9faf26c9 feat(s2s): add S2S text dialog via /ws/realtime-text + event 501 ChatTextQuery
Dual-channel S2S architecture with full isolation between voice and text links:

Backend (Java):
- VolcRealtimeProtocol: add createChatTextQueryMessage (event 501)
- VoiceSessionState: add textMode / playAudioReply / disableGreeting fields
- VoiceWebSocketConfig: register second path /ws/realtime-text (same handler)
- VoiceWebSocketHandler: detect text mode from URL path
- VoiceGatewayService:
  * afterConnectionEstablished: overload with textMode flag
  * handleStart: parse playAudioReply / disableGreeting from client
  * buildStartSessionPayload: inject input_mod=text for text mode
  * handleDirectText: text mode sends event 501 directly, skip processReply
  * handleBinaryMessage: reject client audio in text mode
  * handleUpstreamBinary: drop S2S audio if text mode + no playback
  * startAudioKeepalive: skip entirely in text mode (no audio channel)
  * sendGreeting: skip greeting if disableGreeting=true

Frontend (test2 + delivery):
- nativeVoiceService: connect accepts clientMode/playAudioReply/disableGreeting
  * resolveWebSocketUrl accepts wsPath param
  * Text mode: no microphone capture, no playback context (unless playAudioReply)
  * New sendText() method for event 501 payload
  * handleAudioMessage drops audio in text mode without playback
  * Export NativeVoiceService class for multi-instance usage
- ChatPanel (test2): new useS2S / playAudioReply props
  * useS2S=true: creates NativeVoiceService instance, connects to /ws/realtime-text
  * subtitle events drive streaming UI, assistant_pending drives loading state
  * handleSend routes to WebSocket in S2S mode, HTTP/SSE in Coze mode
  * Voice link code path unchanged

Verification: mvn test VoiceGatewaySmokeTest 20/20 pass, voice link regression-free
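
The isolation rules listed above reduce to a few small decisions, sketched here in TypeScript for illustration. The function names are hypothetical, and the backend performs the equivalent checks in Java; only the path `/ws/realtime-text` and the flags `textMode`, `playAudioReply`, and `useS2S` come from the commit message.

```typescript
const TEXT_WS_PATH = "/ws/realtime-text";

// Text mode is detected from the WebSocket URL path (query string ignored).
function isTextModePath(requestPath: string): boolean {
  const queryStart = requestPath.indexOf("?");
  const pathname = queryStart === -1 ? requestPath : requestPath.slice(0, queryStart);
  return pathname.endsWith(TEXT_WS_PATH);
}

// Upstream S2S audio is dropped only when in text mode without playback.
function shouldForwardUpstreamAudio(textMode: boolean, playAudioReply: boolean): boolean {
  return !textMode || playAudioReply;
}

// Client audio frames are rejected in text mode.
function shouldAcceptClientAudio(textMode: boolean): boolean {
  return !textMode;
}

// handleSend routes to WebSocket in S2S mode and to HTTP/SSE in Coze mode.
function chooseTransport(useS2S: boolean): "websocket" | "http-sse" {
  return useS2S ? "websocket" : "http-sse";
}
```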
2026-04-17 09:33:56 +08:00
User
ff6a63147b fix(voice-gateway): S2S idle timeout + upstream send lock + iOS AudioContext suspended + port 3012→3013
- P0: S2S DialogAudioIdleTimeoutError now notifies client instead of force-closing, sets upstreamReady=false and cancels keepalive
- P0: Reduce audioKeepaliveIntervalMs from 20s to 8s to prevent S2S idle timeout
- P1: Add upstreamSendLock to prevent concurrent IllegalStateException: Send pending
- P1: iOS AudioContext suspended handling - buffer audio chunks and try resume after user interaction
- P1: disconnect() clears pendingAudioChunks and _resuming to prevent memory leak
- Fix: Frontend hardcoded port 3012→3013 in videoApi.js and vite.config.js
- Add complete Java backend source code to git tracking
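
The iOS handling (buffer chunks while the AudioContext is suspended, replay after resume, clear on disconnect) could be sketched as below. `PendingAudioChunks`, `routeChunk`, and `flushAfterResume` are hypothetical names, and dropping chunks for a closed context is an assumption rather than something the commit states.

```typescript
type AudioContextStateLike = "running" | "suspended" | "closed";

// Holds audio chunks that arrived while playback was not possible.
class PendingAudioChunks {
  private chunks: Uint8Array[] = [];

  push(chunk: Uint8Array): void {
    this.chunks.push(chunk);
  }

  // Returns the buffered chunks in arrival order and empties the buffer.
  drain(): Uint8Array[] {
    const out = this.chunks;
    this.chunks = [];
    return out;
  }

  // Called from disconnect() so buffered audio cannot leak across sessions.
  clear(): void {
    this.chunks = [];
  }

  get size(): number {
    return this.chunks.length;
  }
}

// Plays immediately when running, buffers when suspended,
// and drops the chunk when the context is closed (assumption).
function routeChunk(
  state: AudioContextStateLike,
  chunk: Uint8Array,
  pending: PendingAudioChunks,
  play: (chunk: Uint8Array) => void,
): void {
  if (state === "running") play(chunk);
  else if (state === "suspended") pending.push(chunk);
}

// After a user interaction resumes the context, replay the backlog in order.
function flushAfterResume(pending: PendingAudioChunks, play: (chunk: Uint8Array) => void): void {
  for (const chunk of pending.drain()) play(chunk);
}
```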
2026-04-16 19:16:11 +08:00
User
fe25229de7 feat: conversation long-term memory + fix source ENUM bug
- New: conversationSummarizer.js (LLM summary every 3 turns, loadBestSummary, persistFinalSummary)
- db/index.js: conversation_summaries table, upsertConversationSummary, getSessionSummary
- redisClient.js: setSummary/getSummary (TTL 2h)
- nativeVoiceGateway.js: _turnCount tracking, trigger summarize, persist on close
- realtimeDialogRouting.js: inject summary context, reduce history 5->3 rounds
- Fix: messages source ENUM missing 'search_knowledge' causing chat DB writes to fail
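
The summary cadence and context assembly described above might be sketched like this. The constants come from the commit message (every 3 turns, 3 rounds of history, TTL 2h); the function names and the `ContextInput` shape are hypothetical and do not mirror conversationSummarizer.js.

```typescript
const SUMMARY_EVERY_TURNS = 3; // summarize every 3 turns
const HISTORY_ROUNDS = 3; // history reduced from 5 to 3 rounds
const SUMMARY_TTL_SECONDS = 2 * 60 * 60; // Redis TTL of 2h

// True on turns 3, 6, 9, ... so the LLM summary runs periodically.
function shouldSummarize(turnCount: number): boolean {
  return turnCount > 0 && turnCount % SUMMARY_EVERY_TURNS === 0;
}

interface ContextInput<T> {
  summary: string | null;
  recentRounds: T[];
}

// Combines the stored summary with only the most recent rounds of history.
function buildContext<T>(summary: string | null, rounds: T[]): ContextInput<T> {
  return { summary, recentRounds: rounds.slice(-HISTORY_ROUNDS) };
}
```

The summary carries what the shortened history no longer contains, which is presumably why the two changes landed together.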
2026-04-03 10:19:16 +08:00
User
5b824cd16a refactor(server): optimize KB retrieval and voice context
2026-03-31 09:46:40 +08:00
User
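56940676f6 feat(kb): VikingDB pure retrieval + reranking + Redis context + full-library search + alias expansion + KB protection window + RAG tone guidance
- Add kbRetriever.js: VikingDB search_knowledge pure retrieval replaces Ark chat/completions, doubao-seed-rerank reranking, RAG payload tone guidance to mitigate voice timbre differences

- Add redisClient.js: Redis connection management + 5-turn conversation history + dual-write KB cache

- toolExecutor.js: 25 product alias expansions, full-library retrieval with topK=25, retrieval threshold 0.01, simplified buildDeterministicKnowledgeQuery

- nativeVoiceGateway.js: extended isPureChitchat, 60s KB protection window, tuned prequery parameters

- realtimeDialogRouting.js: resolveReply is aware of the KB protection window, fast-path adapted to raw mode

- app.js: health check now reports redis/reranker/kbRetrievalMode

- Add tests: alias A/B tests, KB retriever tests, Redis client tests, raw-mode integration tests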
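
The commit does not spell out how the 60s KB protection window behaves. One plausible reading, assumed here, is that for 60 seconds after a knowledge-base answer, follow-up turns stay on the KB route. The sketch below encodes only that assumption; `inKbProtectionWindow` is a hypothetical name.

```typescript
const KB_PROTECTION_WINDOW_MS = 60 * 1000; // 60s, from the commit message

// Assumed semantics: the window is open from the last KB hit until 60s later.
// A null timestamp means no KB answer has been given in this session.
function inKbProtectionWindow(lastKbHitAtMs: number | null, nowMs: number): boolean {
  if (lastKbHitAtMs === null) return false;
  const elapsed = nowMs - lastKbHitAtMs;
  return elapsed >= 0 && elapsed < KB_PROTECTION_WINDOW_MS;
}
```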
2026-03-26 14:30:32 +08:00
User
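9567eb7358 feat(server): KB prompt optimization, subtitle fix, S2S reconnect, assistant profile API
- assistantProfileConfig: KB answer prompt changed to a layered strategy (strict product information + flexible common-knowledge supplementation)
- nativeVoiceGateway: automatic S2S upstream reconnect (up to 50 attempts), event 351 subtitle debounce (800ms, keep the longest text)
- toolExecutor: stronger deterministic query rewriting, KB queries pass session context
- contextKeywordTracker: support enrichment that prioritizes KB topic memory
- contentSafeGuard: add brand-safety content filtering service
- assistantProfileService: add assistant profile CRUD service
- routes/assistantProfile: add assistant profile API routes
- knowledgeKeywords: expand KB keyword dictionary
- fastAsrCorrector: update ASR correction rules
- tests/: KB prompt tests, protection window tests, Viking performance tests
- docs/: assistant profile API documentation, system prompt directory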
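
The event 351 subtitle debounce (800ms, keep the longest text) could work along the lines of this sketch. `LongestTextDebouncer` is a hypothetical class, not the code in nativeVoiceGateway; it assumes each new candidate restarts the timer and that the longest candidate seen in the window is emitted once.

```typescript
// Collects subtitle candidates and emits the longest one after a quiet period.
class LongestTextDebouncer {
  private best = "";
  private timer: ReturnType<typeof setTimeout> | null = null;
  private readonly delayMs: number;
  private readonly emit: (text: string) => void;

  constructor(delayMs: number, emit: (text: string) => void) {
    this.delayMs = delayMs;
    this.emit = emit;
  }

  // Remembers the longest text so far and restarts the debounce timer.
  offer(text: string): void {
    if (text.length > this.best.length) this.best = text;
    if (this.timer !== null) clearTimeout(this.timer);
    this.timer = setTimeout(() => this.flush(), this.delayMs);
  }

  // Emits the longest text immediately and resets; safe to call when empty.
  flush(): void {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.best !== "") {
      const text = this.best;
      this.best = "";
      this.emit(text);
    }
  }
}
```

A caller would construct it with `new LongestTextDebouncer(800, sendSubtitle)`, where `sendSubtitle` is whatever forwards the final subtitle to the client.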
2026-03-24 17:19:36 +08:00
User
57a03677a9 fix(voice-kb): sync assistant profile and stabilize reply flow
2026-03-23 13:58:41 +08:00
User
93b8135d51 feat(kb-routing): expand 5-way keyword routing coverage
2026-03-20 10:56:29 +08:00
User
d13084cc0f fix(test2): stabilize voice knowledge-base replies and fill in coverage of popular phrasings
2026-03-18 17:43:13 +08:00
User
c0f038b9b3 fix(test2): fix double-answer bug - blockUpstreamAudio distinguished by ttsType + update ARK endpoint
2026-03-18 16:32:27 +08:00
User
0560db1048 fix: brand protection + full knowledge-base coverage - 6-layer defense to resolve the pyramid-scheme issue + 30+ product keywords filled in
2026-03-17 11:00:09 +08:00
User
f97dd7e3d5 fix(test2): fix voice greeting timing and persistence of duplicate replies
2026-03-16 14:43:51 +08:00
User
5521b673f5 feat: add realtime_dialog and realtime_dialog_external_rag_test projects, update test2 project
2026-03-13 13:06:46 +08:00
User
9dab61345c Update code
2026-03-12 12:47:56 +08:00
AI Knowledge Splitter
92e7fc5bda Initial commit: AI knowledge-base document intelligent chunking tool
2026-03-02 17:38:28 +08:00