8 Commits

Author SHA1 Message Date
User
e145f1d97e feat(s2s-text): dedicated text-mode prompt + Markdown rendering
Architecture fix: voice and text mode now have completely separate prompts.

Backend:
- VoiceAssistantProfileSupport.buildTextSystemRole: dedicated text-mode system
  role that inherits all business rules (identity, KB-first, sensitive topics,
  sales guidance, personal info) but removes voice-specific constraints (short
  sentences, colloquial, single-line conclusion).
- DEFAULT_TEXT_SPEAKING_STYLE: text-specific style demanding detailed,
  structured, Markdown-formatted answers with complete information.
- VoiceGatewayService.handleStart: switch between voice/text system role and
  speaking style based on state.textMode.
- VoiceGatewayService.buildStartSessionPayload: preserve Markdown in text mode
  (voice mode still strips asterisks/backticks via normalizeTextForSpeech to
  avoid TTS pronouncing format chars).
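
The voice/text split in buildStartSessionPayload can be pictured with a minimal sketch. This is not the project's code: the class name, the preparePayloadText helper, and the exact set of stripped characters are assumptions for illustration. Only the name normalizeTextForSpeech and the asterisk/backtick behaviour come from the message above.

```java
// Hypothetical sketch; the real normalizeTextForSpeech may strip more than this.
final class PayloadTextSketch {
    private PayloadTextSketch() {}

    /** Removes formatting characters a TTS engine might read aloud. */
    static String normalizeTextForSpeech(String text) {
        return text.replace("*", "").replace("`", "");
    }

    /** Text mode keeps Markdown intact; voice mode normalizes it for speech. */
    static String preparePayloadText(String text, boolean textMode) {
        return textMode ? text : normalizeTextForSpeech(text);
    }

    public static void main(String[] args) {
        String sample = "**Price**: see `plan-a`";
        System.out.println(preparePayloadText(sample, true));   // Markdown preserved
        System.out.println(preparePayloadText(sample, false));  // asterisks/backticks removed
    }
}
```

Keeping the branch in one helper means the voice path still goes through the same normalization it used before the change.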

Frontend:
- Added react-markdown@9 + remark-gfm@4 dependencies.
- ChatPanel renders assistant messages (non-voice) with ReactMarkdown:
  headings, lists (ul/ol), bold, italic, inline/block code, tables, blockquote,
  links, horizontal rules — all styled with Tailwind classes matching the dark
  theme.
- User messages and voice-handoff messages remain plain text.

Verification: mvn test VoiceGatewaySmokeTest 20/20 pass, vite build succeeds.
2026-04-17 10:10:20 +08:00
User
4b78f81cbc fix(s2s-text): 9 review bugs - text stream, loading, history, unmount safety
Backend (VoiceGatewayService):
- [P0 Bug-1] handleAssistantChunk/Final: text mode must never apply blockUpstreamAudio
  gate to text events (blockUpstreamAudio is for audio frames only). Non-KB text
  queries now correctly stream subtitles back to the client.
- [P0 Bug-3] sendUpstreamChatTextQuery: when upstream not ready, send
  assistant_pending:false before error so client loading spinner can clear.
- [P1 Bug-6] handleUserPartial/handleUserFinal: early-return if textMode, guard
  against spurious ASR echoes from S2S.
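
The two gating rules from Bug-1 and Bug-6 can be written as small predicates. This is a sketch under assumptions: the class and method names are hypothetical, and the voice-mode branch assumes the pre-existing gate behaviour was left as it was.

```java
// Hypothetical sketch of the gating rules; not the actual VoiceGatewayService code.
final class TextEventGateSketch {
    private TextEventGateSketch() {}

    /**
     * blockUpstreamAudio exists to gate audio frames. In text mode, assistant
     * text events always pass; in voice mode the gate is assumed to apply as before.
     */
    static boolean shouldForwardAssistantText(boolean textMode, boolean blockUpstreamAudio) {
        if (textMode) {
            return true;
        }
        return !blockUpstreamAudio;
    }

    /** ASR partial/final events are ignored in text mode (early return in the handlers). */
    static boolean shouldHandleAsrEvent(boolean textMode) {
        return !textMode;
    }
}
```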

Frontend (ChatPanel S2S effect):
- [P0 Bug-2] Connection success now clears error; added cancelled flag to all
  async setState paths to prevent state reversal on unmount.
- [P1 Bug-4] onAssistantPending: (false) always clears isLoading; (true) only
  sets isLoading if not already streaming (streamingId drives UI, pending is
  advisory).
- [P1 Bug-5] S2S mode loads session history via getSessionHistory (was Coze-only).
- [P2 Bug-8] When s2sService ref is null, also remove the placeholder assistant
  bubble to avoid stale empty bubble in chat.
- [P2 Bug-9] All callbacks guard on cancelled flag to prevent React setState
  warnings after unmount (cleanup triggers svc.disconnect which emits
  'disconnected' state).
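
The onAssistantPending rule from Bug-4 is a small decision table. The real handler lives in the React ChatPanel; the sketch below only restates the rule as a pure function, written in Java for consistency with the backend examples. The class and method names are hypothetical.

```java
// Hypothetical restatement of the Bug-4 rule; the real logic is a React callback.
final class PendingStateSketch {
    private PendingStateSketch() {}

    /**
     * pending=false always clears the loading flag.
     * pending=true sets it only when no message is currently streaming;
     * while streaming, the flag is left as it was (pending is advisory).
     */
    static boolean nextIsLoading(boolean currentIsLoading, boolean pending, boolean streaming) {
        if (!pending) {
            return false;
        }
        return streaming ? currentIsLoading : true;
    }
}
```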

Verification: mvn test VoiceGatewaySmokeTest 20/20 pass, no voice regression.
2026-04-17 09:44:36 +08:00
User
3e72cd54d3 feat(app): add textEngine toggle for chat mode (Coze ↔ S2S)
- localStorage-persistent textEngine state ('coze' | 's2s')
- Header button toggles between the two engines when in chat mode
- ChatPanel remounts on engine switch via key=sessionId-textEngine
- Voice mode completely unaffected
2026-04-17 09:36:13 +08:00
User
af9faf26c9 feat(s2s): add S2S text dialog via /ws/realtime-text + event 501 ChatTextQuery
Dual-channel S2S architecture with full isolation between voice and text links:

Backend (Java):
- VolcRealtimeProtocol: add createChatTextQueryMessage (event 501)
- VoiceSessionState: add textMode / playAudioReply / disableGreeting fields
- VoiceWebSocketConfig: register second path /ws/realtime-text (same handler)
- VoiceWebSocketHandler: detect text mode from URL path
- VoiceGatewayService:
  * afterConnectionEstablished: overload with textMode flag
  * handleStart: parse playAudioReply / disableGreeting from client
  * buildStartSessionPayload: inject input_mod=text for text mode
  * handleDirectText: text mode sends event 501 directly, skip processReply
  * handleBinaryMessage: reject client audio in text mode
  * handleUpstreamBinary: drop S2S audio if text mode + no playback
  * startAudioKeepalive: skip entirely in text mode (no audio channel)
  * sendGreeting: skip greeting if disableGreeting=true
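
The routing decisions listed above reduce to a few boolean checks. The following is a sketch, not the handler code itself: the class and method names are hypothetical, and matching on the path suffix is an assumption about how the handler detects text mode. Only the /ws/realtime-text path and the textMode / playAudioReply flags come from the message.

```java
// Hypothetical sketch of text-mode routing; names are illustrative only.
final class TextModeRoutingSketch {
    static final String TEXT_PATH = "/ws/realtime-text";

    private TextModeRoutingSketch() {}

    /** Text mode is inferred from the WebSocket URL path. */
    static boolean isTextMode(String requestPath) {
        return requestPath != null && requestPath.endsWith(TEXT_PATH);
    }

    /** Client audio frames are rejected on the text link. */
    static boolean acceptsClientAudio(boolean textMode) {
        return !textMode;
    }

    /** Upstream S2S audio is dropped in text mode unless playback was requested. */
    static boolean shouldDropUpstreamAudio(boolean textMode, boolean playAudioReply) {
        return textMode && !playAudioReply;
    }

    /** The audio keepalive is only needed when an audio channel exists. */
    static boolean needsAudioKeepalive(boolean textMode) {
        return !textMode;
    }
}
```

Because both paths share one handler, deriving the flag once at connection time and storing it on the session state keeps every later check a plain boolean test.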

Frontend (test2 + delivery):
- nativeVoiceService: connect accepts clientMode/playAudioReply/disableGreeting
  * resolveWebSocketUrl accepts wsPath param
  * Text mode: no microphone capture, no playback context (unless playAudioReply)
  * New sendText() method for event 501 payload
  * handleAudioMessage drops audio in text mode without playback
  * Export NativeVoiceService class for multi-instance usage
- ChatPanel (test2): new useS2S / playAudioReply props
  * useS2S=true: creates NativeVoiceService instance, connects to /ws/realtime-text
  * subtitle events drive streaming UI, assistant_pending drives loading state
  * handleSend routes to WebSocket in S2S mode, HTTP/SSE in Coze mode
  * Voice link code path unchanged

Verification: mvn test VoiceGatewaySmokeTest 20/20 pass, voice link regression-free
2026-04-17 09:33:56 +08:00
User
ff6a63147b fix(voice-gateway): S2S idle timeout + upstream send lock + iOS AudioContext suspended + port 3012→3013
- P0: S2S DialogAudioIdleTimeoutError now notifies client instead of force-closing, sets upstreamReady=false and cancels keepalive
- P0: Reduce audioKeepaliveIntervalMs from 20s to 8s to prevent S2S idle timeout
- P1: Add upstreamSendLock to serialize upstream sends and prevent IllegalStateException: Send pending under concurrent writes
- P1: Handle suspended iOS AudioContext: buffer audio chunks and try to resume after user interaction
- P1: disconnect() clears pendingAudioChunks and _resuming to prevent memory leak
- Fix: Frontend hardcoded port 3012→3013 in videoApi.js and vite.config.js
- Add complete Java backend source code to git tracking
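
The upstreamSendLock item above amounts to serializing writes to a transport that does not tolerate overlapping sends. A minimal sketch, assuming a plain monitor lock and a hypothetical Transport interface (the actual lock type and the upstream session API are not shown in the message):

```java
// Hypothetical sketch; Transport stands in for the real upstream WebSocket session.
final class UpstreamSenderSketch {
    /** Minimal stand-in for a connection that rejects concurrent sends. */
    interface Transport {
        void send(String frame);
    }

    private final Object upstreamSendLock = new Object();
    private final Transport transport;

    UpstreamSenderSketch(Transport transport) {
        this.transport = transport;
    }

    /** Only one thread at a time may be inside transport.send. */
    void send(String frame) {
        synchronized (upstreamSendLock) {
            transport.send(frame);
        }
    }
}
```

With this shape, a keepalive thread and a request-handling thread can both call send without one write starting while another is still in flight.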
2026-04-16 19:16:11 +08:00
User
0560db1048 fix: brand protection + full knowledge-base coverage - 6 layers of defense to resolve the pyramid-scheme issue + 30+ product keywords added 2026-03-17 11:00:09 +08:00
User
5521b673f5 feat: add realtime_dialog and realtime_dialog_external_rag_test projects, update test2 project 2026-03-13 13:06:46 +08:00
User
9dab61345c Update code 2026-03-12 12:47:56 +08:00