- textEngine state ('coze' | 's2s') persisted in localStorage
- Header button toggles between the two engines when in chat mode
- ChatPanel remounts on engine switch via key=sessionId-textEngine
- Voice mode is unaffected
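A minimal sketch of the engine toggle described above, assuming a storage key named 'textEngine' and a default of 'coze' (both hypothetical, as are the helper names). It takes a small storage interface instead of calling window.localStorage directly so that it also runs outside a browser:

```typescript
type TextEngine = 'coze' | 's2s';

// Subset of the Web Storage API used here; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical key name; the real key is not stated in this change.
const STORAGE_KEY = 'textEngine';

// Read the persisted engine, falling back to 'coze' (assumed default)
// when nothing valid is stored.
function loadTextEngine(store: KeyValueStore): TextEngine {
  return store.getItem(STORAGE_KEY) === 's2s' ? 's2s' : 'coze';
}

// Flip between the two engines and persist the new choice.
function toggleTextEngine(store: KeyValueStore, current: TextEngine): TextEngine {
  const next: TextEngine = current === 'coze' ? 's2s' : 'coze';
  store.setItem(STORAGE_KEY, next);
  return next;
}

// A key that changes whenever the session or the engine changes, so a
// component keyed on it is remounted on engine switch.
function chatPanelKey(sessionId: string, engine: TextEngine): string {
  return `${sessionId}-${engine}`;
}

// In-memory stand-in for localStorage, for tests and non-browser runs.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}
```

In the browser, window.localStorage could be passed as the store. Because the key includes the engine, switching engines discards the previous ChatPanel state rather than reusing it.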
Dual-channel S2S architecture with full isolation between voice and text links:
Backend (Java):
- VolcRealtimeProtocol: add createChatTextQueryMessage (event 501)
- VoiceSessionState: add textMode / playAudioReply / disableGreeting fields
- VoiceWebSocketConfig: register second path /ws/realtime-text (same handler)
- VoiceWebSocketHandler: detect text mode from URL path
- VoiceGatewayService:
* afterConnectionEstablished: overload with textMode flag
* handleStart: parse playAudioReply / disableGreeting from client
* buildStartSessionPayload: inject input_mod=text for text mode
* handleDirectText: in text mode, send event 501 directly and skip processReply
* handleBinaryMessage: reject client audio in text mode
* handleUpstreamBinary: drop S2S audio in text mode when playAudioReply is false
* startAudioKeepalive: skip entirely in text mode (no audio channel)
* sendGreeting: skip greeting if disableGreeting=true
Frontend (test2 + delivery):
- nativeVoiceService: connect accepts clientMode/playAudioReply/disableGreeting
* resolveWebSocketUrl accepts wsPath param
* Text mode: no microphone capture, no playback context (unless playAudioReply)
* New sendText() method for event 501 payload
* handleAudioMessage drops audio in text mode when playAudioReply is false
* Export NativeVoiceService class for multi-instance usage
- ChatPanel (test2): new useS2S / playAudioReply props
* useS2S=true: creates NativeVoiceService instance, connects to /ws/realtime-text
* subtitle events drive streaming UI, assistant_pending drives loading state
* handleSend routes to WebSocket in S2S mode, HTTP/SSE in Coze mode
* Voice link code path is unchanged
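The frontend rules above can be summarized in a short sketch. This is an illustration of the decision logic only, not the actual nativeVoiceService or ChatPanel code. The 'voice' | 'text' mode values, the ConnectOptions shape, and the helper names planResources, shouldDropAudio, and routeSend are assumptions:

```typescript
// Hypothetical value names for the clientMode option.
type ClientMode = 'voice' | 'text';

interface ConnectOptions {
  clientMode: ClientMode;
  playAudioReply: boolean;
  disableGreeting: boolean;
}

// Which browser audio resources a connection needs.
interface ResourcePlan {
  captureMicrophone: boolean;
  createPlaybackContext: boolean;
}

// Text mode never captures the microphone and only creates a playback
// context when audio replies are requested.
function planResources(opts: ConnectOptions): ResourcePlan {
  const isText = opts.clientMode === 'text';
  return {
    captureMicrophone: !isText,
    createPlaybackContext: !isText || opts.playAudioReply,
  };
}

// Incoming audio is dropped only in text mode with playback disabled.
function shouldDropAudio(opts: ConnectOptions): boolean {
  return opts.clientMode === 'text' && !opts.playAudioReply;
}

// Anything that can deliver a user message: the S2S WebSocket client
// (via sendText) or a wrapper around the Coze HTTP/SSE request.
interface TextSender {
  sendText(text: string): void;
}

// Route a chat message to exactly one engine based on useS2S.
function routeSend(text: string, useS2S: boolean, s2s: TextSender, coze: TextSender): void {
  (useS2S ? s2s : coze).sendText(text);
}
```

Keeping these as pure functions of the options makes the text-mode behavior testable without a live WebSocket or audio context.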
Verification: mvn test (VoiceGatewaySmokeTest) passes 20/20; no regressions on the voice link