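# Fix Plan: Voice Playback of Knowledge-Base Results from FC Callbacks

## Problem Description

A user asks a question during a voice call → the LLM triggers the `search_knowledge` tool → the FC callback runs the knowledge-base query → **the result cannot be played back to the user through S2S voice**.

---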

## Root Cause Analysis

### Root Cause 1: The 200-character limit of ExternalTextToSpeech

**The official documentation states this explicitly** ([Custom Voice Playback](https://www.volcengine.com/docs/6348/1449206)):

> Message: the text to be played, **no longer than 200 characters**.

Knowledge-base results typically run 500 to 2000 characters, far beyond this limit, so the API **silently rejects or truncates** them.
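
### Root Cause 2: Command:"function" is unreliable in hybrid mode

In S2S+LLM hybrid mode (`OutputMode=1`):

- `Command:"function"` returns the tool result to the LLM for processing
- The reply the LLM produces after polishing that result **may not be played through the S2S pipeline**
- The LLM then concludes that the tool returned nothing and triggers **endless retries** (in the logs, the same question appears with 3 different `call_id` values)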

### Root Cause 3: TaskId mismatch

- The `TaskID` in the FC callback is an RTC-internal UUID (e.g. `f6c8cddf-...`)
- The `TaskId` passed to `StartVoiceChat` uses a custom format (e.g. `task_xxx_timestamp`)
- As a result, `UpdateVoiceChat` commands are sent to the wrong task

### Root Cause 4: Latency bottleneck

The original serial flow takes about 18 seconds:

```
1s chunk collection → 0.5s interrupt → 0.5s waiting prompt → 15s KB query → 1s TTS = ~18s
```

---

## Fixes

### Fix 1: Segmented TTS playback (addresses the 200-character limit)

**File**: `server/routes/voice.js`

Split the KB result at natural sentence boundaries into segments of at most 200 characters, then play them one by one through `ExternalTextToSpeech`:

```javascript
// Splitting helper: cut at natural break points such as periods, question marks, and exclamation marks
const MAX_TTS_LEN = 200;    // official limit per ExternalTextToSpeech message
const MAX_TOTAL_LEN = 800;  // cap on total content, so playback does not run too long

const splitForTTS = (text, maxLen) => {
  const segments = [];
  let remaining = text;
  while (remaining.length > 0) {
    if (remaining.length <= maxLen) { segments.push(remaining); break; }
    let cutAt = -1;
    // Break characters in both full-width and ASCII forms
    const breakChars = ['。', '!', '?', ';', '!', '?', ';', '\n', ',', ',', '、'];
    for (const ch of breakChars) {
      // Latest break point that still keeps the segment within maxLen
      const idx = remaining.lastIndexOf(ch, maxLen - 1);
      if (idx > cutAt) cutAt = idx;
    }
    if (cutAt <= 0) cutAt = maxLen;  // no usable break point: hard cut at the limit
    else cutAt += 1;                 // keep the break character with its segment
    segments.push(remaining.substring(0, cutAt));
    remaining = remaining.substring(cutAt).trim();
  }
  return segments.filter(s => s.length > 0);
};
```

Playback strategy:

- **First segment**: `InterruptMode: 1` (high priority; interrupts the waiting prompt)
- **Subsequent segments**: `InterruptMode: 2` (medium priority; queued for playback)
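
The playback strategy can be sketched roughly as follows. This is a minimal illustration, not the actual code in `server/routes/voice.js`: `buildTtsCommands` and `playSegments` are hypothetical helpers, and `send` is a stand-in for whatever wraps `UpdateVoiceChat` (for example `volcengine.updateVoiceChat`). Only the `Command`, `Message`, and `InterruptMode` fields come from the snippets in this document.

```javascript
// Hypothetical helper: turn already-split segments into ExternalTextToSpeech payloads.
const buildTtsCommands = (segments) =>
  segments.map((segment, index) => ({
    Command: 'ExternalTextToSpeech',
    Message: segment,
    // The first segment interrupts the waiting prompt; later ones queue behind it.
    InterruptMode: index === 0 ? 1 : 2,
  }));

// Hypothetical sender loop: `send` is an injected async function (an assumption).
// Segments are sent one at a time so they reach the playback queue in order.
const playSegments = async (segments, send) => {
  for (const command of buildTtsCommands(segments)) {
    await send(command);
  }
};
```

Sending sequentially rather than with `Promise.all` is a deliberate choice in this sketch: queued segments only play in the right order if the requests arrive in that order.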

### Fix 2: Notify the LLM asynchronously via Command:function (addresses endless retries)

After the `ExternalTextToSpeech` playback, send `Command:"function"` **asynchronously** so the LLM knows the tool has returned a result and stops retrying:

```javascript
// Fire-and-forget: playback is already handled by ExternalTextToSpeech,
// so a failure here is only logged and does not block the callback.
if (b.id) {
  volcengine.updateVoiceChat({
    Command: 'function',
    Message: JSON.stringify({ ToolCallID: b.id, Content: contentText.substring(0, 2000) }),
  }).catch(e => console.warn('Command:function failed (non-critical):', e.message));
}
```

### Fix 3: 30-second cooldown against retries (addresses endless LLM retries)

**File**: `server/routes/voice.js`

Once the tool result has been sent, a 30-second cooldown starts, during which repeated calls for the same TaskId are ignored. While the first call is still in flight and no result has been sent yet, a shorter 15-second window, measured from the creation time, applies instead:

```javascript
// 30 s after a result was sent; 15 s from creation while the first call is still pending
const cooldownMs = existing.resultSentAt ? 30000 : 15000;
const elapsed = existing.resultSentAt
  ? (Date.now() - existing.resultSentAt)
  : (Date.now() - existing.createdAt);
if (elapsed < cooldownMs) {
  console.log('Cooldown active, ignoring retry');
  return;
}
```
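
The snippet above assumes an `existing` record with `createdAt` and `resultSentAt` fields. One way such records could be tracked is sketched below. The `createCooldownTracker` factory, the `Map` keyed by TaskId, and the injectable `now` parameter are all assumptions made for illustration; the real bookkeeping in `voice.js` may differ.

```javascript
// Hypothetical in-memory tracker keyed by TaskId; `now` is injectable for testing.
const RESULT_COOLDOWN_MS = 30000;   // window after a result was sent
const PENDING_COOLDOWN_MS = 15000;  // window while the first call is still in flight

const createCooldownTracker = () => {
  const entries = new Map(); // taskId -> { createdAt, resultSentAt }

  return {
    // Returns true if the call should proceed, false if it is a retry to ignore.
    shouldProcess(taskId, now = Date.now()) {
      const existing = entries.get(taskId);
      if (existing) {
        const resultSent = existing.resultSentAt !== null;
        const cooldownMs = resultSent ? RESULT_COOLDOWN_MS : PENDING_COOLDOWN_MS;
        const since = resultSent ? existing.resultSentAt : existing.createdAt;
        if (now - since < cooldownMs) return false;
      }
      // New task, or cooldown expired: start a fresh record.
      entries.set(taskId, { createdAt: now, resultSentAt: null });
      return true;
    },
    // Record that the tool result has been delivered for this task.
    markResultSent(taskId, now = Date.now()) {
      const existing = entries.get(taskId);
      if (existing) existing.resultSentAt = now;
    },
  };
};
```

A callback handler would call `shouldProcess` on entry and `markResultSent` once the TTS segments have been dispatched. A long-running server would also need to evict old entries, which this sketch omits.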

### Fix 4: TaskId resolution priority (addresses the TaskId mismatch)

Resolve the correct TaskId with a three-level fallback:

```javascript
const s2sTaskId = roomToTaskId.get(b.RoomID) || b.S2STaskID || effectiveTaskId;
```

- **First choice**: `roomToTaskId` (the server-side TaskId captured from the StartVoiceChat response)
- **Second choice**: the `S2STaskID` in the callback
- **Last resort**: the original `TaskID` in the callback
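
To make the fallback order concrete, here is a small self-contained sketch. The `rememberTaskId` and `resolveTaskId` helpers are hypothetical names, and the sketch assumes the `roomToTaskId` map is filled when `StartVoiceChat` succeeds. The callback field names (`RoomID`, `S2STaskID`, `TaskID`) follow the ones used in this document.

```javascript
// Assumed registry: RoomID -> TaskId, recorded when StartVoiceChat succeeds.
const roomToTaskId = new Map();

// Hypothetical helper: call this right after a successful StartVoiceChat.
const rememberTaskId = (roomId, taskId) => {
  roomToTaskId.set(roomId, taskId);
};

// Hypothetical helper: resolve the TaskId for an FC callback body `b`,
// trying the recorded mapping first, then S2STaskID, then the raw TaskID.
const resolveTaskId = (b) =>
  roomToTaskId.get(b.RoomID) || b.S2STaskID || b.TaskID;
```

For example, after `rememberTaskId('room1', 'task_abc_1700000000')`, a callback for `room1` resolves to that custom TaskId even if its own `TaskID` field carries an RTC-internal UUID.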

### Fix 5: Latency optimization (cuts ~1.5 seconds of waiting)

**File**: `server/routes/voice.js`

| Optimization | Before | After | Saved |
|--------------|--------|-------|-------|
| Chunk collection timeout | 1000ms | 500ms | 500ms |
| Interrupt command | Sent separately, ~500ms | Removed (`InterruptMode: 1` already interrupts) | 500ms |
| Waiting prompt vs. KB query | Serial | Parallel via `Promise.all` | ~500ms |

Flow after optimization (the parallel stage is bounded by the ~15s KB query):

```
0.5s chunk collection → [waiting prompt + KB query in parallel] → 1s segmented TTS = ~16.5s
```

```javascript
// Run in parallel: the waiting prompt and the KB query start at the same time
const waitingPromptPromise = volcengine.updateVoiceChat({
  Command: 'ExternalTextToSpeech',
  Message: '正在查询知识库,请稍候。', // "Querying the knowledge base, please wait."
  InterruptMode: 1,
}).catch(err => console.warn('Waiting prompt failed:', err.message));

const kbQueryPromise = ToolExecutor.execute(toolName, parsedArgs);

// Only the KB result is needed; the prompt's outcome is ignored
const [, kbResult] = await Promise.all([waitingPromptPromise, kbQueryPromise]);
```
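
The latency figures can be checked with a little arithmetic. The sketch below models each stage with the durations quoted in this document; the object and function names are illustrative only. A serial pipeline sums its stages, while stages run through `Promise.all` cost only as much as the slowest one.

```javascript
// Stage durations in seconds, taken from the two flows in this document.
const before = { chunk: 1, interrupt: 0.5, waitingPrompt: 0.5, kbQuery: 15, tts: 1 };
const after = { chunk: 0.5, waitingPrompt: 0.5, kbQuery: 15, tts: 1 };

// Serial flow: every stage waits for the previous one.
const serialTotal = (s) =>
  s.chunk + s.interrupt + s.waitingPrompt + s.kbQuery + s.tts;

// Optimized flow: no interrupt stage, and the waiting prompt overlaps the KB query,
// so that stage costs max(waitingPrompt, kbQuery).
const parallelTotal = (s) =>
  s.chunk + Math.max(s.waitingPrompt, s.kbQuery) + s.tts;

const savedSeconds = serialTotal(before) - parallelTotal(after); // 18 - 16.5 = 1.5
```

The model makes the remaining bottleneck plain: the KB query accounts for 15 of the 16.5 seconds, so further gains have to come from the query itself.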

### Fix 6: Shorter Ark KB timeout

**File**: `server/services/toolExecutor.js`

```javascript
timeout: 15000, // reduced from 30s to 15s to shorten the wait
```

---

## Modified Files

| File | Change |
|------|--------|
| `server/routes/voice.js` | FC callback handling: segmented TTS, parallel execution, cooldown, TaskId resolution |
| `server/services/toolExecutor.js` | Ark KB timeout reduced from 30s to 15s |
| `server/.env` | `FC_SERVER_URL` updated to the deployment domain |

---

## Key Reference Documentation

- [Custom Voice Playback (ExternalTextToSpeech)](https://www.volcengine.com/docs/6348/1449206): **200-character limit**
- [Function Calling](https://www.volcengine.com/docs/6348/1554654): FC callback mechanism
- [Connecting a Knowledge Base (RAG)](https://www.volcengine.com/docs/6348/1557771): the officially recommended Coze/MCP approach
- [UpdateVoiceChat API](https://www.volcengine.com/docs/6348/2011497): `Command` parameter reference

---
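
## Future Optimizations

If the 15s KB query latency of the current approach is still unacceptable, consider:

1. **Migrate to the Coze Bot built-in knowledge base**: with `LLMConfig.Mode="CozeBot"`, the knowledge-base query runs inside Coze, which removes network round trips
2. **Connect an MCP Server**: integrate directly through the Viking knowledge-base MCP
3. **Local knowledge-base cache**: preload results for high-frequency questions; latency is <1s on a cache hit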
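
As a rough illustration of the third option, a local cache could wrap the KB query as shown below. Everything here is an assumption for the sake of the sketch: the `createKbCache` name, the 10-minute default TTL, the trim-and-lowercase key normalization, and the injected `fetchAnswer` function standing in for the real knowledge-base call.

```javascript
// Hypothetical TTL cache for KB answers; `fetchAnswer` stands in for the real KB query.
const createKbCache = (fetchAnswer, ttlMs = 10 * 60 * 1000) => {
  const cache = new Map(); // normalized question -> { answer, expiresAt }
  const normalize = (q) => q.trim().toLowerCase();

  // Returns the cached answer if it is still fresh, otherwise queries and stores it.
  return async (question, now = Date.now()) => {
    const key = normalize(question);
    const hit = cache.get(key);
    if (hit && hit.expiresAt > now) return hit.answer; // cache hit: no KB round trip
    const answer = await fetchAnswer(question);
    cache.set(key, { answer, expiresAt: now + ttlMs });
    return answer;
  };
};
```

Exact-match keys only help when users phrase a question identically, which is rare in speech; a real deployment would likely need fuzzier matching, and that is outside the scope of this sketch.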