User e145f1d97e feat(s2s-text): dedicated text-mode prompt + Markdown rendering
Architecture fix: voice and text mode now have completely separate prompts.

Backend:
- VoiceAssistantProfileSupport.buildTextSystemRole: dedicated text-mode system
  role that inherits all business rules (identity, KB-first, sensitive topics,
  sales guidance, personal info) but removes voice-specific constraints (short
  sentences, colloquial, single-line conclusion).
- DEFAULT_TEXT_SPEAKING_STYLE: text-specific style demanding detailed,
  structured, Markdown-formatted answers with complete information.
- VoiceGatewayService.handleStart: switch between voice/text system role and
  speaking style based on state.textMode.
- VoiceGatewayService.buildStartSessionPayload: preserve Markdown in text mode
  (voice mode still strips asterisks/backticks via normalizeTextForSpeech to
  avoid TTS pronouncing format chars).

Frontend:
- Added react-markdown@9 + remark-gfm@4 dependencies.
- ChatPanel renders assistant messages (non-voice) with ReactMarkdown:
  headings, lists (ul/ol), bold, italic, inline/block code, tables, blockquote,
  links, horizontal rules — all styled with Tailwind classes matching the dark
  theme.
- User messages and voice-handoff messages remain plain text.

Verification: mvn test VoiceGatewaySmokeTest 20/20 pass, vite build succeeds.
2026-04-17 10:10:20 +08:00

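AI Knowledge Base Document Smart Chunking Tool

Parses documents in a range of formats into plain text, splits that text into semantically coherent chunks through the DeepSeek API, and writes the result to a Markdown file.

Supported formats

PDF, Word (.docx), Excel (.xlsx/.xls), CSV, HTML, TXT/MD, images (PNG/JPG/BMP/GIF/WEBP)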
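
One way to picture how the supported formats could be routed to a parser is an extension lookup. This is only a minimal sketch: the `PARSER_BY_EXTENSION` table, the category names, and `detect_parser` are hypothetical and are not taken from the project's code. Only the list of extensions comes from the README.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a parser category.
# The extensions come from the README; the category names are illustrative.
PARSER_BY_EXTENSION = {
    ".pdf": "pdf",
    ".docx": "word",
    ".xlsx": "excel",
    ".xls": "excel",
    ".csv": "csv",
    ".html": "html",
    ".txt": "text",
    ".md": "text",
    ".png": "image",
    ".jpg": "image",
    ".bmp": "image",
    ".gif": "image",
    ".webp": "image",
}


def detect_parser(path: str) -> str:
    """Return the parser category for a file, based on its extension."""
    suffix = Path(path).suffix.lower()  # case-insensitive match
    try:
        return PARSER_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file format: {suffix or '(none)'}") from None
```

Failing early on an unknown extension keeps an unsupported file from reaching the API call at all.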

Installation

cd ai-knowledge-splitter
pip install -r requirements.txt

Usage

python main.py <input-file> -k <DeepSeek API Key> [-o output-path] [-d delimiter]

Examples:

# Basic usage (writes a .md file with the same name as the input)
python main.py report.pdf -k sk-xxxxxxxx

# Specify the output path
python main.py data.docx -k sk-xxxxxxxx -o output/result.md

# Use a custom delimiter
python main.py notes.txt -k sk-xxxxxxxx -d "==="
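
The examples above imply two small behaviors: the default output path reuses the input name with a .md extension, and chunks are separated by the chosen delimiter. The sketch below shows one plausible way to implement both. The function names `default_output_path` and `join_chunks` are assumptions made for illustration, and the blank lines placed around the delimiter are a guess at the output layout, not something the README states.

```python
from pathlib import Path


def default_output_path(input_file: str) -> Path:
    """Derive the output path: same directory and name, .md extension."""
    return Path(input_file).with_suffix(".md")


def join_chunks(chunks: list[str], delimiter: str = "---") -> str:
    """Join non-empty chunks, placing the delimiter on its own line between them."""
    cleaned = [chunk.strip() for chunk in chunks if chunk.strip()]
    # Assumed layout: a blank line on each side of the delimiter.
    return f"\n\n{delimiter}\n\n".join(cleaned) + "\n"
```

With the default delimiter, a line containing only `---` renders as a horizontal rule in Markdown, so the chunk boundaries stay visible in a rendered preview.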

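Parameters

Parameter        Required  Description
input_file       Yes       Path to the input file
-k, --api-key    Yes       DeepSeek API Key
-o, --output     No        Output file path (default: same name with a .md extension)
-d, --delimiter  No        Chunk delimiter (default: ---)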
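
The parameters above map naturally onto Python's standard `argparse` module. The following is a sketch of what such a parser might look like, not the project's actual `main.py`. The `build_parser` name and the help strings are assumptions, and `-k` is marked as required only because the usage line shows it without brackets.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser mirroring the documented parameters (illustrative)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Split a document into semantic chunks and write Markdown.",
    )
    parser.add_argument("input_file", help="Path to the input file")
    # Assumed required, since the usage line shows -k without brackets.
    parser.add_argument("-k", "--api-key", required=True, help="DeepSeek API Key")
    # None means "derive the path from the input file name".
    parser.add_argument("-o", "--output", default=None, help="Output file path")
    parser.add_argument("-d", "--delimiter", default="---", help="Chunk delimiter")
    return parser
```

For example, `build_parser().parse_args(["notes.txt", "-k", "sk-xxxxxxxx", "-d", "==="])` yields a namespace with `input_file`, `api_key`, `output`, and `delimiter` attributes.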

Running the tests

cd ai-knowledge-splitter
pytest tests/ -v