2026-03-29 11:07:30 +08:00
parent 136ddc270c
commit 140dd3ca35
26 changed files with 4 additions and 2837 deletions

.gitignore vendored
View File

@@ -7,3 +7,4 @@
**/*.difypkg
urbanLifeServ/*
*/.data
docs

Submodule ai-management-dify updated: 9fffb6e421...0de13a3495

Submodule ai-management-platform updated: 6bbe3e4181...085ef040ae

View File

@@ -1,146 +0,0 @@
> ## Documentation Index
> Fetch the complete documentation index at: https://docs.dify.ai/llms.txt
> Use this file to discover all available pages before exploring further.
# CLI
> Command-line interface for Dify plugin development
<Note> ⚠️ This document was translated automatically by AI. If you find any inaccuracies, please refer to the [English original](/en/develop-plugin/getting-started/cli).</Note>
Set up and package your Dify plugin using the command-line interface (CLI). The CLI provides a streamlined way to manage your plugin development workflow, from initialization to packaging.
This guide walks you through using the CLI for Dify plugin development.
## Prerequisites
Before you begin, make sure you have the following installed:
* Python version ≥ 3.12
* Dify CLI
* Homebrew (for Mac users)
## Create a Dify Plugin Project
<Tabs>
<Tab title="Mac">
```bash theme={null}
brew tap langgenius/dify
brew install dify
```
</Tab>
<Tab title="Linux">
Get the latest Dify CLI from the [Dify GitHub releases page](https://github.com/langgenius/dify-plugin-daemon/releases):
```bash theme={null}
# Download dify-plugin-linux-amd64 from the releases page
chmod +x dify-plugin-linux-amd64
mv dify-plugin-linux-amd64 dify
sudo mv dify /usr/local/bin/
```
</Tab>
</Tabs>
You now have the Dify CLI installed. You can verify the installation by running:
```bash theme={null}
dify version
```
You can create a new Dify plugin project with the following command:
```bash theme={null}
dify plugin init
```
Fill in the required fields as prompted:
```bash theme={null}
Edit profile of the plugin
Plugin name (press Enter to next step): hello-world
Author (press Enter to next step): langgenius
Description (press Enter to next step): hello world example
Repository URL (Optional) (press Enter to next step):
Enable multilingual README: [✔] English is required by default
Languages to generate:
English: [✔] (required)
→ 简体中文 (Simplified Chinese): [✔]
日本語 (Japanese): [✘]
Português (Portuguese - Brazil): [✘]
Controls:
↑/↓ Navigate • Space/Tab Toggle selection • Enter Next step
```
Select `python` and press Enter to continue with the Python plugin template.
```bash theme={null}
Select the type of plugin you want to create, and press `Enter` to continue
Before starting, here's some basic knowledge about Plugin types in Dify:
- Tool: Tool Providers like Google Search, Stable Diffusion, etc. Used to perform specific tasks.
- Model: Model Providers like OpenAI, Anthropic, etc. Use their models to enhance AI capabilities.
- Endpoint: Similar to Service API in Dify and Ingress in Kubernetes. Extend HTTP services as endpoints with custom logic.
- Agent Strategy: Implement your own agent strategies like Function Calling, ReAct, ToT, CoT, etc.
Based on the ability you want to extend, Plugins are divided into four types: Tool, Model, Extension, and Agent Strategy
- Tool: A tool provider that can also implement endpoints. For example, building a Discord Bot requires both sending and receiving messages.
- Model: Strictly for model providers, no other extensions allowed.
- Extension: For simple HTTP services that extend functionality.
- Agent Strategy: Implement custom agent logic with a focused approach.
We've provided templates to help you get started. Choose one of the options below:
-> tool
agent-strategy
llm
text-embedding
rerank
tts
speech2text
moderation
extension
```
Enter the minimum required Dify version, or leave it blank to use the latest:
```bash theme={null}
Edit minimal Dify version requirement, leave it blank by default
Minimal Dify version (press Enter to next step):
```
You are all set. The CLI will create a new directory named after your plugin and set up its basic structure for you.
```bash theme={null}
cd hello-world
```
## Run the Plugin
Make sure you are in the hello-world directory:
```bash theme={null}
cp .env.example .env
```
Edit the `.env` file to set your plugin's environment variables, such as API keys or other configuration. You can find these values in the Dify dashboard: log in to your Dify environment, click the "Plugins" icon in the top-right corner, then click the debug icon (the bug-shaped icon). In the pop-up window, copy the "API Key" and "Host Address".
```bash theme={null}
INSTALL_METHOD=remote
REMOTE_INSTALL_HOST=debug-plugin.dify.dev
REMOTE_INSTALL_PORT=5003
REMOTE_INSTALL_KEY=********-****-****-****-************
```
You can now run your plugin locally with the following commands:
```bash theme={null}
pip install -r requirements.txt
python -m main
```
***
[编辑此页面](https://github.com/langgenius/dify-docs/edit/main/en/develop-plugin/getting-started/cli.mdx) | [报告问题](https://github.com/langgenius/dify-docs/issues/new?template=docs.yml)

View File

@@ -1,184 +0,0 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
.python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
Pipfile.lock
# UV
# Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
uv.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/
# Vscode
.vscode/
# Git
.git/
.gitignore
.github/
# Mac
.DS_Store
# Windows
Thumbs.db
# Dify plugin packages
# To prevent packaging repetitively
*.difypkg

View File

@@ -1,3 +0,0 @@
INSTALL_METHOD=remote
REMOTE_INSTALL_URL=debug.dify.ai:5003
REMOTE_INSTALL_KEY=********-****-****-****-************

View File

@@ -1,109 +0,0 @@
name: Plugin Publish Workflow

on:
  release:
    types: [published]

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Download CLI tool
        run: |
          mkdir -p $RUNNER_TEMP/bin
          cd $RUNNER_TEMP/bin
          wget https://github.com/langgenius/dify-plugin-daemon/releases/download/0.0.6/dify-plugin-linux-amd64
          chmod +x dify-plugin-linux-amd64
          echo "CLI tool location:"
          pwd
          ls -la dify-plugin-linux-amd64

      - name: Get basic info from manifest
        id: get_basic_info
        run: |
          PLUGIN_NAME=$(grep "^name:" manifest.yaml | cut -d' ' -f2)
          echo "Plugin name: $PLUGIN_NAME"
          echo "plugin_name=$PLUGIN_NAME" >> $GITHUB_OUTPUT

          VERSION=$(grep "^version:" manifest.yaml | cut -d' ' -f2)
          echo "Plugin version: $VERSION"
          echo "version=$VERSION" >> $GITHUB_OUTPUT

          # If the author's name is not your GitHub username, you can change the author here
          AUTHOR=$(grep "^author:" manifest.yaml | cut -d' ' -f2)
          echo "Plugin author: $AUTHOR"
          echo "author=$AUTHOR" >> $GITHUB_OUTPUT

      - name: Package Plugin
        id: package
        run: |
          cd $GITHUB_WORKSPACE
          PACKAGE_NAME="${{ steps.get_basic_info.outputs.plugin_name }}-${{ steps.get_basic_info.outputs.version }}.difypkg"
          $RUNNER_TEMP/bin/dify-plugin-linux-amd64 plugin package . -o "$PACKAGE_NAME"
          echo "Package result:"
          ls -la "$PACKAGE_NAME"
          echo "package_name=$PACKAGE_NAME" >> $GITHUB_OUTPUT
          echo -e "\nFull file path:"
          pwd
          echo -e "\nDirectory structure:"
          tree || ls -R

      - name: Checkout target repo
        uses: actions/checkout@v3
        with:
          repository: ${{ steps.get_basic_info.outputs.author }}/dify-plugins
          path: dify-plugins
          token: ${{ secrets.PLUGIN_ACTION }}
          fetch-depth: 1
          persist-credentials: true

      - name: Prepare and create PR
        run: |
          PACKAGE_NAME="${{ steps.get_basic_info.outputs.plugin_name }}-${{ steps.get_basic_info.outputs.version }}.difypkg"
          mkdir -p dify-plugins/${{ steps.get_basic_info.outputs.author }}/${{ steps.get_basic_info.outputs.plugin_name }}
          mv "$PACKAGE_NAME" dify-plugins/${{ steps.get_basic_info.outputs.author }}/${{ steps.get_basic_info.outputs.plugin_name }}/
          cd dify-plugins
          git config user.name "GitHub Actions"
          git config user.email "actions@github.com"
          git fetch origin main
          git checkout main
          git pull origin main
          BRANCH_NAME="bump-${{ steps.get_basic_info.outputs.plugin_name }}-plugin-${{ steps.get_basic_info.outputs.version }}"
          git checkout -b "$BRANCH_NAME"
          git add .
          git commit -m "bump ${{ steps.get_basic_info.outputs.plugin_name }} plugin to version ${{ steps.get_basic_info.outputs.version }}"
          git push -u origin "$BRANCH_NAME" --force
          git branch -a
          echo "Waiting for branch to sync..."
          sleep 10 # Wait 10 seconds for branch sync

      - name: Create PR via GitHub API
        env:
          # How to configure the token:
          # 1. Profile -> Settings -> Developer settings -> Personal access tokens -> Generate new token (with repo scope) -> Copy the token
          # 2. Go to the target repository -> Settings -> Secrets and variables -> Actions -> New repository secret -> Add the token as PLUGIN_ACTION
          GH_TOKEN: ${{ secrets.PLUGIN_ACTION }}
        run: |
          # The head branch must match the BRANCH_NAME pushed in the previous step
          gh pr create \
            --repo langgenius/dify-plugins \
            --head "${{ steps.get_basic_info.outputs.author }}:bump-${{ steps.get_basic_info.outputs.plugin_name }}-plugin-${{ steps.get_basic_info.outputs.version }}" \
            --base main \
            --title "bump ${{ steps.get_basic_info.outputs.plugin_name }} plugin to version ${{ steps.get_basic_info.outputs.version }}" \
            --body "bump ${{ steps.get_basic_info.outputs.plugin_name }} plugin package to version ${{ steps.get_basic_info.outputs.version }}

          Changes:
          - Updated plugin package file" || echo "PR already exists or creation skipped." # Handle cases where PR already exists
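The workflow above extracts manifest fields with plain `grep`/`cut`, which assumes unindented top-level keys with single-word values. A quick local check of that extraction, using hypothetical manifest values:

```shell
# Recreate a minimal manifest and run the same extraction the workflow uses.
cat > /tmp/manifest.yaml <<'EOF'
name: pdf
version: 0.0.1
author: yslg
EOF

# Same commands as the "Get basic info from manifest" step.
PLUGIN_NAME=$(grep "^name:" /tmp/manifest.yaml | cut -d' ' -f2)
VERSION=$(grep "^version:" /tmp/manifest.yaml | cut -d' ' -f2)
echo "$PLUGIN_NAME-$VERSION.difypkg"  # prints pdf-0.0.1.difypkg
```

Note that a value containing spaces (e.g. `name: my plugin`) would be truncated by `cut -d' ' -f2`, so keep manifest values single-word when relying on this workflow.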

View File

@@ -1,176 +0,0 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# UV
# Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
#uv.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/
# Vscode
.vscode/
# macOS
.DS_Store
.AppleDouble
.LSOverride

View File

@@ -1,137 +0,0 @@
# Dify Plugin Development Guide
Welcome to Dify plugin development! This guide will help you get started quickly.
## Plugin Types
Dify plugins extend three main capabilities:
| Type | Description | Example |
|------|-------------|---------|
| **Tool** | Perform specific tasks | Google Search, Stable Diffusion |
| **Model** | AI model integrations | OpenAI, Anthropic |
| **Endpoint** | HTTP services | Custom APIs, integrations |
You can create:
- **Tool**: Tool provider with optional endpoints (e.g., Discord bot)
- **Model**: Model provider only
- **Extension**: Simple HTTP service
## Setup
### Requirements
- Python 3.11+
- Dependencies: `pip install -r requirements.txt`
## Development Process
<details>
<summary><b>1. Manifest Structure</b></summary>
Edit `manifest.yaml` to describe your plugin:
```yaml
version: 0.1.0                      # Required: Plugin version
type: plugin                        # Required: plugin or bundle
author: YourOrganization            # Required: Organization name
label:                              # Required: Multi-language names
  en_US: Plugin Name
  zh_Hans: 插件名称
created_at: 2023-01-01T00:00:00Z    # Required: Creation time (RFC3339)
icon: assets/icon.png               # Required: Icon path

# Resources and permissions
resource:
  memory: 268435456                 # Max memory (bytes)
  permission:
    tool:
      enabled: true                 # Tool permission
    model:
      enabled: true                 # Model permission
      llm: true
      text_embedding: false
      # Other model types...
    # Other permissions...

# Extensions definition
plugins:
  tools:
    - tools/my_tool.yaml            # Tool definition files
  models:
    - models/my_model.yaml          # Model definition files
  endpoints:
    - endpoints/my_api.yaml         # Endpoint definition files

# Runtime metadata
meta:
  version: 0.0.1                    # Manifest format version
  arch:
    - amd64
    - arm64
  runner:
    language: python
    version: "3.12"
    entrypoint: main
```
**Restrictions:**
- Cannot extend both tools and models
- Must have at least one extension
- Cannot extend both models and endpoints
- Limited to one supplier per extension type
</details>
<details>
<summary><b>2. Implementation Examples</b></summary>
Study these examples to understand plugin implementation:
- [OpenAI](https://github.com/langgenius/dify-plugin-sdks/tree/main/python/examples/openai) - Model provider
- [Google Search](https://github.com/langgenius/dify-plugin-sdks/tree/main/python/examples/google) - Tool provider
- [Neko](https://github.com/langgenius/dify-plugin-sdks/tree/main/python/examples/neko) - Endpoint group
</details>
<details>
<summary><b>3. Testing & Debugging</b></summary>
1. Copy `.env.example` to `.env` and configure:
```
INSTALL_METHOD=remote
REMOTE_INSTALL_URL=debug.dify.ai:5003
REMOTE_INSTALL_KEY=your-debug-key
```
2. Run your plugin:
```bash
python -m main
```
3. Refresh your Dify instance to see the plugin (marked as "debugging")
</details>
<details>
<summary><b>4. Publishing</b></summary>
#### Manual Packaging
```bash
dify-plugin plugin package ./YOUR_PLUGIN_DIR
```
#### Automated GitHub Workflow
Configure GitHub Actions to automate PR creation:
1. Create a Personal Access Token for your forked repository
2. Add it as `PLUGIN_ACTION` secret in your source repo
3. Create `.github/workflows/plugin-publish.yml`
When you create a release, the action will:
- Package your plugin
- Create a PR to your fork
[Detailed workflow documentation](https://docs.dify.ai/plugins/publish-plugins/plugin-auto-publish-pr)
</details>
## Privacy Policy
If publishing to the Marketplace, provide a privacy policy in [PRIVACY.md](PRIVACY.md).

View File

@@ -1,3 +0,0 @@
## Privacy
!!! Please fill in the privacy policy of the plugin.

View File

@@ -1,10 +0,0 @@
## pdf
**Author:** yslg
**Version:** 0.0.1
**Type:** tool
### Description

View File

@@ -1,55 +0,0 @@
<!--
~ Dify Marketplace Template Icon
~ Dify 市场模板图标
~ Dify マーケットプレイステンプレートアイコン
~
~ WARNING / 警告 / 警告:
~
~ English: This is a TEMPLATE icon from Dify Marketplace only. You MUST NOT use this default icon in any way.
~ Please replace it with your own custom icon before submitting this plugin.
~
~ 中文: 这只是来自 Dify 市场的模板图标。您绝对不能以任何方式使用此默认图标。
~ 请在提交此插件之前将其替换为您自己的自定义图标。
~
~ 日本語: これは Dify マーケットプレイスのテンプレートアイコンです。このデフォルトアイコンをいかなる方法でも使用してはいけません。
~ このプラグインを提出する前に、独自のカスタムアイコンに置き換えてください。
~
~ DIFY_MARKETPLACE_TEMPLATE_ICON_DO_NOT_USE
-->
<svg width="40" height="40" viewBox="0 0 40 40" fill="none" xmlns="http://www.w3.org/2000/svg">
<g clip-path="url(#clip0_15253_95095)">
<rect width="40" height="40" fill="#0033FF"/>
<g filter="url(#filter0_n_15253_95095)">
<rect width="40" height="40" fill="url(#paint0_linear_15253_95095)"/>
</g>
<path d="M28 10C28.5523 10 29 10.4477 29 11V16C29 16.5523 28.5523 17 28 17H23V30C23 30.5523 22.5523 31 22 31H18C17.4477 31 17 30.5523 17 30V17H11.5C10.9477 17 10.5 16.5523 10.5 16V13.618C10.5 13.2393 10.714 12.893 11.0528 12.7236L16.5 10H28ZM23 12H16.9721L12.5 14.2361V15H19V29H21V15H23V12ZM27 12H25V15H27V12Z" fill="white"/>
</g>
<defs>
<filter id="filter0_n_15253_95095" x="0" y="0" width="40" height="40" filterUnits="userSpaceOnUse" color-interpolation-filters="sRGB">
<feFlood flood-opacity="0" result="BackgroundImageFix"/>
<feBlend mode="normal" in="SourceGraphic" in2="BackgroundImageFix" result="shape"/>
<feTurbulence type="fractalNoise" baseFrequency="2 2" stitchTiles="stitch" numOctaves="3" result="noise" seed="8033" />
<feComponentTransfer in="noise" result="coloredNoise1">
<feFuncR type="linear" slope="2" intercept="-0.5" />
<feFuncG type="linear" slope="2" intercept="-0.5" />
<feFuncB type="linear" slope="2" intercept="-0.5" />
<feFuncA type="discrete" tableValues="1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 "/>
</feComponentTransfer>
<feComposite operator="in" in2="shape" in="coloredNoise1" result="noise1Clipped" />
<feComponentTransfer in="noise1Clipped" result="color1">
<feFuncA type="table" tableValues="0 0.06" />
</feComponentTransfer>
<feMerge result="effect1_noise_15253_95095">
<feMergeNode in="shape" />
<feMergeNode in="color1" />
</feMerge>
</filter>
<linearGradient id="paint0_linear_15253_95095" x1="0" y1="0" x2="40" y2="40" gradientUnits="userSpaceOnUse">
<stop stop-color="#1443FF"/>
<stop offset="1" stop-color="#0031F5"/>
</linearGradient>
<clipPath id="clip0_15253_95095">
<rect width="40" height="40" fill="white"/>
</clipPath>
</defs>
</svg>


View File

@@ -1,55 +0,0 @@
<!--
~ Dify Marketplace Template Icon
~ Dify 市场模板图标
~ Dify マーケットプレイステンプレートアイコン
~
~ WARNING / 警告 / 警告:
~
~ English: This is a TEMPLATE icon from Dify Marketplace only. You MUST NOT use this default icon in any way.
~ Please replace it with your own custom icon before submitting this plugin.
~
~ 中文: 这只是来自 Dify 市场的模板图标。您绝对不能以任何方式使用此默认图标。
~ 请在提交此插件之前将其替换为您自己的自定义图标。
~
~ 日本語: これは Dify マーケットプレイスのテンプレートアイコンです。このデフォルトアイコンをいかなる方法でも使用してはいけません。
~ このプラグインを提出する前に、独自のカスタムアイコンに置き換えてください。
~
~ DIFY_MARKETPLACE_TEMPLATE_ICON_DO_NOT_USE
-->
<svg width="40" height="40" viewBox="0 0 40 40" fill="none" xmlns="http://www.w3.org/2000/svg">
<g clip-path="url(#clip0_15255_46435)">
<rect width="40" height="40" fill="#0033FF"/>
<g filter="url(#filter0_n_15255_46435)">
<rect width="40" height="40" fill="url(#paint0_linear_15255_46435)"/>
</g>
<path d="M28 10C28.5523 10 29 10.4477 29 11V16C29 16.5523 28.5523 17 28 17H23V30C23 30.5523 22.5523 31 22 31H18C17.4477 31 17 30.5523 17 30V17H11.5C10.9477 17 10.5 16.5523 10.5 16V13.618C10.5 13.2393 10.714 12.893 11.0528 12.7236L16.5 10H28ZM23 12H16.9721L12.5 14.2361V15H19V29H21V15H23V12ZM27 12H25V15H27V12Z" fill="white"/>
</g>
<defs>
<filter id="filter0_n_15255_46435" x="0" y="0" width="40" height="40" filterUnits="userSpaceOnUse" color-interpolation-filters="sRGB">
<feFlood flood-opacity="0" result="BackgroundImageFix"/>
<feBlend mode="normal" in="SourceGraphic" in2="BackgroundImageFix" result="shape"/>
<feTurbulence type="fractalNoise" baseFrequency="2 2" stitchTiles="stitch" numOctaves="3" result="noise" seed="8033" />
<feComponentTransfer in="noise" result="coloredNoise1">
<feFuncR type="linear" slope="2" intercept="-0.5" />
<feFuncG type="linear" slope="2" intercept="-0.5" />
<feFuncB type="linear" slope="2" intercept="-0.5" />
<feFuncA type="discrete" tableValues="1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 "/>
</feComponentTransfer>
<feComposite operator="in" in2="shape" in="coloredNoise1" result="noise1Clipped" />
<feComponentTransfer in="noise1Clipped" result="color1">
<feFuncA type="table" tableValues="0 0.06" />
</feComponentTransfer>
<feMerge result="effect1_noise_15255_46435">
<feMergeNode in="shape" />
<feMergeNode in="color1" />
</feMerge>
</filter>
<linearGradient id="paint0_linear_15255_46435" x1="0" y1="0" x2="40" y2="40" gradientUnits="userSpaceOnUse">
<stop stop-color="#1F4CFF"/>
<stop offset="1" stop-color="#0033FF"/>
</linearGradient>
<clipPath id="clip0_15255_46435">
<rect width="40" height="40" fill="white"/>
</clipPath>
</defs>
</svg>


View File

@@ -1,6 +0,0 @@
from dify_plugin import Plugin, DifyPluginEnv

plugin = Plugin(DifyPluginEnv(MAX_REQUEST_TIMEOUT=120))

if __name__ == '__main__':
    plugin.run()

View File

@@ -1,40 +0,0 @@
version: 0.0.1
type: plugin
author: yslg
name: pdf
label:
  en_US: pdf
  ja_JP: pdf
  zh_Hans: pdf
  pt_BR: pdf
description:
  en_US: pdfTools
  ja_JP: pdfTools
  zh_Hans: pdfTools
  pt_BR: pdfTools
icon: icon.svg
icon_dark: icon-dark.svg
resource:
  memory: 268435456
  permission:
    tool:
      enabled: true
    model:
      enabled: true
      llm: true
plugins:
  tools:
    - provider/pdf.yaml
meta:
  version: 0.0.1
  arch:
    - amd64
    - arm64
  runner:
    language: python
    version: "3.12"
    entrypoint: main
minimum_dify_version: null
created_at: 2026-03-02T13:21:03.2806864+08:00
privacy: PRIVACY.md
verified: false

View File

@@ -1,64 +0,0 @@
{
  "name": "pdf-plugin",
  "version": "1.0.0",
  "description": "PDF plugin for analyzing table of contents and extracting text",
  "author": "System",
  "type": "tool",
  "main": "main.py",
  "requirements": "requirements.txt",
  "icon": "https://neeko-copilot.bytedance.net/api/text2image?prompt=PDF%20document%20icon&size=square",
  "settings": [
    {
      "key": "debug",
      "type": "boolean",
      "default": false,
      "description": "Enable debug mode"
    }
  ],
  "functions": [
    {
      "name": "analyze_toc",
      "description": "Analyze PDF and find table of contents",
      "parameters": {
        "type": "object",
        "properties": {
          "file": {
            "type": "file",
            "description": "PDF file to analyze",
            "fileTypes": ["pdf"]
          }
        },
        "required": ["file"]
      }
    },
    {
      "name": "extract_text",
      "description": "Extract text from specified page range",
      "parameters": {
        "type": "object",
        "properties": {
          "file": {
            "type": "file",
            "description": "PDF file to extract text from",
            "fileTypes": ["pdf"]
          },
          "page_range": {
            "type": "object",
            "properties": {
              "start": {
                "type": "integer",
                "default": 0,
                "description": "Start page index"
              },
              "end": {
                "type": "integer",
                "description": "End page index"
              }
            }
          }
        },
        "required": ["file"]
      }
    }
  ]
}

View File

@@ -1,53 +0,0 @@
from typing import Any

from dify_plugin import ToolProvider
from dify_plugin.errors.tool import ToolProviderCredentialValidationError


class PdfProvider(ToolProvider):
    def _validate_credentials(self, credentials: dict[str, Any]) -> None:
        try:
            """
            IMPLEMENT YOUR VALIDATION HERE
            """
        except Exception as e:
            raise ToolProviderCredentialValidationError(str(e))

    #########################################################################################
    # If OAuth is supported, uncomment the following functions.
    # Warning: please make sure that the sdk version is 0.4.2 or higher.
    #########################################################################################
    # def _oauth_get_authorization_url(self, redirect_uri: str, system_credentials: Mapping[str, Any]) -> str:
    #     """
    #     Generate the authorization URL for pdf OAuth.
    #     """
    #     try:
    #         """
    #         IMPLEMENT YOUR AUTHORIZATION URL GENERATION HERE
    #         """
    #     except Exception as e:
    #         raise ToolProviderOAuthError(str(e))
    #     return ""

    # def _oauth_get_credentials(
    #     self, redirect_uri: str, system_credentials: Mapping[str, Any], request: Request
    # ) -> Mapping[str, Any]:
    #     """
    #     Exchange code for access_token.
    #     """
    #     try:
    #         """
    #         IMPLEMENT YOUR CREDENTIALS EXCHANGE HERE
    #         """
    #     except Exception as e:
    #         raise ToolProviderOAuthError(str(e))
    #     return dict()

    # def _oauth_refresh_credentials(
    #     self, redirect_uri: str, system_credentials: Mapping[str, Any], credentials: Mapping[str, Any]
    # ) -> OAuthCredentials:
    #     """
    #     Refresh the credentials
    #     """
    #     return OAuthCredentials(credentials=credentials, expires_at=-1)

View File

@@ -1,21 +0,0 @@
identity:
  author: "yslg"
  name: "pdf"
  label:
    en_US: "pdf"
    zh_Hans: "pdf"
    pt_BR: "pdf"
    ja_JP: "pdf"
  description:
    en_US: "pdfTools"
    zh_Hans: "pdfTools"
    pt_BR: "pdfTools"
    ja_JP: "pdfTools"
  icon: "icon.svg"
tools:
  - tools/pdf_toc.yaml
  - tools/pdf_to_markdown.yaml
extra:
  python:
    source: provider/pdf.py

View File

@@ -1,2 +0,0 @@
dify_plugin>=0.4.0,<0.7.0
pymupdf>=1.27.1

View File

@@ -1,234 +0,0 @@
import json
import re
from collections.abc import Generator
from typing import Any
import fitz
from dify_plugin import Tool
from dify_plugin.entities.tool import ToolInvokeMessage
class PdfToMarkdownTool(Tool):
"""Convert PDF to Markdown using an external catalog array."""
def _invoke(self, tool_parameters: dict[str, Any]) -> Generator[ToolInvokeMessage]:
file = tool_parameters.get("file")
catalog_text = (tool_parameters.get("catalog") or "").strip()
if not file:
yield self.create_text_message("Error: file is required")
return
if not catalog_text:
yield self.create_text_message("Error: catalog is required")
return
catalog = self._parse_catalog(catalog_text)
if not catalog:
yield self.create_text_message("Error: catalog must be a JSON array with title and page indexes")
return
doc = fitz.open(stream=file.blob, filetype="pdf")
try:
num_pages = len(doc)
hf_texts = self._detect_headers_footers(doc, num_pages)
page_mds = [self._page_to_markdown(doc[index], hf_texts) for index in range(num_pages)]
final_md = self._assemble_by_catalog(catalog, page_mds, num_pages)
yield self.create_text_message(final_md)
yield self.create_blob_message(
blob=final_md.encode("utf-8"),
meta={"mime_type": "text/markdown"},
)
finally:
doc.close()
def _parse_catalog(self, catalog_text: str) -> list[dict[str, Any]]:
try:
raw = json.loads(catalog_text)
except Exception:
return []
if not isinstance(raw, list):
return []
result: list[dict[str, Any]] = []
for item in raw:
if not isinstance(item, dict):
continue
title = str(item.get("title") or "").strip() or "Untitled"
start_index = self._to_int(item.get("page_start_index"), None)
end_index = self._to_int(item.get("page_end_index"), start_index)
if start_index is None:
start = self._to_int(item.get("start"), None)
end = self._to_int(item.get("end"), start)
if start is None:
continue
start_index = max(0, start - 1)
end_index = max(start_index, (end if end is not None else start) - 1)
if end_index is None:
end_index = start_index
result.append(
{
"title": title,
"page_start_index": max(0, start_index),
"page_end_index": max(start_index, end_index),
}
)
return result
def _detect_headers_footers(self, doc: fitz.Document, num_pages: int) -> set[str]:
margin_ratio = 0.08
sample_count = min(num_pages, 30)
text_counts: dict[str, int] = {}
for idx in range(sample_count):
page = doc[idx]
page_height = page.rect.height
top_limit = page_height * margin_ratio
bottom_limit = page_height * (1 - margin_ratio)
try:
blocks = page.get_text("blocks", sort=True) or []
except Exception:
continue
seen: set[str] = set()
for block in blocks:
if len(block) < 7 or block[6] != 0:
continue
y0, y1 = block[1], block[3]
text = (block[4] or "").strip()
if not text or len(text) < 2 or text in seen:
continue
if y1 <= top_limit or y0 >= bottom_limit:
seen.add(text)
text_counts[text] = text_counts.get(text, 0) + 1
threshold = max(3, sample_count * 0.35)
return {text for text, count in text_counts.items() if count >= threshold}
def _page_to_markdown(self, page: fitz.Page, hf_texts: set[str]) -> str:
parts: list[str] = []
page_height = page.rect.height
top_margin = page_height * 0.06
bottom_margin = page_height * 0.94
table_rects: list[fitz.Rect] = []
table_mds: list[str] = []
try:
find_tables = getattr(page, "find_tables", None)
tables = []
if callable(find_tables):
table_finder = find_tables()
tables = getattr(table_finder, "tables", []) or []
for table in tables[:5]:
try:
table_rects.append(fitz.Rect(table.bbox))
except Exception:
pass
cells = table.extract() or []
if len(cells) < 2:
continue
if hf_texts and len(cells) <= 3:
flat = " ".join(str(cell or "") for row in cells for cell in row)
if any(hf in flat for hf in hf_texts):
continue
md_table = self._cells_to_md_table(cells)
if md_table:
table_mds.append(md_table)
except Exception:
pass
try:
blocks = page.get_text("blocks", sort=True) or []
except Exception:
blocks = []
for block in blocks:
if len(block) < 7 or block[6] != 0:
continue
x0, y0, x1, y1 = block[:4]
text = (block[4] or "").strip()
if not text:
continue
block_rect = fitz.Rect(x0, y0, x1, y1)
if any(self._rects_overlap(block_rect, table_rect) for table_rect in table_rects):
continue
if hf_texts and (y1 <= top_margin or y0 >= bottom_margin):
if any(hf in text for hf in hf_texts):
continue
if re.fullmatch(r"\s*\d{1,4}\s*", text):
continue
parts.append(text)
parts.extend(table_mds)
return "\n\n".join(parts)
def _assemble_by_catalog(self, catalog: list[dict[str, Any]], page_mds: list[str], num_pages: int) -> str:
parts: list[str] = []
used_pages: set[int] = set()
for item in catalog:
start = max(0, min(int(item["page_start_index"]), num_pages - 1))
end = max(start, min(int(item["page_end_index"]), num_pages - 1))
chapter_parts = [f"# {item['title']}\n"]
for idx in range(start, end + 1):
if idx < len(page_mds) and page_mds[idx].strip() and idx not in used_pages:
chapter_parts.append(page_mds[idx])
used_pages.add(idx)
if len(chapter_parts) > 1:
parts.append("\n\n".join(chapter_parts))
if parts:
return "\n\n---\n\n".join(parts)
return "\n\n---\n\n".join(m for m in page_mds if m.strip())
@staticmethod
def _rects_overlap(block_rect: fitz.Rect, table_rect: fitz.Rect) -> bool:
inter = block_rect & table_rect
if inter.is_empty:
return False
block_area = block_rect.width * block_rect.height
if block_area <= 0:
return False
return (inter.width * inter.height) / block_area >= 0.3
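`_rects_overlap` treats a text block as table content when at least 30% of the block's area falls inside a detected table rectangle. The same test with plain tuples, no PyMuPDF required (a minimal sketch; the rectangles are hypothetical):

```python
def overlap_ratio(block, table):
    """Fraction of `block` covered by `table`; rects are (x0, y0, x1, y1)."""
    ix0, iy0 = max(block[0], table[0]), max(block[1], table[1])
    ix1, iy1 = min(block[2], table[2]), min(block[3], table[3])
    if ix0 >= ix1 or iy0 >= iy1:
        return 0.0  # no intersection
    block_area = (block[2] - block[0]) * (block[3] - block[1])
    if block_area <= 0:
        return 0.0
    return ((ix1 - ix0) * (iy1 - iy0)) / block_area

# Half of this block sits inside the table, which clears the 0.3 cutoff.
ratio = overlap_ratio((0, 0, 10, 10), (5, 0, 20, 10))
```

The ratio is measured against the block's own area, not the table's, so even a small caption fully inside a large table is dropped.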
@staticmethod
def _cells_to_md_table(cells: list) -> str:
if not cells:
return ""
header = cells[0]
ncols = len(header)
if ncols == 0:
return ""
def clean(value: Any) -> str:
return str(value or "").replace("|", "\\|").replace("\n", " ").strip()
lines = [
"| " + " | ".join(clean(cell) for cell in header) + " |",
"| " + " | ".join("---" for _ in range(ncols)) + " |",
]
for row in cells[1:]:
padded = list(row) + [""] * max(0, ncols - len(row))
lines.append("| " + " | ".join(clean(cell) for cell in padded[:ncols]) + " |")
return "\n".join(lines)
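`_cells_to_md_table` escapes pipes, flattens embedded newlines, and pads short rows out to the header width. A standalone run on a small extract (the cell data is hypothetical):

```python
def cells_to_md_table(cells):
    """Standalone copy of the table renderer: escape pipes, flatten newlines, pad rows."""
    if not cells or not cells[0]:
        return ""
    ncols = len(cells[0])

    def clean(value):
        return str(value or "").replace("|", "\\|").replace("\n", " ").strip()

    lines = [
        "| " + " | ".join(clean(cell) for cell in cells[0]) + " |",
        "| " + " | ".join("---" for _ in range(ncols)) + " |",
    ]
    for row in cells[1:]:
        padded = list(row) + [""] * max(0, ncols - len(row))
        lines.append("| " + " | ".join(clean(cell) for cell in padded[:ncols]) + " |")
    return "\n".join(lines)

md = cells_to_md_table([["Name", "Qty"], ["Bolt M3", 40], ["Nut"]])
```

The short `["Nut"]` row is padded with an empty cell so every emitted row has the header's column count.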
@staticmethod
def _to_int(value: Any, default: int | None) -> int | None:
try:
if value is None or value == "":
return default
return int(value)
except Exception:
return default


@@ -1,51 +0,0 @@
identity:
name: "pdf_to_markdown"
author: "yslg"
label:
en_US: "PDF to Markdown"
zh_Hans: "PDF to Markdown"
pt_BR: "PDF para Markdown"
ja_JP: "PDF to Markdown"
description:
human:
en_US: "Convert PDF to Markdown using a catalog array. Images and graphics are ignored."
zh_Hans: "Convert PDF to Markdown using a catalog array. Images and graphics are ignored."
pt_BR: "Convert PDF to Markdown using a catalog array. Images and graphics are ignored."
ja_JP: "Convert PDF to Markdown using a catalog array. Images and graphics are ignored."
llm: "Convert a PDF file into Markdown using a catalog JSON array. Ignore images and graphics."
parameters:
- name: file
type: file
required: true
label:
en_US: PDF File
zh_Hans: PDF File
pt_BR: PDF File
ja_JP: PDF File
human_description:
en_US: "PDF file to convert"
zh_Hans: "PDF file to convert"
pt_BR: "PDF file to convert"
ja_JP: "PDF file to convert"
llm_description: "PDF file to convert to Markdown"
form: llm
fileTypes:
- "pdf"
- name: catalog
type: string
required: true
label:
en_US: Catalog JSON
zh_Hans: Catalog JSON
pt_BR: Catalog JSON
ja_JP: Catalog JSON
human_description:
en_US: "Catalog JSON array like [{title,start,end,page_start_index,page_end_index}]"
zh_Hans: "Catalog JSON array like [{title,start,end,page_start_index,page_end_index}]"
pt_BR: "Catalog JSON array like [{title,start,end,page_start_index,page_end_index}]"
ja_JP: "Catalog JSON array like [{title,start,end,page_start_index,page_end_index}]"
llm_description: "Catalog JSON array returned by pdf_toc"
form: llm
extra:
python:
source: tools/pdf_to_markdown.py


@@ -1,312 +0,0 @@
import json
import re
from collections import OrderedDict
from collections.abc import Generator
from typing import Any
import fitz
from dify_plugin import Tool
from dify_plugin.entities.model.llm import LLMModelConfig
from dify_plugin.entities.model.message import SystemPromptMessage, UserPromptMessage
from dify_plugin.entities.tool import ToolInvokeMessage
# Prompt deliberately kept in Chinese: the tool primarily targets Chinese-language PDFs.
_TOC_SYSTEM_PROMPT = """你是专业的PDF目录解析助手。请从以下PDF文本中提取文档的目录/章节结构。
要求:
1. 识别所有一级和二级标题及其对应的页码
2. 只返回纯JSON数组不要markdown代码块不要任何解释
3. 格式: [{"title": "章节标题", "page": 页码数字}]
4. 页码必须是文档中标注的实际页码数字
5. 如果无法识别目录,返回空数组 []"""
class PdfTocTool(Tool):
    _TOC_PATTERNS = [
        r"目录",
        r"目\s*录",  # "目 录" with ordinary whitespace between the characters
        r"目\u3000录",  # "目　录" separated by an ideographic space
        r"Table of Contents",
        r"Contents",
        r"目次",
    ]
def _invoke(self, tool_parameters: dict[str, Any]) -> Generator[ToolInvokeMessage]:
file = tool_parameters.get("file")
if not file:
yield self.create_text_message("Error: file is required")
return
model_config = tool_parameters.get("model")
doc = fitz.open(stream=file.blob, filetype="pdf")
try:
num_pages = len(doc)
            # 1) Prefer the TOC embedded in the PDF metadata
catalog = self._catalog_from_metadata(doc.get_toc(), num_pages)
            # 2) Fall back to LLM parsing when the metadata has no TOC
if not catalog and model_config:
catalog = self._extract_toc_with_llm(doc, num_pages, model_config)
            # 3) Without an LLM config, fall back to regex parsing of the printed TOC pages
if not catalog:
toc_start, toc_end = self._find_toc_pages(doc, num_pages)
if toc_start is not None and toc_end is not None:
toc_text = "\n".join(
doc[index].get_text() or "" for index in range(toc_start, toc_end + 1)
)
printed_catalog = self._parse_toc_lines(toc_text)
catalog = self._attach_page_indexes(printed_catalog, toc_end, num_pages)
if not catalog:
catalog = []
yield self.create_text_message(json.dumps(catalog, ensure_ascii=False))
finally:
doc.close()
def _extract_toc_with_llm(
self, doc: fitz.Document, num_pages: int, model_config: dict[str, Any]
) -> list[dict[str, int | str]]:
        # First, try to locate the printed TOC pages
toc_start, toc_end = self._find_toc_pages(doc, num_pages)
if toc_start is not None and toc_end is not None:
            # TOC pages found: extract their text
toc_text = "\n".join(
doc[index].get_text() or "" for index in range(toc_start, toc_end + 1)
)
content_offset = toc_end
else:
            # No printed TOC found: sample the first 15 pages and let the LLM infer the chapter structure.
            # Note: the original joined pages with a literal "--- 第{}页 ---" separator, leaving the
            # placeholder unformatted; each page marker now carries its actual 1-based page number.
            sample = min(num_pages, 15)
            toc_text = "".join(
                f"\n\n--- 第{index + 1}页 ---\n{doc[index].get_text() or ''}"
                for index in range(sample)
            ).strip()
            if not toc_text:
                return []
            content_offset = 0
        # Truncate overly long text to keep the prompt bounded
if len(toc_text) > 15000:
toc_text = toc_text[:15000] + "\n...[截断]"
try:
response = self.session.model.llm.invoke(
model_config=LLMModelConfig(**model_config),
prompt_messages=[
SystemPromptMessage(content=_TOC_SYSTEM_PROMPT),
UserPromptMessage(content=toc_text),
],
stream=False,
)
llm_text = self._get_response_text(response)
if not llm_text:
return []
raw_catalog = self._parse_llm_json(llm_text)
if not raw_catalog:
return []
            # Convert the LLM's simple {title, page} entries into a full catalog
return self._build_catalog_from_llm(raw_catalog, content_offset, num_pages)
except Exception:
return []
def _build_catalog_from_llm(
self, raw: list[dict], content_offset: int, num_pages: int
) -> list[dict[str, int | str]]:
entries: list[tuple[str, int]] = []
for item in raw:
title = str(item.get("title") or "").strip()
page = self._to_int(item.get("page"), None)
if not title or page is None:
continue
entries.append((title, page))
if not entries:
return []
        # Offset: difference between the first entry's printed page number and the first physical content page
first_printed_page = entries[0][1]
offset = (content_offset + 1) - first_printed_page if content_offset > 0 else 0
result: list[dict[str, int | str]] = []
for i, (title, page) in enumerate(entries):
next_page = entries[i + 1][1] if i + 1 < len(entries) else page
page_start_index = max(0, min(page + offset - 1, num_pages - 1))
page_end_index = max(page_start_index, min(next_page + offset - 2, num_pages - 1))
if i == len(entries) - 1:
page_end_index = num_pages - 1
result.append({
"title": title,
"start": page,
"end": max(page, next_page - 1) if i + 1 < len(entries) else page,
"page_start_index": page_start_index,
"page_end_index": page_end_index,
})
return result
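The arithmetic in `_build_catalog_from_llm` maps printed page numbers to 0-based physical indexes by anchoring the first TOC entry just past the TOC pages. A standalone sketch of that mapping (the helper name is illustrative; the numbers are hypothetical):

```python
def printed_to_index(printed_page, content_offset, first_printed_page, num_pages):
    """Mirror of the offset rule above: clamp a printed page number to a 0-based index."""
    offset = (content_offset + 1) - first_printed_page if content_offset > 0 else 0
    return max(0, min(printed_page + offset - 1, num_pages - 1))

# TOC ends at physical index 4 and chapter pages are printed starting at 1,
# so printed page 3 shifts by (4 + 1) - 1 = 4 and lands on index 3 + 4 - 1 = 6.
idx = printed_to_index(3, content_offset=4, first_printed_page=1, num_pages=100)
```

When no TOC pages were found (`content_offset == 0`), the offset is zero and printed page N simply becomes index N - 1; out-of-range results are clamped into the document.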
@staticmethod
def _get_response_text(response: Any) -> str:
if not hasattr(response, "message") or not response.message:
return ""
content = response.message.content
if isinstance(content, str):
text = content
elif isinstance(content, list):
text = "".join(
item.data if hasattr(item, "data") else str(item) for item in content
)
else:
text = str(content)
        # Strip reasoning tags such as <think>...</think>
text = re.sub(r"<think>[\s\S]*?</think>", "", text, flags=re.IGNORECASE)
text = re.sub(r"<\|[^>]+\|>", "", text)
return text.strip()
@staticmethod
def _parse_llm_json(text: str) -> list[dict]:
        # Try a fenced JSON code block first
code_match = re.search(r"```(?:json)?\s*([\s\S]*?)```", text)
if code_match:
text = code_match.group(1).strip()
        # Otherwise grab the outermost JSON array
bracket_match = re.search(r"\[[\s\S]*\]", text)
if bracket_match:
text = bracket_match.group(0)
try:
result = json.loads(text)
if isinstance(result, list):
return result
except Exception:
pass
return []
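`_parse_llm_json` peels off a fenced JSON code block first, then falls back to the outermost bracketed array, and finally returns `[]` on anything unparseable. A self-contained sketch of the same recovery order (the sample replies are hypothetical):

```python
import json
import re

FENCE = "`" * 3  # built at runtime to avoid writing a literal code fence in this example

def parse_llm_json(text):
    """Extract a JSON array from raw LLM output, tolerating fences and chatter."""
    fenced = re.search(FENCE + r"(?:json)?\s*([\s\S]*?)" + FENCE, text)
    if fenced:
        text = fenced.group(1).strip()
    bracketed = re.search(r"\[[\s\S]*\]", text)
    if bracketed:
        text = bracketed.group(0)
    try:
        result = json.loads(text)
        return result if isinstance(result, list) else []
    except ValueError:
        return []

reply = "Sure, here is the TOC:\n" + FENCE + "json\n[{\"title\": \"Intro\", \"page\": 1}]\n" + FENCE
toc = parse_llm_json(reply)
```

Each stage narrows `text` rather than failing outright, so a reply with both chatter and a fence still parses.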
def _catalog_from_metadata(self, toc: list, num_pages: int) -> list[dict[str, int | str]]:
top = [(title, max(0, page - 1)) for level, title, page in toc if level <= 2 and page >= 1]
if not top:
return []
result: list[dict[str, int | str]] = []
for index, (title, start_index) in enumerate(top):
end_index = top[index + 1][1] - 1 if index + 1 < len(top) else num_pages - 1
result.append({
"title": title,
"start": start_index + 1,
"end": max(start_index, end_index) + 1,
"page_start_index": start_index,
"page_end_index": max(start_index, end_index),
})
return result
def _find_toc_pages(self, doc: fitz.Document, num_pages: int) -> tuple[int | None, int | None]:
toc_start = None
toc_end = None
for page_number in range(min(num_pages, 30)):
text = doc[page_number].get_text() or ""
if any(re.search(pattern, text, re.IGNORECASE) for pattern in self._TOC_PATTERNS):
if toc_start is None:
toc_start = page_number
toc_end = page_number
elif toc_start is not None:
break
return toc_start, toc_end
def _parse_toc_lines(self, text: str) -> list[dict[str, int | str]]:
marker = re.search(
r"^(List\s+of\s+Figures|List\s+of\s+Tables|图目录|表目录)",
text,
re.IGNORECASE | re.MULTILINE,
)
if marker:
text = text[: marker.start()]
pattern = re.compile(r"^\s*(?P<title>.+?)\s*(?:\.{2,}|\s)\s*(?P<page>\d{1,5})\s*$")
entries: list[tuple[str, int]] = []
for raw in text.splitlines():
line = raw.strip()
if not line or len(line) < 3 or re.fullmatch(r"\d+", line):
continue
match = pattern.match(line)
if not match:
continue
title = re.sub(r"\s+", " ", match.group("title")).strip("-_:")
page = self._to_int(match.group("page"), None)
if not title or page is None or len(title) <= 1:
continue
if title.lower() in {"page", "pages", "目录", "contents"}:
continue
entries.append((title, page))
if not entries:
return []
dedup: OrderedDict[str, int] = OrderedDict()
for title, page in entries:
dedup.setdefault(title, page)
titles = list(dedup.keys())
pages = [dedup[title] for title in titles]
result: list[dict[str, int | str]] = []
for index, title in enumerate(titles):
start = pages[index]
end = max(start, pages[index + 1] - 1) if index + 1 < len(pages) else start
result.append({"title": title, "start": start, "end": end})
return result
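The dotted-leader pattern above accepts lines like `Chapter 1 Introduction ........ 5`, while the surrounding filters reject bare page numbers and one-character titles. Which lines survive can be checked with a trimmed-down version (sample lines are hypothetical):

```python
import re

# Same pattern as _parse_toc_lines: non-greedy title, dot leaders or a space, page number.
pattern = re.compile(r"^\s*(?P<title>.+?)\s*(?:\.{2,}|\s)\s*(?P<page>\d{1,5})\s*$")

def parse_line(line):
    match = pattern.match(line.strip())
    if not match:
        return None
    title = re.sub(r"\s+", " ", match.group("title")).strip("-_:")
    return (title, int(match.group("page"))) if len(title) > 1 else None

entries = [parse_line(s) for s in [
    "Chapter 1 Introduction ........ 5",
    "2.3 Results 41",
    "17",  # a bare page number has no title part, so the pattern rejects it
]]
```

The non-greedy title group backtracks until the remainder is pure digits at end-of-line, which is why `2.3 Results 41` splits at the last space rather than at the inner dot.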
def _attach_page_indexes(
self, catalog: list[dict[str, int | str]], toc_end: int, num_pages: int
) -> list[dict[str, int | str]]:
if not catalog:
return []
first_page = None
for item in catalog:
start = self._to_int(item.get("start"), None)
if start is not None and (first_page is None or start < first_page):
first_page = start
if first_page is None:
return []
offset = (toc_end + 1) - first_page
result: list[dict[str, int | str]] = []
for item in catalog:
start = self._to_int(item.get("start"), None)
end = self._to_int(item.get("end"), start)
if start is None:
continue
if end is None:
end = start
page_start_index = max(0, min(start + offset, num_pages - 1))
page_end_index = max(page_start_index, min(end + offset, num_pages - 1))
result.append({
"title": str(item.get("title") or "Untitled"),
"start": start,
"end": max(start, end),
"page_start_index": page_start_index,
"page_end_index": page_end_index,
})
return result
@staticmethod
def _to_int(value: Any, default: int | None) -> int | None:
try:
if value is None or value == "":
return default
return int(value)
except Exception:
return default


@@ -1,51 +0,0 @@
identity:
name: "pdf_toc"
author: "yslg"
label:
en_US: "PDF TOC"
zh_Hans: "PDF 目录提取"
pt_BR: "PDF TOC"
ja_JP: "PDF TOC"
description:
human:
en_US: "Extract the catalog array from a PDF file using metadata or LLM."
zh_Hans: "从PDF文件中提取目录数组优先使用元数据回退使用LLM解析。"
pt_BR: "Extrair o array de catálogo de um arquivo PDF."
ja_JP: "PDFファイルからカタログ配列を抽出する。"
llm: "Extract a catalog array from a PDF file. Returns JSON text like [{title,start,end,page_start_index,page_end_index}]."
parameters:
- name: file
type: file
required: true
label:
en_US: PDF File
zh_Hans: PDF 文件
pt_BR: PDF File
ja_JP: PDF File
human_description:
en_US: "PDF file to inspect"
zh_Hans: "要解析的PDF文件"
pt_BR: "PDF file to inspect"
ja_JP: "PDF file to inspect"
llm_description: "PDF file to extract catalog from"
form: llm
fileTypes:
- "pdf"
- name: model
type: model-selector
scope: llm
required: true
label:
en_US: LLM Model
zh_Hans: LLM 模型
pt_BR: Modelo LLM
ja_JP: LLMモデル
human_description:
en_US: "LLM model used for parsing TOC when metadata is unavailable"
zh_Hans: "当元数据不可用时用于解析目录的LLM模型"
pt_BR: "Modelo LLM para análise de TOC"
ja_JP: "メタデータが利用できない場合のTOC解析用LLMモデル"
form: form
extra:
python:
source: tools/pdf_toc.py

File diff suppressed because it is too large


@@ -1,122 +0,0 @@
# Dify Plugin Service Requirements
## 1. Project Overview
Develop a Dify plugin service based on the FastAPI framework that integrates with the Dify platform, supports deploying and managing multiple plugins, and provides assorted feature extensions.
## 2. Technology Stack
- **Framework**: FastAPI
- **Language**: Python 3.9+
- **Dependency management**: Poetry or Pip
- **Deployment**: Docker containers
## 3. Architecture
### 3.1 Design
- **Plugin management system**: manage multiple Dify plugins centrally
- **Plugin loading**: dynamic loading and hot reloading of plugins
- **Plugin isolation**: each plugin runs in its own environment
- **API gateway**: a single API entry point that routes requests to the matching plugin
### 3.2 Directory Layout
```
difyPlugin/
├── main.py              # Application entry point
├── requirements.txt     # Dependencies
├── .env                 # Environment configuration
├── app/
│   ├── api/             # API routes
│   ├── core/            # Core configuration
│   ├── plugins/         # Plugin directory
│   │   ├── plugin1/     # Plugin 1
│   │   ├── plugin2/     # Plugin 2
│   │   └── __init__.py  # Plugin loader
│   └── services/        # Shared services
└── tests/               # Tests
```
### 3.3 Plugin Specification
- **Plugin structure**: each plugin bundles its own configuration, logic, and API
- **Plugin interface**: a uniform plugin interface contract
- **Plugin registration**: automatic discovery and registration of plugins
- **Plugin lifecycle**: plugins can be started, stopped, and restarted
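The registration and lifecycle requirements above can be sketched as a minimal in-process registry. This is an illustrative design sketch, not part of any Dify API; all names are hypothetical:

```python
class PluginRegistry:
    """Minimal sketch: registration plus a start/stop lifecycle gate around execution."""

    def __init__(self):
        self._plugins = {}    # plugin_id -> handler callable
        self._running = set()  # ids of started plugins

    def register(self, plugin_id, handler):
        if plugin_id in self._plugins:
            raise ValueError(f"duplicate plugin id: {plugin_id}")
        self._plugins[plugin_id] = handler

    def start(self, plugin_id):
        if plugin_id not in self._plugins:
            raise KeyError(plugin_id)
        self._running.add(plugin_id)

    def stop(self, plugin_id):
        self._running.discard(plugin_id)

    def execute(self, plugin_id, payload):
        if plugin_id not in self._running:
            raise RuntimeError(f"plugin {plugin_id} is not running")
        return self._plugins[plugin_id](payload)

registry = PluginRegistry()
registry.register("echo", lambda payload: payload)
registry.start("echo")
result = registry.execute("echo", {"msg": "hi"})
```

Gating `execute` on the running set is what makes stop/restart meaningful: a registered-but-stopped plugin stays discoverable without being callable.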
## 4. Core Features
### 4.1 Basics
- **Health check**: a service status endpoint
- **Version management**: plugin version control
- **Authentication**: secure authentication against Dify
- **Plugin management**: register, start, stop, and uninstall plugins
### 4.2 Business Features
- **Data processing**: conversion and processing of various data formats
- **External API integration**: connect to third-party service APIs
- **Custom logic**: support for user-defined business logic
- **Event handling**: respond to events triggered by the Dify platform
## 5. API Design
### 5.1 Main Endpoints
- `GET /health`: health check
- `GET /api/v1/plugins`: list plugins
- `GET /api/v1/plugins/{plugin_id}`: plugin details
- `POST /api/v1/plugins/{plugin_id}/execute`: execute a plugin
- `GET /api/v1/plugins/{plugin_id}/metadata`: plugin metadata
- `POST /api/v1/plugins/{plugin_id}/start`: start a plugin
- `POST /api/v1/plugins/{plugin_id}/stop`: stop a plugin
### 5.2 Request/Response Format
- **Request format**: JSON
- **Response format**: JSON, carrying a status code and the payload
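A uniform response envelope of the kind required here can be produced by a tiny helper. The field names (`code`, `message`, `data`) are illustrative assumptions, not mandated by Dify:

```python
import json

def make_response(data=None, code=0, message="ok"):
    """Uniform JSON envelope: a status code, a human-readable message, and the payload."""
    return {"code": code, "message": message, "data": data}

body = json.dumps(
    make_response({"plugins": ["pdf_toc", "pdf_to_markdown"]}),
    ensure_ascii=False,
)
```

Keeping every endpoint on one envelope lets clients branch on `code` without inspecting per-endpoint shapes.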
## 6. Deployment Requirements
- **Environment variables**: service parameters configurable through environment variables
- **Logging**: structured logging
- **Metrics**: a Prometheus metrics endpoint
- **Error handling**: thorough error handling and exception capture
- **Plugin isolation**: plugins can be deployed and isolated independently
## 7. Integration
- **Dify plugin registration**: register according to the Dify plugin specification
- **Webhook configuration**: support Webhook callbacks from the Dify platform
- **Event subscription**: subscribe to Dify platform events
- **Plugin discovery**: automatically discover and register new plugins
## 8. Development Plan
### 8.1 Phase 1: Project Setup
- Create the FastAPI project skeleton
- Configure dependency management
- Implement the plugin management system
### 8.2 Phase 2: Core Development
- Implement the plugin loading mechanism
- Define the plugin interface contract
- Implement data processing
- Integrate external APIs
### 8.3 Phase 3: Testing and Deployment
- Write unit tests
- Integration testing
- Containerized deployment
- Develop sample plugins
## 9. Engineering Requirements
- **Code quality**: follow the PEP 8 style guide
- **Documentation**: complete API documentation
- **Performance**: optimize response time and resource usage
- **Security**: secure authentication and authorization
- **Extensibility**: plugins can be added and removed dynamically
## 10. Deliverables
- **Source code**: the complete project code
- **Deployment guide**: detailed deployment steps
- **API documentation**: auto-generated API docs
- **Test report**: test results and coverage
- **Plugin development guide**: how to develop and register plugins

Binary file not shown.