Feature/mineru improvements #11938
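What problem does this PR solve?
Summary
This PR adds more configurable options to the MinerU document parser, giving users finer control over PDF parsing behavior and improving support for multilingual documents.
Changes
Backend (`deepdoc/parser/mineru_parser.py`)
- Added configurable parsing options.
- Added a language code mapping.
- Improved the parser's configuration handling so that these options are passed correctly through the processing pipeline.
Frontend (`web/`)
Integration
Why
MinerU is a powerful document parser, but its default settings do not suit every document type. This PR enables users to:
Testing
Type of change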
Hi, @concertdictate Thank you for your contribution. Have you tested all the backends you added here?

```python
class MinerUBackend(StrEnum):
    PIPELINE = "pipeline"  # Traditional multimodel pipeline (default)
    VLM_TRANSFORMERS = "vlm-transformers"  # Vision-language model using HuggingFace Transformers
    VLM_MLX_ENGINE = "vlm-mlx-engine"  # Faster, requires Apple Silicon and macOS 13.5+
    VLM_VLLM_ENGINE = "vlm-vllm-engine"  # Local vLLM engine, requires local GPU
    VLM_VLLM_ASYNC_ENGINE = "vlm-vllm-async-engine"  # Asynchronous vLLM engine, new in MinerU API
    VLM_LMDEPLOY_ENGINE = "vlm-lmdeploy-engine"  # LMDeploy engine
    VLM_HTTP_CLIENT = "vlm-http-client"  # HTTP client for remote vLLM server (CPU only)
```

At the moment,
Cheers.
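For readers following the question about backends, here is a minimal sketch of how a configured backend string might be checked against the enum quoted above. The helper `parse_backend` and its fall-back-to-`pipeline` behavior are assumptions made for illustration, not code from this PR. The enum is redeclared with a `str` mixin so the sketch also runs on Python versions without `enum.StrEnum`.

```python
from enum import Enum
from typing import Optional


class MinerUBackend(str, Enum):
    # Same member values as the enum quoted in the comment above.
    PIPELINE = "pipeline"
    VLM_TRANSFORMERS = "vlm-transformers"
    VLM_MLX_ENGINE = "vlm-mlx-engine"
    VLM_VLLM_ENGINE = "vlm-vllm-engine"
    VLM_VLLM_ASYNC_ENGINE = "vlm-vllm-async-engine"
    VLM_LMDEPLOY_ENGINE = "vlm-lmdeploy-engine"
    VLM_HTTP_CLIENT = "vlm-http-client"


def parse_backend(value: Optional[str]) -> MinerUBackend:
    """Hypothetical helper: map a user-supplied string to a backend.

    Empty or unrecognized values fall back to PIPELINE, which the
    quoted enum marks as the default. This fallback is an assumption.
    """
    if not value:
        return MinerUBackend.PIPELINE
    try:
        return MinerUBackend(value.strip().lower())
    except ValueError:
        return MinerUBackend.PIPELINE


if __name__ == "__main__":
    # Iterating over the enum is one simple way to drive a per-backend smoke test.
    for backend in MinerUBackend:
        print(backend.value, "->", parse_backend(backend.value).name)
```

Silently falling back keeps parsing working when a setting is mistyped; raising an error instead would surface the misconfiguration earlier, so either choice is defensible.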
Let me know if anything needs to be fixed; I’m happy to take care of it.
Hi, @concertdictate I tested the code and observed that MinerU is not actually used during file parsing, since it also needs to be configured for files. Or did I miss anything?
Cheers.
P.S. For your information, we will be refactoring MinerU shortly (potentially today). The goal is to move away from a "batteries-included" local deployment model, as maintaining it has become a burden. Instead, we will only maintain the MinerU-API and the
Update: I used the wrong file to test this feature; that is my bad. However, we still need to offer a place to configure options for files. I will handle it. Thank you!
That would be very appropriate, thank you for your work.



I have repeated this description in Chinese in the comment below.
What problem does this PR solve?
Summary
This PR enhances the MinerU document parser with additional configuration options, giving users more control over PDF parsing behavior and improving support for multilingual documents.
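As a rough illustration of the kind of option handling this PR describes, the sketch below translates a language setting through a lookup table and validates the extraction strategy. Everything here is an assumption made for illustration: the table contents, the function name `build_mineru_options`, and the output keys `lang` and `parse_method` are hypothetical and may not match the actual `LANGUAGE_TO_MINERU_MAP` or the parser's interface.

```python
from typing import Dict, Optional

# Hypothetical entries: the real LANGUAGE_TO_MINERU_MAP in the PR may differ.
LANGUAGE_TO_MINERU_MAP: Dict[str, str] = {
    "english": "en",
    "chinese": "ch",
    "japanese": "japan",
    "korean": "korean",
}

# The three extraction strategies named in the PR description.
PARSE_METHODS = ("auto", "txt", "ocr")


def build_mineru_options(language: Optional[str] = None,
                         parse_method: str = "auto") -> Dict[str, str]:
    """Build an options dict to hand to the parser (illustrative shape only)."""
    if parse_method not in PARSE_METHODS:
        raise ValueError(f"parse_method must be one of {PARSE_METHODS}, got {parse_method!r}")

    options = {"parse_method": parse_method}
    if language:
        code = LANGUAGE_TO_MINERU_MAP.get(language.strip().lower())
        if code is not None:
            # Only pass a language when it is known; otherwise leave the
            # parser's own default in place.
            options["lang"] = code
    return options


if __name__ == "__main__":
    print(build_mineru_options("Chinese", "ocr"))
    print(build_mineru_options("Klingon"))
```

Omitting the language key for unmapped languages, rather than guessing a code, leaves the parser's default behavior untouched for languages the table does not cover.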
Changes
Backend (`deepdoc/parser/mineru_parser.py`)
- Configurable parsing options, including a choice of `auto`, `txt`, or `ocr` that lets users choose the extraction strategy
- A language code mapping (`LANGUAGE_TO_MINERU_MAP`) to translate RAGFlow language settings to MinerU-compatible language codes for better OCR accuracy
Frontend (`web/`)
- A `MinerUOptionsFormField` component that conditionally renders when MinerU is selected as the layout recognition engine
Integration
- `rag/app/naive.py` updated to forward MinerU options to the parser
Why
MinerU is a powerful document parser, but the default settings don't work well for all document types. This PR allows users to:
Testing
Type of change