OCR-PDF Project (AI Vision)

Overview

The OCR-PDF project is designed to extract text from PDF documents using Optical Character Recognition (OCR) techniques. This project leverages various OCR models to provide accurate and efficient text extraction.

Features

- Support for multiple OCR models, including:
  - Gemini
  - OpenAI
  - Qwen2.5-VL (local model)
- Ability to convert extracted text into Word documents.
- Easy integration with existing workflows.
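
One way to picture the multi-model support is a small registry that maps a model name to a callable. This is only a sketch: the names `register_backend`, `run_ocr`, and `_echo_backend` are hypothetical and are not taken from this repository's code, and the placeholder backend stands in for a real model call.

```python
from typing import Callable, Dict

# Hypothetical backend signature: takes the raw bytes of one page image
# and returns the recognized text.
OcrBackend = Callable[[bytes], str]

_BACKENDS: Dict[str, OcrBackend] = {}


def register_backend(name: str, backend: OcrBackend) -> None:
    """Register an OCR backend under a case-insensitive name."""
    _BACKENDS[name.lower()] = backend


def run_ocr(name: str, page_image: bytes) -> str:
    """Run the named backend on one page image."""
    try:
        backend = _BACKENDS[name.lower()]
    except KeyError:
        known = ", ".join(sorted(_BACKENDS)) or "none"
        raise ValueError(f"unknown OCR backend {name!r}; registered: {known}") from None
    return backend(page_image)


# Placeholder backend (assumption): a real one would call Gemini, OpenAI, or Qwen.
def _echo_backend(page_image: bytes) -> str:
    return f"<{len(page_image)} bytes of image data>"


register_backend("qwen", _echo_backend)
```

With this shape, switching models in an existing workflow would mean changing only the name passed to `run_ocr`.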

Performance comparison

After testing the models, the results are as follows:

Model Name	Accuracy	Speed
Gemini	Low	Fast
OpenAI	Low	Fast
Qwen (local model)	High	Very Slow

(Tested on an NVIDIA RTX 4080 GPU.)
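
The README does not say how accuracy and speed were measured. As a rough sketch of one way to run a similar comparison, the helper below times a single OCR call and scores its output against a hand-checked reference transcription using the standard library's `difflib`. The names `evaluate` and `fake_ocr` are hypothetical, and the similarity ratio is an assumed stand-in for "accuracy", not the metric behind the results above.

```python
import time
from difflib import SequenceMatcher
from typing import Callable, Tuple


def evaluate(
    ocr: Callable[[bytes], str], page_image: bytes, reference: str
) -> Tuple[float, float]:
    """Return (similarity, seconds) for one OCR call.

    similarity is difflib's ratio between the OCR output and the
    reference transcription, in the range 0.0 to 1.0.
    """
    start = time.perf_counter()
    text = ocr(page_image)
    elapsed = time.perf_counter() - start
    similarity = SequenceMatcher(None, text, reference).ratio()
    return similarity, elapsed


# Stand-in for a real model call, used only to show the interface.
def fake_ocr(page_image: bytes) -> str:
    return "hello world"


similarity, seconds = evaluate(fake_ocr, b"", "hello world")
print(f"similarity={similarity:.2f}, seconds={seconds:.4f}")
```

Averaging both values over several pages per model would give more stable numbers than a single call.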

Installation

No requirements.txt file is provided, because each model requires different dependencies. Install the dependencies for the model you plan to use yourself. You can use conda to create a new environment for this; Python 3.11 is recommended. Note that if you want to use a GPU, you need to install the PyTorch build that matches your CUDA version (https://pytorch.org/get-started/locally/).
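
After setting up an environment, a quick check like the following can confirm whether PyTorch is installed and can see a CUDA device. It is a minimal sketch, not part of this repository; the function name `cuda_status` is hypothetical, and the check only applies to the local model, which is assumed here to run on PyTorch.

```python
import importlib.util


def cuda_status() -> str:
    """Describe whether PyTorch is installed and can see a CUDA device."""
    # Avoid an ImportError when PyTorch has not been installed yet.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch

    if torch.cuda.is_available():
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "PyTorch is installed, but no CUDA device is visible (CPU only)"


print(cuda_status())
```

If the CPU-only message appears on a machine with an NVIDIA GPU, the installed PyTorch build likely does not match the local CUDA setup.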

References

Qwen Full Page OCR

Contributing

Contributions are welcome! Please submit a pull request or open an issue for any enhancements or bug fixes.

License

This project is licensed under the MIT License. See the LICENSE file for more details.

deadlyedge/OCR-PDF-qwen2.5

Folders and files

Latest commit

History

Repository files navigation

OCR-PDF Project(AI Vision)

Overview

Features

Compared porformance

Installation

References

Contributing

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages