# Request: Nougat OCR Integration

I suggest adding Nougat OCR support to llama.cpp to enable processing of scientific PDF documents.
This could serve as a first step towards supporting multimodal models in this project!

Implementation:
It seems that Nougat is built on standard transformer architectures (such as BART and Swin Transformer), so most of the work would be figuring out how to add the image processing.
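
To make the image-processing part a bit more concrete, here is a rough sketch of the kind of preprocessing a Swin-style encoder typically needs: normalize the pixel values, then split the page into non-overlapping patches. This is only an illustration under my own assumptions, not Nougat's actual pipeline: the function names `normalize` and `patchify`, the patch size, and the mean/std values are all hypothetical.

```python
# Hypothetical sketch: normalize a grayscale page image and split it into
# non-overlapping patches, the kind of input a Swin-style encoder consumes.
# Patch size, mean, and std are illustrative assumptions, not Nougat's values.

def normalize(image, mean=0.5, std=0.5):
    """Scale 0..255 pixel values to 0..1 floats, then standardize them."""
    return [[(px / 255.0 - mean) / std for px in row] for row in image]


def patchify(image, patch=4):
    """Split an H x W image into flattened patch x patch blocks (row-major)."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image dimensions must be multiples of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one patch x patch block into a single vector.
            block = [image[top + dy][left + dx]
                     for dy in range(patch) for dx in range(patch)]
            patches.append(block)
    return patches


# Toy 8x8 "page" standing in for a rasterized PDF page.
page = [[(x + y) % 256 for x in range(8)] for y in range(8)]
tokens = patchify(normalize(page), patch=4)
```

Each entry of `tokens` is one flattened patch that a patch-embedding layer would then project into the model dimension. A real port would presumably also need PDF rasterization and resizing to the model's input resolution, which this sketch leaves out.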

Let me know what you think!

P.S. Love this repo! I hope to add my own retrieval-pretrained transformer to it at some point.



Request: Nougat OCR Integration #3294
