Feature Request: Add OCR Backend Support for Local Document Processing #64

## Summary
This issue proposes adding an OCR (Optical Character Recognition) backend so that Docker Model Runner can extract text from documents locally.

## Motivation
- Expand Docker Model Runner beyond text generation to include vision/document processing
- Enable privacy-focused local OCR without cloud dependencies
- Leverage existing model distribution and scheduling infrastructure

## Proposed Implementation
1. Create new OCR backend following existing patterns in `pkg/inference/backends/`
2. Integrate with popular document AI models, e.g., LayoutLMv3 and Donut
3. Support common input formats (PNG and JPEG images, PDF documents)
4. Expose OCR functionality through OpenAI-compatible API endpoints

## Technical Considerations
- Follow existing backend interface in `pkg/inference/backends/llamacpp/llamacpp.go`
- Leverage model distribution system for OCR model downloads
- Integrate with resource management for memory allocation
- Support both CPU and GPU acceleration where available

## Questions for Maintainers
- Preferred document AI models?
- API endpoint design preferences?
- Model packaging/distribution strategy?

## Comment
It would help my work a great deal to be able to easily test document AI and OCR models locally!
