Language Toolkit API

A REST API for the Language Toolkit providing document processing, translation, transcription, and video creation capabilities.

Features

Advanced PPTX Translation: Translate PowerPoint presentations with full formatting preservation - fonts, colors, styles, typography
Text Translation: Translate text files using DeepL API
Audio Transcription: Convert audio files to text using OpenAI Whisper
PPTX Conversion: Convert PowerPoint files to PDF or PNG images
Text-to-Speech: Generate audio from text files using ElevenLabs
Video Merging: Combine audio and images into videos
Smart Downloads: Single files download directly, multiple files as ZIP
Individual File Downloads: Download specific files from multi-file results
Asynchronous Processing: Handle long-running tasks with progress tracking
File Size Validation: Automatic validation of upload sizes with configurable limits

Installation

Install API-specific dependencies:

pip install -r api_requirements.txt

Configure API keys in a .env file (copy it from .env.example):

OPENAI_API_KEY=your-openai-api-key
DEEPL_API_KEY=your-deepl-api-key
ELEVENLABS_API_KEY=your-elevenlabs-api-key
CONVERTAPI_SECRET=your-convertapi-secret

Configure authentication in the same .env file:

# Client credentials for OAuth2 authentication
CLIENT_ID=your-client-id
CLIENT_SECRET=your-client-secret

# Or for multiple clients:
# CLIENT_ID_1=first-client-id
# CLIENT_SECRET_1=first-client-secret
# CLIENT_ID_2=second-client-id
# CLIENT_SECRET_2=second-client-secret
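
To illustrate the two layouts above, here is a minimal sketch of how the single-client and numbered multi-client variables could be collected into one lookup table. The helper name `load_clients` is hypothetical and this is not necessarily how api_server.py reads its configuration; it only shows the naming convention.

```python
import re


def load_clients(env):
    """Collect client_id -> client_secret pairs from an environment mapping.

    Hypothetical helper: it mirrors the variable naming shown above,
    not the server's actual loading code.
    """
    clients = {}

    # Single-client form: CLIENT_ID / CLIENT_SECRET
    if env.get("CLIENT_ID") and env.get("CLIENT_SECRET"):
        clients[env["CLIENT_ID"]] = env["CLIENT_SECRET"]

    # Multi-client form: CLIENT_ID_<n> paired with CLIENT_SECRET_<n>
    for name, client_id in env.items():
        match = re.fullmatch(r"CLIENT_ID_(\d+)", name)
        if match:
            secret = env.get(f"CLIENT_SECRET_{match.group(1)}")
            if client_id and secret:
                clients[client_id] = secret

    return clients
```

Calling `load_clients(os.environ)` after the .env file has been loaded would return a dictionary keyed by client ID. Entries with a missing secret are skipped rather than raising an error.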

Running the API

Start the server:

python api_server.py

Or with uvicorn directly:

uvicorn api_server:app --host 0.0.0.0 --port 8000 --reload

The API will be available at http://localhost:8000

File Size Limits

The API enforces file size limits to prevent resource exhaustion:

| File Type | Default Limit | Environment Variable |
|---|---|---|
| PPTX files | 50MB | `MAX_PPTX_SIZE` |
| Text files | 10MB | `MAX_TEXT_SIZE` |
| Audio files | 200MB | `MAX_AUDIO_SIZE` |
| General files | 100MB | `MAX_FILE_SIZE` |

Error Response: Files exceeding limits return HTTP 413 (Payload Too Large) with details:

{
  "detail": "File 'large.pptx' is too large (75.2MB). Maximum allowed size for pptx files is 50.0MB."
}

Configuration: Override limits via environment variables:

export MAX_PPTX_SIZE=104857600  # 100MB in bytes
export MAX_AUDIO_SIZE=524288000 # 500MB in bytes
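
As a rough sketch of how these limits could be resolved and checked, the snippet below reads each override from the environment and falls back to the documented default. The helper names `get_limit` and `size_error` are hypothetical, and it assumes 1MB means 1024 * 1024 bytes, which matches `MAX_PPTX_SIZE=104857600` being described as 100MB.

```python
import os

MB = 1024 * 1024  # assumption: binary megabytes, as in the export examples

# Environment variable and default limit (in bytes) for each file type
DEFAULT_LIMITS = {
    "pptx": ("MAX_PPTX_SIZE", 50 * MB),
    "text": ("MAX_TEXT_SIZE", 10 * MB),
    "audio": ("MAX_AUDIO_SIZE", 200 * MB),
    "general": ("MAX_FILE_SIZE", 100 * MB),
}


def get_limit(kind, env=None):
    """Return the size limit in bytes for a file type, honoring overrides."""
    env = os.environ if env is None else env
    var_name, default = DEFAULT_LIMITS[kind]
    return int(env.get(var_name, default))


def size_error(filename, size_bytes, kind, env=None):
    """Return an error message if the file is too large, otherwise None."""
    limit = get_limit(kind, env)
    if size_bytes <= limit:
        return None
    return (
        f"File '{filename}' is too large ({size_bytes / MB:.1f}MB). "
        f"Maximum allowed size for {kind} files is {limit / MB:.1f}MB."
    )
```

A client can run the same check before uploading to avoid a round trip that would end in HTTP 413.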

API Documentation

Interactive API documentation is available at:

Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc

API Endpoints

Core Endpoints

GET / - API information and available endpoints
GET /health - Health check
GET /tasks - List all active tasks
GET /tasks/{task_id} - Get task status
DELETE /tasks/{task_id} - Clean up task and temporary files
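GET /download/{task_id} - Download task results (a single file directly, multiple files as a ZIP)
GET /download/{task_id}/{index} - Download one file from a multi-file result by its 0-based index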

Processing Endpoints

PPTX Translation

POST /translate/pptx

Files: Upload PPTX files
Form Data:
- source_lang: Source language code (e.g., "en")
- target_lang: Target language code (e.g., "fr")

Text Translation

POST /translate/text

Files: Upload TXT files
Form Data:
- source_lang: Source language code
- target_lang: Target language code

Audio Transcription

POST /transcribe/audio

Files: Upload audio files (MP3, WAV, M4A, etc.)

PPTX Conversion

POST /convert/pptx

Files: Upload PPTX files
Form Data:
- output_format: "pdf" or "png"

Text-to-Speech

POST /tts

Files: Upload TXT files (must contain voice name in filename)

Text Translation from S3

POST /translate/text_s3

JSON Body:
- input_keys: Array of S3 object keys for the input TXT files
- output_prefix: (Optional) Destination S3 prefix for translated files
- source_lang: Source language code (e.g., "en")
- target_lang: Target language code (e.g., "fr")

Course Translation from S3

POST /translate/course_s3

JSON Body:
- course_id: Unique identifier for the course
- source_lang: Current language present in S3 folder
- target_langs: Array of target language codes (e.g., ["fr", "it"])
- output_prefix: (Optional) Root prefix where translated course will be written

PPTX Translation from S3

POST /translate/pptx_s3

JSON Body:
- input_keys: Array of S3 object keys for the input PPTX files
- output_prefix: (Optional) Destination S3 prefix for translated files
- source_lang: Source language code (e.g., "en")
- target_lang: Target language code (e.g., "fr")

Audio Transcription from S3

POST /transcribe/audio_s3

JSON Body:
- input_keys: Array of S3 object keys for the input audio files
- output_prefix: (Optional) Destination S3 prefix for transcription results
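
The S3 endpoints that take input_keys share the same overall JSON shape, so one small builder can cover them. The following is a sketch only; `build_s3_request` is a hypothetical helper, and it simply omits optional fields that were not supplied.

```python
def build_s3_request(input_keys, source_lang=None, target_lang=None,
                     output_prefix=None):
    """Build the JSON body for /translate/text_s3, /translate/pptx_s3,
    or /transcribe/audio_s3 (hypothetical client-side helper)."""
    if not input_keys:
        raise ValueError("input_keys must contain at least one S3 object key")

    payload = {"input_keys": list(input_keys)}

    # Only include fields that were actually provided
    optional = {
        "output_prefix": output_prefix,
        "source_lang": source_lang,
        "target_lang": target_lang,
    }
    payload.update({key: value for key, value in optional.items()
                    if value is not None})
    return payload
```

The resulting dictionary can be passed as `json=payload` to `requests.post`, which also sets the Content-Type header. The course endpoint uses different fields (course_id, target_langs) and is not covered by this helper.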

Usage Examples

Using curl

Translate a PPTX file:

curl -X POST "http://localhost:8000/translate/pptx" \
  -H "Authorization: Bearer token_admin_abc123def456" \
  -F "source_lang=en" \
  -F "target_lang=fr" \
  -F "files=@presentation.pptx"

Check task status:

curl -H "Authorization: Bearer token_admin_abc123def456" \
  "http://localhost:8000/tasks/{task_id}"

Download results:

# Download all results (single file directly, multiple files as ZIP)
curl -H "Authorization: Bearer token_admin_abc123def456" \
  -O "http://localhost:8000/download/{task_id}"

# Download specific file by index (0-based)
curl -H "Authorization: Bearer token_admin_abc123def456" \
  -O "http://localhost:8000/download/{task_id}/0"

Translate a TXT file stored in S3:

curl -X POST "http://localhost:8000/translate/text_s3" \
  -H "Authorization: Bearer token_admin_abc123def456" \
  -H "Content-Type: application/json" \
  -d '{
        "input_keys": ["bucket/folder/document.txt"],
        "output_prefix": "translated/",
        "source_lang": "en",
        "target_lang": "fr"
      }'

Translate a PPTX stored in S3:

curl -X POST "http://localhost:8000/translate/pptx_s3" \
  -H "Authorization: Bearer token_admin_abc123def456" \
  -H "Content-Type: application/json" \
  -d '{
        "input_keys": ["bucket/folder/presentation.pptx"],
        "output_prefix": "translated/",
        "source_lang": "en",
        "target_lang": "fr"
      }'

Translate an entire course from S3:

curl -X POST "http://localhost:8000/translate/course_s3" \
  -H "Authorization: Bearer token_admin_abc123def456" \
  -H "Content-Type: application/json" \
  -d '{
        "course_id": "cad798e6-3acf-11f0-b82c-771d758cf407",
        "source_lang": "en",
        "target_langs": ["fr", "it"],
        "output_prefix": "translated/"
      }'

Transcribe an audio file stored in S3:

curl -X POST "http://localhost:8000/transcribe/audio_s3" \
  -H "Authorization: Bearer token_admin_abc123def456" \
  -H "Content-Type: application/json" \
  -d '{
        "input_keys": ["bucket/folder/lecture.mp3"],
        "output_prefix": "transcripts/"
      }'

Using Python requests

import requests

# Setup authentication
headers = {'Authorization': 'Bearer token_admin_abc123def456'}

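# Upload file for translation; the context manager closes the file afterwards
data = {'source_lang': 'en', 'target_lang': 'fr'}

with open('presentation.pptx', 'rb') as pptx_file:
    response = requests.post(
        'http://localhost:8000/translate/pptx',
        files={'files': pptx_file},
        data=data,
        headers=headers
    )

# Stop early on HTTP errors (for example 413 for oversized files)
response.raise_for_status()
task_id = response.json()['task_id']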

# Check status
status_response = requests.get(
    f'http://localhost:8000/tasks/{task_id}',
    headers=headers
)
print(status_response.json())

# Download when complete
if status_response.json()['status'] == 'completed':
    download_response = requests.get(
        f'http://localhost:8000/download/{task_id}',
        headers=headers
    )
    
    # Save with proper extension based on Content-Type
    content_type = download_response.headers.get('content-type', '')
    if 'presentation' in content_type:
        filename = 'translated_presentation.pptx'
    elif 'application/zip' in content_type:
        filename = 'results.zip'
    else:
        filename = 'result.file'
    
    with open(filename, 'wb') as f:
        f.write(download_response.content)
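
Because a download may be either a single file or a ZIP archive, a client may want to unpack archives automatically. The sketch below shows one way to do that with the standard library; `save_results` and the `single_name` default are hypothetical, and it assumes the ZIP case is signalled by an `application/zip` content type as in the example above.

```python
import io
import zipfile
from pathlib import Path


def save_results(content, content_type, out_dir, single_name="result.file"):
    """Write downloaded bytes to out_dir and return the saved file names.

    ZIP responses are extracted; anything else is written as one file.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    if "application/zip" in content_type:
        # Multiple results: unpack every archive member into out_dir
        with zipfile.ZipFile(io.BytesIO(content)) as archive:
            archive.extractall(out)
            return sorted(archive.namelist())

    # Single result: write the bytes under the caller-chosen name
    (out / single_name).write_bytes(content)
    return [single_name]
```

With the variables from the example above, this could be called as `save_results(download_response.content, content_type, 'results')`.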

Advanced PPTX Translation

The API provides professional-grade PPTX translation that preserves all formatting:

✅ Complete Formatting Preservation

Fonts: Names, sizes, styles maintained
Colors: RGB and theme colors preserved
Typography: Bold, italic, underline styles
Layout: Paragraph spacing, alignment, indentation
Structure: Text frames, runs, paragraph levels

🎯 Same Quality as GUI App

The API uses the same advanced translation engine as the desktop application, ensuring identical results between interfaces.

📊 Professional Results

Maintains original presentation design
Preserves corporate branding and styling
Ready for professional use without reformatting

Task Management

The API uses asynchronous task processing:

1. Submit a processing request and receive a task_id
2. Poll the task status using the task_id
3. Download results when the status is "completed"
4. Clean up the task when done (DELETE /tasks/{task_id})

Task Status Values

pending: Task queued but not started
running: Task currently processing
completed: Task finished successfully
failed: Task encountered an error
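
The polling step can be sketched as a small loop that stops on either terminal status. This is an illustrative helper, not part of the API: `wait_for_task`, its default interval, and its default timeout are assumptions, and `fetch_status` is any callable that returns the current status string.

```python
import time

# Statuses after which polling should stop
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_task(fetch_status, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_status() until the task completes, fails, or times out.

    Returns the terminal status; raises TimeoutError if the deadline passes
    while the task is still pending or running.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still '{status}' after {timeout} seconds")
        sleep(interval)
```

With requests, `fetch_status` could be `lambda: requests.get(f'http://localhost:8000/tasks/{task_id}', headers=headers).json()['status']`. Treating "failed" as terminal keeps the loop from spinning on a task that will never complete.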

Error Handling

The API returns standard HTTP status codes:

200: Success
400: Bad Request (invalid parameters)
404: Not Found (task or file not found)
413: Payload Too Large (file exceeds the configured size limit)
422: Validation Error
500: Internal Server Error

Error responses include details:

{
    "detail": "Error description"
}
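
On the client side, the "detail" field can be turned into a readable message. The helper below is a sketch under the assumption that error bodies are JSON objects with a "detail" key, as shown above; `describe_error` is a hypothetical name, and non-string details (some frameworks return a list for validation errors) are serialized back to JSON rather than interpreted.

```python
import json

# Status codes documented for this API
STATUS_MEANINGS = {
    200: "Success",
    400: "Bad Request",
    404: "Not Found",
    413: "Payload Too Large",
    422: "Validation Error",
    500: "Internal Server Error",
}


def describe_error(status_code, body_text):
    """Format a status code and response body as a one-line message."""
    meaning = STATUS_MEANINGS.get(status_code, "Unexpected status")
    try:
        detail = json.loads(body_text).get("detail")
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object
        detail = None

    if detail is None:
        detail = body_text.strip() or "no details provided"
    elif not isinstance(detail, str):
        detail = json.dumps(detail)
    return f"{status_code} {meaning}: {detail}"
```

For a requests response, this could be called as `describe_error(response.status_code, response.text)`.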

Security Considerations

For production deployment:

Authentication: Add API key authentication
Rate Limiting: Implement request rate limiting
File Validation: Enhanced file type and size validation
HTTPS: Use HTTPS in production
Resource Limits: Set memory and processing limits
Monitoring: Add logging and monitoring

Deployment

Docker Deployment

Create a Dockerfile:

FROM python:3.9-slim

WORKDIR /app
COPY . .
RUN pip install -r api_requirements.txt

EXPOSE 8000
CMD ["uvicorn", "api_server:app", "--host", "0.0.0.0", "--port", "8000"]

Production Deployment

For production, consider:

Gunicorn with uvicorn workers
Nginx as reverse proxy
Docker containerization
Load balancing for multiple instances
Database for task persistence
Redis for task queues

Example with Gunicorn:

gunicorn api_server:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
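
The same settings can also live in a Gunicorn configuration file, which is plain Python. This is a sketch equivalent to the command above; the file name gunicorn.conf.py and the `timeout` value are assumptions chosen for illustration, not values taken from this project.

```python
# gunicorn.conf.py (assumed file name)

# Address and port to listen on, same as --bind 0.0.0.0:8000
bind = "0.0.0.0:8000"

# Number of worker processes, same as -w 4
workers = 4

# Run the ASGI app through uvicorn workers, same as -k
worker_class = "uvicorn.workers.UvicornWorker"

# Assumption: a longer worker timeout (seconds) for slow uploads and
# long-running requests; tune this for your workload
timeout = 120
```

Start the server with `gunicorn -c gunicorn.conf.py api_server:app`.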

Supported File Formats

PPTX Translation: .pptx files
Text Translation: .txt files
Audio Transcription: .wav, .mp3, .m4a, .webm, .mp4, .mpga, .mpeg
PPTX Conversion: .pptx files → PDF/PNG
Text-to-Speech: .txt files (with voice name in filename)
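
A client can check extensions before uploading. The mapping below restates the list above keyed by endpoint path; `is_supported` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
from pathlib import Path

# Accepted file extensions per upload endpoint, taken from the list above
SUPPORTED_EXTENSIONS = {
    "/translate/pptx": {".pptx"},
    "/translate/text": {".txt"},
    "/transcribe/audio": {".wav", ".mp3", ".m4a", ".webm", ".mp4",
                          ".mpga", ".mpeg"},
    "/convert/pptx": {".pptx"},
    "/tts": {".txt"},
}


def is_supported(endpoint, filename):
    """Return True if the file extension is listed for the endpoint."""
    extension = Path(filename).suffix.lower()
    return extension in SUPPORTED_EXTENSIONS.get(endpoint, set())
```

The comparison is case-insensitive, so `Slides.PPTX` is treated the same as `slides.pptx`. The voice-name requirement for /tts filenames is not checked here.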

API Limits

Current implementation limits:

File Size: 25MB per file (adjustable)
Concurrent Tasks: Limited by server resources
Audio Size: 20MB per audio file (API limitation)

Support

For issues and questions:

Check the interactive API documentation at /docs
Review the logs for detailed error information
Ensure all required API keys are configured
