A REST API for the Language Toolkit providing document processing, translation, transcription, and video creation capabilities.
- Advanced PPTX Translation: Translate PowerPoint presentations with full formatting preservation - fonts, colors, styles, typography
- Text Translation: Translate text files using DeepL API
- Audio Transcription: Convert audio files to text using OpenAI Whisper
- PPTX Conversion: Convert PowerPoint files to PDF or PNG images
- Text-to-Speech: Generate audio from text files using ElevenLabs
- Video Merging: Combine audio and images into videos
- Smart Downloads: Single files download directly, multiple files as ZIP
- Individual File Downloads: Download specific files from multi-file results
- Asynchronous Processing: Handle long-running tasks with progress tracking
- File Size Validation: Automatic validation of upload sizes with configurable limits
- Install API-specific dependencies:
pip install -r api_requirements.txt
- Configure API keys in the .env file (copy from .env.example):
OPENAI_API_KEY=your-openai-api-key
DEEPL_API_KEY=your-deepl-api-key
ELEVENLABS_API_KEY=your-elevenlabs-api-key
CONVERTAPI_SECRET=your-convertapi-secret
- Configure authentication in the .env file:
# Client credentials for OAuth2 authentication
CLIENT_ID=your-client-id
CLIENT_SECRET=your-client-secret
# Or for multiple clients:
# CLIENT_ID_1=first-client-id
# CLIENT_SECRET_1=first-client-secret
# CLIENT_ID_2=second-client-id
# CLIENT_SECRET_2=second-client-secret
- Start the server:
python api_server.py
Or with uvicorn directly:
uvicorn api_server:app --host 0.0.0.0 --port 8000 --reload
The API will be available at http://localhost:8000
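The single-client and numbered multi-client variables shown in the authentication step could be collected along the following lines. This is only a sketch: the helper name `load_clients` and the rule that numbered pairs start at 1 and stop at the first gap are assumptions, not the server's documented behavior.

```python
import os


def load_clients(env=None):
    """Collect client_id -> client_secret pairs from environment-style variables.

    Hypothetical helper: reads CLIENT_ID/CLIENT_SECRET, then CLIENT_ID_1,
    CLIENT_ID_2, ... until the first missing pair (an assumed convention).
    """
    env = os.environ if env is None else env
    clients = {}

    # Single-client form
    if env.get("CLIENT_ID") and env.get("CLIENT_SECRET"):
        clients[env["CLIENT_ID"]] = env["CLIENT_SECRET"]

    # Numbered multi-client form
    index = 1
    while env.get(f"CLIENT_ID_{index}") and env.get(f"CLIENT_SECRET_{index}"):
        clients[env[f"CLIENT_ID_{index}"]] = env[f"CLIENT_SECRET_{index}"]
        index += 1

    return clients
```

Passing a plain dictionary instead of the real environment makes the helper easy to check without touching the .env file.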
The API enforces file size limits to prevent resource exhaustion:
| File Type | Default Limit | Environment Variable |
|---|---|---|
| PPTX files | 50MB | MAX_PPTX_SIZE |
| Text files | 10MB | MAX_TEXT_SIZE |
| Audio files | 200MB | MAX_AUDIO_SIZE |
| General files | 100MB | MAX_FILE_SIZE |
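As an illustration of how limits like those in the table might be resolved and enforced, here is a minimal sketch. The names `DEFAULT_LIMITS`, `resolve_limit`, and `check_size` are hypothetical, and the sketch assumes the environment variables hold byte counts with 1MB equal to 1024 * 1024 bytes; the server's actual implementation may differ.

```python
import os

MB = 1024 * 1024

# File type -> (environment variable, default limit in bytes), taken from the table
DEFAULT_LIMITS = {
    "pptx": ("MAX_PPTX_SIZE", 50 * MB),
    "text": ("MAX_TEXT_SIZE", 10 * MB),
    "audio": ("MAX_AUDIO_SIZE", 200 * MB),
    "general": ("MAX_FILE_SIZE", 100 * MB),
}


def resolve_limit(file_type, env=None):
    """Return the limit in bytes, preferring the environment override if set."""
    env = os.environ if env is None else env
    var_name, default = DEFAULT_LIMITS.get(file_type, DEFAULT_LIMITS["general"])
    raw = env.get(var_name)
    return int(raw) if raw else default


def check_size(filename, size_bytes, file_type, env=None):
    """Return None if the file fits, otherwise an error message for an HTTP 413 response."""
    limit = resolve_limit(file_type, env)
    if size_bytes <= limit:
        return None
    return (
        f"File '{filename}' is too large ({size_bytes / MB:.1f}MB). "
        f"Maximum allowed size for {file_type} files is {limit / MB:.1f}MB."
    )
```

A caller would translate a non-None return value into an HTTP 413 response carrying the message in its detail field.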
Error Response: Files exceeding limits return HTTP 413 (Payload Too Large) with details:
{
"detail": "File 'large.pptx' is too large (75.2MB). Maximum allowed size for pptx files is 50.0MB."
}
Configuration: Override limits via environment variables:
export MAX_PPTX_SIZE=104857600 # 100MB in bytes
export MAX_AUDIO_SIZE=524288000 # 500MB in bytes
Interactive API documentation is available at:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- GET / - API information and available endpoints
- GET /health - Health check
- GET /tasks - List all active tasks
- GET /tasks/{task_id} - Get task status
- DELETE /tasks/{task_id} - Clean up task and temporary files
- GET /download/{task_id} - Download task results
POST /translate/pptx
- Files: Upload PPTX files
- Form Data:
  - source_lang: Source language code (e.g., "en")
  - target_lang: Target language code (e.g., "fr")
POST /translate/text
- Files: Upload TXT files
- Form Data:
  - source_lang: Source language code
  - target_lang: Target language code
POST /transcribe/audio
- Files: Upload audio files (MP3, WAV, M4A, etc.)
POST /convert/pptx
- Files: Upload PPTX files
- Form Data:
  - output_format: "pdf" or "png"
POST /tts
- Files: Upload TXT files (must contain voice name in filename)
POST /translate/text_s3
- JSON Body:
  - input_keys: Array of S3 object keys for the input TXT files
  - output_prefix: (Optional) Destination S3 prefix for translated files
  - source_lang: Source language code (e.g., "en")
  - target_lang: Target language code (e.g., "fr")
POST /translate/course_s3
- JSON Body:
  - course_id: Unique identifier for the course
  - source_lang: Language of the course currently stored in the S3 folder
  - target_langs: Array of target language codes (e.g., ["fr", "it"])
  - output_prefix: (Optional) Root prefix where the translated course will be written
POST /translate/pptx_s3
- JSON Body:
  - input_keys: Array of S3 object keys for the input PPTX files
  - output_prefix: (Optional) Destination S3 prefix for translated files
  - source_lang: Source language code (e.g., "en")
  - target_lang: Target language code (e.g., "fr")
POST /transcribe/audio_s3
- JSON Body:
  - input_keys: Array of S3 object keys for the input audio files
  - output_prefix: (Optional) Destination S3 prefix for transcription results
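The two S3 translation endpoints that take input_keys (/translate/text_s3 and /translate/pptx_s3) share the same JSON fields, so a client could build both requests with one helper. The sketch below uses only the standard library; `build_s3_translation_request` and `BASE_URL` are hypothetical names, and the bearer token is whatever the authentication setup issues.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local development address


def build_s3_translation_request(endpoint, token, input_keys,
                                 source_lang, target_lang, output_prefix=None):
    """Build (but do not send) a POST request for an S3 translation endpoint."""
    if not input_keys:
        raise ValueError("input_keys must contain at least one S3 object key")

    payload = {
        "input_keys": list(input_keys),
        "source_lang": source_lang,
        "target_lang": target_lang,
    }
    # output_prefix is optional, so it is only sent when provided
    if output_prefix is not None:
        payload["output_prefix"] = output_prefix

    return urllib.request.Request(
        f"{BASE_URL}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to submit the job; keeping construction separate from sending makes the payload easy to inspect first.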
- Translate a PPTX file:
curl -X POST "http://localhost:8000/translate/pptx" \
-H "Authorization: Bearer token_admin_abc123def456" \
-F "source_lang=en" \
-F "target_lang=fr" \
-F "files=@presentation.pptx"
- Check task status:
curl -H "Authorization: Bearer token_admin_abc123def456" \
"http://localhost:8000/tasks/{task_id}"
- Download results:
# Download all results (single file directly, multiple files as ZIP)
curl -H "Authorization: Bearer token_admin_abc123def456" \
-O "http://localhost:8000/download/{task_id}"
# Download specific file by index (0-based)
curl -H "Authorization: Bearer token_admin_abc123def456" \
-O "http://localhost:8000/download/{task_id}/0"
- Translate a TXT file stored in S3:
curl -X POST "http://localhost:8000/translate/text_s3" \
-H "Authorization: Bearer token_admin_abc123def456" \
-H "Content-Type: application/json" \
-d '{
"input_keys": ["bucket/folder/document.txt"],
"output_prefix": "translated/",
"source_lang": "en",
"target_lang": "fr"
}'
- Translate a PPTX stored in S3:
curl -X POST "http://localhost:8000/translate/pptx_s3" \
-H "Authorization: Bearer token_admin_abc123def456" \
-H "Content-Type: application/json" \
-d '{
"input_keys": ["bucket/folder/presentation.pptx"],
"output_prefix": "translated/",
"source_lang": "en",
"target_lang": "fr"
}'
- Translate an entire course from S3:
curl -X POST "http://localhost:8000/translate/course_s3" \
-H "Authorization: Bearer token_admin_abc123def456" \
-H "Content-Type: application/json" \
-d '{
"course_id": "cad798e6-3acf-11f0-b82c-771d758cf407",
"source_lang": "en",
"target_langs": ["fr", "it"],
"output_prefix": "translated/"
}'
- Transcribe an audio file stored in S3:
curl -X POST "http://localhost:8000/transcribe/audio_s3" \
-H "Authorization: Bearer token_admin_abc123def456" \
-H "Content-Type: application/json" \
-d '{
"input_keys": ["bucket/folder/lecture.mp3"],
"output_prefix": "transcripts/"
}'
import requests

# Setup authentication
headers = {'Authorization': 'Bearer token_admin_abc123def456'}

# Upload file for translation (the context manager closes the file afterwards)
data = {'source_lang': 'en', 'target_lang': 'fr'}
with open('presentation.pptx', 'rb') as pptx_file:
    response = requests.post(
        'http://localhost:8000/translate/pptx',
        files={'files': pptx_file},
        data=data,
        headers=headers
    )
response.raise_for_status()  # stop early on 4xx/5xx responses
task_id = response.json()['task_id']

# Check status
status_response = requests.get(
    f'http://localhost:8000/tasks/{task_id}',
    headers=headers
)
print(status_response.json())

# Download when complete
if status_response.json()['status'] == 'completed':
    download_response = requests.get(
        f'http://localhost:8000/download/{task_id}',
        headers=headers
    )
    # Save with proper extension based on Content-Type
    content_type = download_response.headers.get('content-type', '')
    if 'presentation' in content_type:
        filename = 'translated_presentation.pptx'
    elif 'application/zip' in content_type:
        filename = 'results.zip'
    else:
        filename = 'result.file'
    with open(filename, 'wb') as f:
        f.write(download_response.content)
The API provides professional-grade PPTX translation that preserves all formatting:
- Fonts: Names, sizes, styles maintained
- Colors: RGB and theme colors preserved
- Typography: Bold, italic, underline styles
- Layout: Paragraph spacing, alignment, indentation
- Structure: Text frames, runs, paragraph levels
The API uses the same advanced translation engine as the desktop application, ensuring identical results between interfaces.
- Maintains original presentation design
- Preserves corporate branding and styling
- Ready for professional use without reformatting
The API uses asynchronous task processing:
- Submit a processing request → Get a task_id
- Poll the task status using the task_id
- Download results when status is "completed"
- Clean up the task when done
- pending: Task queued but not started
- running: Task currently processing
- completed: Task finished successfully
- failed: Task encountered an error
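The polling step of this workflow can be sketched as a small loop that stops on either terminal state. Here `fetch_status` is a hypothetical callable that wraps GET /tasks/{task_id} and returns the decoded JSON; the interval and timeout values are illustrative assumptions, not recommendations from the API.

```python
import time

TERMINAL_STATES = {"completed", "failed"}


def wait_for_task(fetch_status, task_id, interval=2.0, timeout=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until the task is completed or failed, or raise TimeoutError.

    fetch_status(task_id) must return a dict containing a 'status' key.
    sleep and clock are injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout
    while True:
        task = fetch_status(task_id)
        if task["status"] in TERMINAL_STATES:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"Task {task_id} still {task['status']} after {timeout}s")
        sleep(interval)
```

With the requests library, `fetch_status` could be as simple as `lambda tid: requests.get(f"http://localhost:8000/tasks/{tid}", headers=headers).json()`. The caller should still check whether the returned status is "completed" or "failed" before attempting a download.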
The API returns standard HTTP status codes:
- 200: Success
- 400: Bad Request (invalid parameters)
- 404: Not Found (task or file not found)
- 413: Payload Too Large (file exceeds the size limit)
- 422: Validation Error
- 500: Internal Server Error
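On the client side, these status codes can be combined with the detail field of the error body to produce readable messages. The following is a sketch under the assumption that error bodies are JSON objects with a detail key, as in the file size example; `describe_response` and `STATUS_MEANINGS` are hypothetical names.

```python
import json

STATUS_MEANINGS = {
    200: "Success",
    400: "Bad Request (invalid parameters)",
    404: "Not Found (task or file not found)",
    413: "Payload Too Large",
    422: "Validation Error",
    500: "Internal Server Error",
}


def describe_response(status_code, body_text):
    """Return a one-line summary of a response, including the error detail if any."""
    meaning = STATUS_MEANINGS.get(status_code, "Unexpected status")
    if status_code == 200:
        return f"{status_code} {meaning}"
    try:
        detail = json.loads(body_text).get("detail", "")
    except (ValueError, AttributeError):
        # Body was not a JSON object; fall back to the raw text
        detail = body_text.strip()
    if detail:
        return f"{status_code} {meaning}: {detail}"
    return f"{status_code} {meaning}"
```

For a requests response object, the call would be `describe_response(response.status_code, response.text)`.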
Error responses include details:
{
"detail": "Error description"
}
For production deployment:
- Authentication: Add API key authentication
- Rate Limiting: Implement request rate limiting
- File Validation: Enhanced file type and size validation
- HTTPS: Use HTTPS in production
- Resource Limits: Set memory and processing limits
- Monitoring: Add logging and monitoring
Create a Dockerfile:
FROM python:3.9-slim
WORKDIR /app
COPY . .
RUN pip install -r api_requirements.txt
EXPOSE 8000
CMD ["uvicorn", "api_server:app", "--host", "0.0.0.0", "--port", "8000"]
For production, consider:
- Gunicorn with uvicorn workers
- Nginx as reverse proxy
- Docker containerization
- Load balancing for multiple instances
- Database for task persistence
- Redis for task queues
Example with Gunicorn:
gunicorn api_server:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
- PPTX Translation: .pptx files
- Text Translation: .txt files
- Audio Transcription: .wav, .mp3, .m4a, .webm, .mp4, .mpga, .mpeg
- PPTX Conversion: .pptx files → PDF/PNG
- Text-to-Speech: .txt files (with voice name in filename)
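A client can check file extensions against this list before uploading, which avoids a round trip for files the API would reject. The operation keys and the helper `is_supported` below are hypothetical names chosen for this sketch; only the extensions come from the list above.

```python
from pathlib import Path

# Operation -> accepted extensions (lowercase), mirroring the supported formats list
SUPPORTED_EXTENSIONS = {
    "translate_pptx": {".pptx"},
    "translate_text": {".txt"},
    "transcribe_audio": {".wav", ".mp3", ".m4a", ".webm", ".mp4", ".mpga", ".mpeg"},
    "convert_pptx": {".pptx"},
    "tts": {".txt"},
}


def is_supported(filename, operation):
    """Return True if the file's extension is accepted for the given operation."""
    allowed = SUPPORTED_EXTENSIONS.get(operation, set())
    return Path(filename).suffix.lower() in allowed
```

The comparison is case-insensitive, so `Lecture.MP3` is treated the same as `lecture.mp3`; whether the server is equally lenient is not stated in this document.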
Current implementation limits:
- File Size: 25MB per file (adjustable)
- Concurrent Tasks: Limited by server resources
- Audio Size: 20MB per audio file (API limitation)
For issues and questions:
- Check the interactive API documentation at /docs
- Review the logs for detailed error information
- Ensure all required API keys are configured