This repository generates translations of training data for supervised fine-tuning (SFT).
The following prerequisites are required to run translations:
The dependency manager for this project is Poetry. It must be installed in order to manage the project's dependencies.
pip3 install poetry
Python version 3.12 is used to run the project. Multiple versions of Python can be installed using pyenv. Install pyenv by following its installation instructions.
List the Python versions available to install:
pyenv install -l
Install Python 3.12 with pyenv:
pyenv install 3.12
In order to use the Ollama router, you must download Ollama from their website (https://ollama.com/).
The Microsoft Azure Translation API is configured as the translation service. See the Azure documentation for the languages the API currently supports.
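To make the role of the API key and region concrete, here is a minimal sketch of a single call to the Azure Translator v3 REST endpoint using only the standard library. The helper names `build_translate_request` and `translate` are hypothetical and are not taken from this repository, whose own client code may be structured differently. The target language code `sw` in the usage note is only an example.

```python
import json
import urllib.parse
import urllib.request

# Public global endpoint of the Azure Translator v3 REST API.
AZURE_ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"


def build_translate_request(text, to_lang, api_key, region):
    """Build (but do not send) a Translator v3 request for one string.

    Hypothetical helper for illustration; not part of this repository.
    """
    query = urllib.parse.urlencode({"api-version": "3.0", "to": to_lang})
    body = json.dumps([{"Text": text}]).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": api_key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        f"{AZURE_ENDPOINT}?{query}", data=body, headers=headers, method="POST"
    )


def translate(text, to_lang, api_key, region):
    """Send the request and return the first translation (needs network access)."""
    request = build_translate_request(text, to_lang, api_key, region)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # The service returns one result object per input string.
    return payload[0]["translations"][0]["text"]
```

With a valid key and region, `translate("Hello", "sw", api_key, region)` would return the translated string. Building the request separately from sending it keeps the request shape easy to inspect without network access.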
The Alpaca dataset is the main dataset used for translation.
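As a rough illustration of what translating an Alpaca record involves, the sketch below assumes records follow the published Alpaca layout with `instruction`, `input`, and `output` text fields. The function `translate_record` and its `translate_fn` parameter are hypothetical names chosen for this example; the repository's own pipeline may handle records differently.

```python
# Text fields used by the published Alpaca dataset records.
ALPACA_FIELDS = ("instruction", "input", "output")


def translate_record(record, translate_fn):
    """Return a copy of an Alpaca-style record with its text fields translated.

    `translate_fn` is any callable str -> str (a placeholder for a real
    translation service). Empty fields are left untouched so that no
    translation calls are spent on them.
    """
    translated = dict(record)
    for field in ALPACA_FIELDS:
        value = record.get(field, "")
        if value:
            translated[field] = translate_fn(value)
    return translated


# Usage with a stand-in translator that only upper-cases the text.
example = {
    "instruction": "Give three tips for staying healthy.",
    "input": "",
    "output": "Eat well, exercise, and sleep enough.",
}
print(translate_record(example, str.upper))
```

Passing the translator in as a callable keeps the record handling independent of whichever translation backend is in use.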
Clone the repo:
git clone https://github.com/Llama-Africa/TranslationService.git
Run the following to install project dependencies:
poetry install
Enter the API key for the Azure TranslateText API in the .env file. Follow the prerequisites to generate an API key.
Paste the API key and region in the .env file under AZURE_TRANSLATE_API_KEY and AZURE_TRANSLATE_REGION respectively.
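Before running the project, it can help to confirm that both values are actually present in the .env file. The following is a minimal standard-library sketch; `read_env_file` and `missing_keys` are hypothetical helpers, and it assumes the file uses plain KEY=VALUE lines. How the project itself loads the file is not shown here and may differ.

```python
from pathlib import Path

# The two variables this README asks you to set.
REQUIRED_KEYS = ("AZURE_TRANSLATE_API_KEY", "AZURE_TRANSLATE_REGION")


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(path):
    """Return the required keys that are absent or empty in the file."""
    values = read_env_file(path)
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

Calling `missing_keys(".env")` from the repository root returns an empty list when both variables are set, and otherwise names the ones that still need a value.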
Change into the src folder and run:
python main.py