# CSE Final Year Research and Development Project

## Steps to start the development
- Clone the repository
- Create and activate a virtual environment
```bash
python3 -m venv .venv
# or
python -m venv .venv

# activate on Windows
.venv\Scripts\activate
# or on Linux/macOS
source .venv/bin/activate
```

- Install the requirements

```bash
pip install -r requirements.txt
```

- Create a branch from the development branch
```bash
git checkout development
git pull origin development
git checkout -b <name of the dev>/dev/<feature>
```

## .env file

- Create a `.env` file in the root directory
- Add the following environment variables

```env
BASE_API_URL="http://127.0.0.1:8000"
```
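As a rough illustration of how the UI can pick this variable up (a minimal sketch assuming `python-dotenv` is installed; the variable name comes from this README, but the loading code and the `/analyze` endpoint below are hypothetical, not the project's actual code):

```python
import os

import requests
from dotenv import load_dotenv

load_dotenv()  # reads the .env file from the project root
BASE_API_URL = os.getenv("BASE_API_URL", "http://127.0.0.1:8000")

# e.g. the Streamlit UI could call the FastAPI server like this
# ("/analyze" is a hypothetical endpoint name):
response = requests.post(f"{BASE_API_URL}/analyze", json={"text": "Kaalai வணkkam"})
```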
## FastAPI

- Run the FastAPI server

```bash
# you may need to install the FastAPI CLI first:
pip install "fastapi[standard]"
fastapi dev .\server\server.py
# the default port is 8000
# if you want to specify the port:
fastapi dev .\server\server.py --port 8000
```
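For context, `fastapi dev` looks for an `app` object in the given file. Below is a minimal sketch of the kind of app `server/server.py` defines (hypothetical; the project's real routes and schemas will differ):

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class AnalyzeRequest(BaseModel):
    text: str

@app.post("/analyze")  # hypothetical endpoint
def analyze(request: AnalyzeRequest):
    # the real server would call quality_analyzer() here
    return {"input": request.text}
```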
## Streamlit

- Run the Streamlit server

```bash
streamlit run .\ui\main.py
# the default port is 8501
# if you want to specify the port:
streamlit run .\ui\main.py --server.port 8989
```
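A minimal sketch of a Streamlit page along the lines of `ui/main.py` (hypothetical; the project's actual UI will differ):

```python
import streamlit as st

st.title("Iyal Quality Analyzer")
text = st.text_area("Enter text to analyze")
if st.button("Analyze"):
    # the real UI would send the text to the FastAPI server here
    st.write(text)
```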
## Usage for each function

- Import the required functions

```python
from iyal_quality_analyzer import convert_legacy_to_unicode
from iyal_quality_analyzer import classify_unicode
from iyal_quality_analyzer import transliterate
from iyal_quality_analyzer import translate_english_to_tamil
from iyal_quality_analyzer import is_english_word
from iyal_quality_analyzer import Inference
from iyal_quality_analyzer import quality_analyzer
from iyal_quality_analyzer import single_word_quality_analyzer
from iyal_quality_analyzer import multi_sentence_quality_analyzer
from iyal_quality_analyzer import get_encoding_fun
```

- Usage
```python
# convert_legacy_to_unicode
convert_legacy_to_unicode("mfuk;", "bamini2utf2")  # returns "அகரம்"

# classify_unicode
classify_unicode("mfuk;")  # returns "Legacy Font Encoding"

# transliterate
transliterate("akaram")  # returns "அகரம்"

# translate_english_to_tamil
translate_english_to_tamil("Hello")  # returns "வணக்கம்"

# is_english_word
is_english_word("Hello")  # returns True

# Inference
inference = Inference()

# quality_analyzer
quality_analyzer(inference, "Kaalai வணkkam உலகம் cyfk;", "bamini2utf2")
# returns "காலை வணக்கம் உலகம் உலகம்" together with an array of result objects

# single_word_quality_analyzer
single_word_quality_analyzer(inference, "வணkkam", "bamini2utf2")
# returns "வணக்கம்" together with a result object

# multi_sentence_quality_analyzer
multi_sentence_quality_analyzer(inference, "Kaalai வணkkam உலகம் cyfk;. இரவு வணkkam உலகம் cyfk;", "bamini2utf2")
# returns "காலை வணக்கம் உலகம் உலகம். இரவு வணக்கம் உலகம் உலகம்" together with an array of sentence objects

# get_encoding_fun
get_encoding_fun("Kaalai வணkkam உலகம் cyfk;.")  # automatically detects and returns the encoding
```

- The `Inference` class is used to load the model
- Make sure you have an internet connection before using the `Inference` class
- The model can be created with a custom path for storing the model. If a custom path is not provided, the model is saved in the default directory: the absolute path of the directory where the `Inference` class is located, appended with `/models/{model_version}`.

```python
inference = Inference(cache_dir="path/to/model_dir")
```

- You can also give a custom model name and model version

```python
inference = Inference(model_name="model_name", model_version="model_version")
```

- The `Inference` class initializes by checking the cache directory for the model. If the model is not found in the cache, it automatically downloads it from the server.
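The described lookup boils down to a check-then-download pattern, sketched below (illustrative only; `ensure_model` is a hypothetical helper, not the real `Inference` internals):

```python
import os

def ensure_model(cache_dir: str, model_version: str) -> str:
    """Return the local model path, downloading the model first if it is not cached."""
    model_path = os.path.join(cache_dir, "models", model_version)
    if not os.path.isdir(model_path):
        os.makedirs(model_path, exist_ok=True)
        # ... download the model files from the server into model_path ...
    return model_path
```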
## Notes on Translation and Transliteration

- We use Google APIs for translation and transliteration.
- Transliteration: https://github.com/narVidhai/Google-Transliterate-API/blob/master/Languages.md
  - This is not Google's official library, since Google has deprecated the Input Tools API.
- Translation: https://github.com/ssut/py-googletrans
  - This is an unofficial library that uses the web API of translate.google.com and is not associated with Google.
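For reference, direct usage of py-googletrans looks roughly like this (the project already wraps this behind `translate_english_to_tamil`; the exact API varies between googletrans versions, so treat this as a sketch):

```python
from googletrans import Translator

translator = Translator()
result = translator.translate("Hello", src="en", dest="ta")
print(result.text)  # expected: "வணக்கம்"
```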