@VarunGumma

Integrated UrduHack and indic_nlp_resources directly into the module. This removes the need to install the TensorFlow-based Urdu library, which was causing some conflicts. The resources are now bundled with the module, so there is no need to clone them separately and set the path. This should simplify installation and packaging, especially for the IT2 HF tokenizer.
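
As a rough illustration of the bundling idea (a sketch only, not the actual code in this PR): the function name `resolve_resources_path` and the `resources` directory name below are hypothetical. It assumes the resources live in a directory next to the package code, and that an explicitly supplied path should still take precedence for anyone who keeps a separate clone.

```python
from pathlib import Path
from typing import Optional

# Hypothetical name for the directory that holds the bundled resources.
BUNDLED_DIR_NAME = "resources"


def resolve_resources_path(package_dir: Path, override: Optional[str] = None) -> Path:
    """Return the directory that resources should be loaded from.

    An explicit override (the older "clone and set the path" workflow) wins;
    otherwise fall back to the copy shipped inside the package directory.
    """
    if override:
        return Path(override)
    return package_dir / BUNDLED_DIR_NAME


# Typical call from inside a package module (assumption):
#     resources_dir = resolve_resources_path(Path(__file__).parent)
```

With a default like this, users who install the package get working resources without any extra setup, while an override remains available for custom resource sets.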

@VarunGumma
Author

Hi @anoopkunchukuttan, as discussed, I have opened a PR for the indicnlp version we have been using for IT2 and its tokenizer. This repo integrates UrduHack and indic_nlp_resources, and is debloated to support the primary requirements of IT2. I hope this can be added directly as another branch of the original repo.

@anoopkunchukuttan
Owner

Thanks @VarunGumma, will review and get back in a couple of days.

shreypandey and others added 3 commits April 30, 2025 17:42
* Migrated to poetry

* Added gh workflow

---------

Co-authored-by: Shrey Pandey <[email protected]>
* Migrated to poetry

* Added gh workflow

* Updated library name

---------

Co-authored-by: Shrey Pandey <[email protected]>
* dev branch initial commit

* Update README.md