[QUESTION] Getting tools/preprocess_data.py to work is painful #974
Unanswered
sambar1729
asked this question in
Q&A
Replies: 0 comments
Your question
Can `tools/preprocess_data.py` be simplified?
Right now, using it requires nltk, torch, transformer_engine, as well as apex.
Installing transformer_engine does not work out of the box -- had to install out of box (on an A100).
Installing apex has similar problems when using https://github.com/NVIDIA/apex?tab=readme-ov-file#linux
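For anyone reproducing this, a minimal sketch of a check for which of the four packages named above can be located in the current environment. The helper name `missing_modules` and the `REQUIRED` list are made up for illustration and are not part of the repo. It only looks the packages up on the Python path, so it will not catch a package that is present but fails at import time (for example because of a broken CUDA extension).

```python
import importlib.util

# Packages this question lists as needed by tools/preprocess_data.py.
# The list comes from the question text, not from the repo itself.
REQUIRED = ["nltk", "torch", "transformer_engine", "apex"]


def missing_modules(names):
    """Return the names that cannot be located on the current Python path."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for odd module states; treat as missing.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

Running this before `tools/preprocess_data.py` at least shows which install step still needs attention.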
Given that the repo does not have any sample `idx`, `bin` files, one would expect the `preprocess_data` process to be relatively simple. Could this process be simplified?
Installing apex gives