Discussion 🗣
Hello all!
Continuing a discussion with @EasonC13 and @niallroche, I am opening this thread to keep track of the different approaches to fine-tune a transformer model (BERT or a variation of it like ALBERT) and how to use it with Snorkel.
Context
@EasonC13 already did some work to generate a dataset with multiple NER labels using Keras here: https://github.com/accordproject/labs-cicero-classify/blob/dev/Practice/keras/keras_decompose_NER_model.ipynb
To replicate the above, we can:
- Use PyTorch by following Hugging Face's example (a minimal sketch of this route follows the list)
- Use spaCy v3, which comes with a friendly API for working with transformer models and makes preprocessing the data easy
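For reference, here is a minimal sketch of the PyTorch/Hugging Face route: model and tokenizer setup plus one forward pass. The ALBERT checkpoint and the label set are assumptions for illustration; the label alignment and training loop are covered in the Hugging Face example.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical NER label set; the real one comes from our annotated dataset.
labels = ["O", "B-PARTY", "I-PARTY"]

tokenizer = AutoTokenizer.from_pretrained("albert-base-v2")
model = AutoModelForTokenClassification.from_pretrained(
    "albert-base-v2",
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
)

# One forward pass: a label id is predicted for every sub-word token.
inputs = tokenizer("Acme Corp shall deliver the goods.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted_ids = logits.argmax(dim=-1)
```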
Detailed Description
If we go with spaCy, Snorkel supports it out of the box. However, that support is limited to spaCy v2, so depending on our needs we could open a pull request to bring spaCy v3 support to Snorkel, although we can also get by without doing so.
Either way, we can wrap our fine-tuned transformer model in a custom labelling function and use Snorkel's built-in spaCy preprocessor for the preprocessing we need.
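As a sketch of that idea, assuming a hypothetical fine-tuned checkpoint named `our-fine-tuned-albert` and a made-up label scheme, the labelling function could look like this:

```python
from snorkel.labeling import labeling_function
from snorkel.preprocess.nlp import SpacyPreprocessor
from transformers import pipeline

ABSTAIN, PARTY = -1, 0  # hypothetical Snorkel label ids

# "our-fine-tuned-albert" is a placeholder for our fine-tuned checkpoint.
ner = pipeline(
    "token-classification",
    model="our-fine-tuned-albert",
    aggregation_strategy="simple",
)

# Snorkel's built-in spaCy preprocessor attaches a parsed `doc` field
# to each data point before the labelling function runs.
spacy_pre = SpacyPreprocessor(text_field="text", doc_field="doc", memoize=True)

@labeling_function(pre=[spacy_pre])
def lf_transformer_party(x):
    # Vote PARTY if the transformer finds any PARTY entity, otherwise abstain.
    entities = ner(x.doc.text)
    return PARTY if any(e["entity_group"] == "PARTY" for e in entities) else ABSTAIN
```

Applying it would then be the standard Snorkel flow (e.g. `PandasLFApplier` over a DataFrame with a `text` column), with the spaCy `doc` available for any token-level logic we want to combine with the transformer's predictions.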
Another thing to consider, given the high run-time cost of a transformer model in production, is how to run inference when the fine-tuned transformer model is used as a labelling function:
- Do batch inference of the labels (sketched after this list)
- Use a lightweight variation of BERT like ALBERT; Hugging Face has an example of how to implement that too
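For the batch-inference option, one minimal sketch (again assuming the hypothetical `our-fine-tuned-albert` checkpoint) is to pre-compute predictions for the whole corpus once, so the labelling functions only perform cheap lookups:

```python
from transformers import pipeline

# Again assuming the hypothetical "our-fine-tuned-albert" checkpoint.
ner = pipeline(
    "token-classification",
    model="our-fine-tuned-albert",
    aggregation_strategy="simple",
)

texts = [
    "Acme Corp shall deliver the goods on 2021-12-01.",
    "The penalty for late delivery is 10.5% of the contract value.",
]

# Passing a list lets the pipeline batch the forward passes; the labelling
# functions can then read from this cache instead of calling the model.
predictions = ner(texts, batch_size=16)
cache = dict(zip(texts, predictions))
```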