This software is made available for research and educational purposes only. Commercial exploitation is strictly forbidden. Researchers interested in collaborative projects or commercial applications are encouraged to contact the author to explore partnership opportunities.
Full dataset: "AI Blob Dataset: Transcribed and Semantically Embedded Italian Television Archive (Video Metadata, Sentence Annotations, Vector Store)",
available at https://zenodo.org/records/15071951
Also download lid.176.bin (the fastText language identification model) from https://fasttext.cc/docs/en/language-identification.html and place it in the root folder.
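To confirm that lid.176.bin is in the right place before running anything, a small check along these lines may help. This is only a sketch, not part of the software itself: the function names `find_lid_model` and `detect_language` are hypothetical, the root folder is assumed to be the current working directory, and the second helper assumes the `fasttext` Python package is installed.

```python
from pathlib import Path

MODEL_NAME = "lid.176.bin"  # file name given in the instructions above


def find_lid_model(root="."):
    """Return the path to lid.176.bin inside `root`, or raise if it is missing."""
    path = Path(root) / MODEL_NAME
    if not path.is_file():
        raise FileNotFoundError(
            f"{MODEL_NAME} not found in {Path(root).resolve()}; download it from "
            "https://fasttext.cc/docs/en/language-identification.html"
        )
    return path


def detect_language(text, root="."):
    """Hypothetical helper: predict the language code of `text` with fastText."""
    # Third-party package (assumption); imported lazily so that
    # find_lid_model() still works when fasttext is not installed.
    import fasttext

    model = fasttext.load_model(str(find_lid_model(root)))
    # predict() works on a single line of text, so newlines are replaced first.
    labels, _probs = model.predict(text.replace("\n", " "), k=1)
    # Labels come back in the form "__label__<code>"; strip the prefix.
    return labels[0].replace("__label__", "")
```

Calling `find_lid_model()` from the root folder either returns the model path or fails with a message pointing to the download page. `detect_language("Buongiorno a tutti")` would then return the language code predicted by the model.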