Hi there,
I am working on including the Alsatian dialect (with its variants) as a zero-shot language. To do so, I am updating the distance lookup files so that the language embeddings can be updated.
I inspected the 'asp_dict.pkl' file and I assume that for each language, the associated values are the ASP values between that language and every other language.
As I want to include Alsatian, I need to calculate the same values. However, the ASPF formula in Do et al. (2023) is defined over PF_A, the vector of phoneme frequencies of language A, so I need the phoneme frequencies for all the other languages the model can perform inference on. Is there a file somewhere where these frequencies are stored? For instance, the one that was used to create the 'asp_dict' file?
Or if this metric was calculated in another way, please let me know!
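For reference, here is a minimal sketch of what I plan to compute once I have the frequencies. It assumes that ASPF is an angular similarity derived from the cosine similarity of the two phoneme frequency vectors, i.e. 1 - 2 * arccos(cos_sim) / pi; that is my reading of the paper and may not match how 'asp_dict' was actually built. The function names and the shared `inventory` list are hypothetical and not taken from this repository.

```python
import math
from collections import Counter


def phoneme_frequencies(phonemes, inventory):
    """Relative frequency of each phoneme in `inventory` (hypothetical helper).

    `phonemes` is a sequence of phoneme symbols from a corpus; `inventory`
    fixes the order of the vector so that two languages are comparable.
    """
    counts = Counter(phonemes)
    total = sum(counts[p] for p in inventory)
    if total == 0:
        raise ValueError("no phoneme from the inventory occurs in the corpus")
    return [counts[p] / total for p in inventory]


def angular_similarity(pf_a, pf_b):
    """Assumed ASPF form: 1 - 2 * arccos(cosine_similarity) / pi."""
    if len(pf_a) != len(pf_b):
        raise ValueError("frequency vectors must have the same length")
    dot = sum(a * b for a, b in zip(pf_a, pf_b))
    norm_a = math.sqrt(sum(a * a for a in pf_a))
    norm_b = math.sqrt(sum(b * b for b in pf_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("frequency vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_sim = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return 1.0 - 2.0 * math.acos(cos_sim) / math.pi
```

With non-negative frequency vectors this yields a value between 0 (no phonemes in common) and 1 (identical distributions), so I could fill in the Alsatian row if the per-language frequency vectors are available.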
Thanks in advance,
Isobelle Miles