inference with new zero-shot language: phoneme frequencies for ASP #208

@imiles2021

Hi there,

I am working on including the Alsatian dialect (with its variants) as a zero-shot language. To do so, I am updating the distance lookup files so that the language embeddings can be updated.

I inspected the 'asp_dict.pkl' file and I assume that for each language, the associated values are the ASP values between that language and every other language.
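A minimal sketch of that kind of inspection, assuming 'asp_dict.pkl' is a standard Python pickle whose top-level object is a dict keyed by language. The helper name `summarize_lookup` is hypothetical, and the real layout of the file may differ from what is assumed here.

```python
import pickle
from pathlib import Path


def summarize_lookup(path):
    """Load a pickled lookup and report its top-level shape.

    Assumption: the file is an ordinary pickle. Only unpickle files
    from a source you trust, since unpickling can execute code.
    """
    with Path(path).open("rb") as f:
        lookup = pickle.load(f)

    summary = {"type": type(lookup).__name__, "n_entries": len(lookup)}
    if isinstance(lookup, dict) and lookup:
        # Peek at one entry to see what each language maps to.
        first_key = next(iter(lookup))
        summary["first_key"] = first_key
        summary["first_value_type"] = type(lookup[first_key]).__name__
    return summary


if __name__ == "__main__":
    # The path is an assumption; adjust it to where the file lives.
    if Path("asp_dict.pkl").exists():
        print(summarize_lookup("asp_dict.pkl"))
```

If the first value is itself a dict keyed by language, that would be consistent with the reading above (one ASP value per language pair), but the sketch does not confirm it.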

Since I want to include Alsatian, I need to calculate the same values. However, based on the ASPF formula in Do et al. (2023):

[Image: ASPF formula from Do et al. (2023)]

where PF_A is the vector of phoneme frequencies of language A,

I need the phoneme frequencies for all the other languages the model can perform inference on. Is there a file somewhere where these frequencies are stored? For instance, one that was used to create the 'asp_dict' file?

Or if this metric was calculated in another way, please let me know!
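To make the request concrete, here is a rough sketch of the computation I have in mind. It assumes ASPF is the standard angular similarity for non-negative vectors, 1 - 2·arccos(cos_sim)/π, which should be checked against Do et al. (2023) since the formula is only shown as an image above. The function names and the shared-inventory step are my own assumptions, not taken from the repository.

```python
import math


def frequency_vector(counts, inventory):
    """Turn raw phoneme counts into relative frequencies over a shared inventory.

    Assumption: both languages are projected onto the same ordered
    phoneme inventory so that vector positions are comparable.
    """
    total = sum(counts.get(p, 0) for p in inventory)
    if total == 0:
        raise ValueError("no counts for any phoneme in the inventory")
    return [counts.get(p, 0) / total for p in inventory]


def cosine_similarity(u, v):
    """Plain cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("zero vector has no direction")
    return dot / (norm_u * norm_v)


def angular_similarity(pf_a, pf_b):
    """Assumed ASPF form: 1 - 2 * arccos(cos_sim) / pi."""
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos = max(-1.0, min(1.0, cosine_similarity(pf_a, pf_b)))
    return 1.0 - 2.0 * math.acos(cos) / math.pi


if __name__ == "__main__":
    # Toy counts, invented purely for illustration.
    inventory = ["a", "b", "c"]
    pf_a = frequency_vector({"a": 3, "b": 1}, inventory)
    pf_b = frequency_vector({"a": 1, "b": 1, "c": 2}, inventory)
    print(angular_similarity(pf_a, pf_b))
```

With the phoneme frequencies of the existing languages, the Alsatian entries could then be computed pairwise in this way and added to the lookup.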

Thanks in advance,
Isobelle Miles
