Improving the editing of thousands of facts into a transformer memory at once.
We used Python 3.9 with the library versions specified in requirements.txt.
To replicate our experiments, download the necessary data and covariance matrices from this link. Without this data, the experiments will not function correctly.
- To reproduce the training with MEMIT using different datasets and evaluating in English and Catalan, you just need to execute:
bash cross_linguality.sh
which calls the command:
python -m experiments.evaluate --alg_name=MEMIT --dir_name=MEMIT_all --hparams_fname=aguila.json --dataset_name={lang}_CF.json --language={lang} --language_eval={lang_ev} --num_edits=1000 --dataset_size_limit=7000 --use_cache --continue_from_run=run_000
and saves the results of each experiment in results/MEMIT_all/run_000.
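For reference, here is a minimal sketch of the loop over language pairs that such a script might run; it is illustrative only, and the cross_linguality.sh shipped in the repository is the authoritative version:

```bash
# Illustrative sketch of the language-pair loop; see cross_linguality.sh for the real script.
for lang in english catalan; do
  for lang_ev in english catalan; do
    python -m experiments.evaluate --alg_name=MEMIT --dir_name=MEMIT_all \
      --hparams_fname=aguila.json --dataset_name=${lang}_CF.json \
      --language=${lang} --language_eval=${lang_ev} --num_edits=1000 \
      --dataset_size_limit=7000 --use_cache --continue_from_run=run_000
  done
done
```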
- To reproduce the training with MEMAT and the attention-head relevance analysis, you just need to execute:
bash MEMAT_hyperparam_selection.sh
For particular experiments with MEMAT, you just need to execute the command below (an example with the placeholders filled in follows it):
python -m experiments.experiments_all_in_one --alg_name=MEMIT --hparams=aguila.json --dir_name={dir_name} --dataset_size_limit=3000 --generation_test_interval=-1 --dataset_name={lang1}_CF.json --language={lang1} --language_eval={lang2} --num_edits=1000 --use_cache --save_attention_heads="./{name_id}_heads_$top_num_heads_i" --top_num_heads=$top_num_heads_i --add_attention --save_figure="./figures/head_relevance/{lang1}{lang2}.pdf"
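As an illustration, a run that edits in Catalan and evaluates in English with 16 heads could look as follows; the values chosen for {dir_name}, {name_id}, the head count and the figure name are placeholders picked for this example:

```bash
# Example instantiation of the placeholders; adjust names and paths to your setup.
python -m experiments.experiments_all_in_one --alg_name=MEMIT --hparams=aguila.json \
  --dir_name=CE --dataset_size_limit=3000 --generation_test_interval=-1 \
  --dataset_name=catalan_CF.json --language=catalan --language_eval=english \
  --num_edits=1000 --use_cache --save_attention_heads="./catalan_CE_heads_16" \
  --top_num_heads=16 --add_attention --save_figure="./figures/head_relevance/catalanenglish.pdf"
```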
After running the experiments, summarize the contents of ./results using the main function of experiments.summarize; see the provided script Figure_9.py for guidance.
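For example, assuming experiments.summarize keeps the command-line interface of the upstream MEMIT codebase (a --dir_name argument and an optional comma-separated --runs argument), the cross-lingual runs above can be summarized with:

```bash
# Assumes the upstream MEMIT summarize interface; check experiments/summarize.py if the flags differ.
python -m experiments.summarize --dir_name=MEMIT_all --runs=run_000
```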
- To obtain the data from Figure 4, you need to load the head positions and the head correction vectors obtained from a first block of 1000 examples, using --head_ind and --load_attention (you will also need to remove the data used for the subsequent experiments). Once you run bash MEMAT_hyperparam_selection.sh, you will obtain the following files:
["./catalan_CC_heads_16_iter_0_languageT_catalan", "./catalan_CE_heads_16_iter_0_languageT_english", "./english_EC_heads_16_iter_0_languageT_catalan", "./english_EE_heads_16_iter_0_languageT_english"]
["./indices_iter_0_languageT_catalan_numH_16.npy", "./indices_iter_0_languageT_ce_numH_16.npy", "./indices_iter_0_languageT_ec_numH_16.npy","./indices_iter_0_languageT_english_numH_16.npy"]
You then need to load each of them and run, for example in the Catalan-to-Catalan (CC) case (a sketch covering all four combinations follows the command):
python -m experiments.experiments_all_in_one --alg_name=MEMIT --hparams=aguila.json --dir_name=CC --dataset_size_limit=3000 --generation_test_interval=-1 --dataset_name=catalan_CF.json --language=catalan --language_eval=catalan --num_edits=1000 --use_cache --load_attention="./catalan_CC_heads_16_iter_0_languageT_catalan" --head_ind="./indices_iter_0_languageT_catalan_numH_16.npy" --top_num_heads=16 --add_attention
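The other three combinations follow the same pattern. The sketch below loops over all four; the pairing of head files, index files and languages is inferred from the file names above, so verify it against the files produced on your machine:

```bash
# Inferred pairing of attention-head files, index files and languages; verify against your outputs.
declare -A HEADS=(
  [CC]="./catalan_CC_heads_16_iter_0_languageT_catalan"
  [CE]="./catalan_CE_heads_16_iter_0_languageT_english"
  [EC]="./english_EC_heads_16_iter_0_languageT_catalan"
  [EE]="./english_EE_heads_16_iter_0_languageT_english"
)
declare -A IND=(
  [CC]="./indices_iter_0_languageT_catalan_numH_16.npy"
  [CE]="./indices_iter_0_languageT_ce_numH_16.npy"
  [EC]="./indices_iter_0_languageT_ec_numH_16.npy"
  [EE]="./indices_iter_0_languageT_english_numH_16.npy"
)
declare -A LANG=( [CC]=catalan [CE]=catalan [EC]=english [EE]=english )
declare -A LANG_EV=( [CC]=catalan [CE]=english [EC]=catalan [EE]=english )

for k in CC CE EC EE; do
  python -m experiments.experiments_all_in_one --alg_name=MEMIT --hparams=aguila.json \
    --dir_name=$k --dataset_size_limit=3000 --generation_test_interval=-1 \
    --dataset_name=${LANG[$k]}_CF.json --language=${LANG[$k]} --language_eval=${LANG_EV[$k]} \
    --num_edits=1000 --use_cache --load_attention="${HEADS[$k]}" --head_ind="${IND[$k]}" \
    --top_num_heads=16 --add_attention
done
```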
- To explore the tokenization of the subjects, we also provide the following command, which generates some figures in ./figures/analysis_data:
python -m analysis_data.analysis
- To explore how ITI fails to improve performance, execute:
bash evaluate_ITI_fails.sh
- To explore the different approaches to introducing both languages at the same time, execute:
bash bilingual_approach_section.sh
If you want to avoid the extra time spent computing the matrix inverses involved in the updates, you can use --save_delta_matrix and --load_delta, although this implies an extra storage requirement.
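A minimal sketch of how the two flags could be combined, assuming both take a path where the delta matrices are stored and are accepted by the same experiment scripts as above (the path and the exact argument types are assumptions; check the argument parsers):

```bash
# First run: compute the edits and store the delta matrices (path is illustrative).
python -m experiments.evaluate --alg_name=MEMIT --dir_name=MEMIT_all --hparams_fname=aguila.json \
  --dataset_name=catalan_CF.json --language=catalan --language_eval=catalan \
  --num_edits=1000 --dataset_size_limit=7000 --use_cache \
  --save_delta_matrix="./deltas/catalan_1000"

# Later runs: load the stored deltas instead of recomputing the matrix inverses.
python -m experiments.evaluate --alg_name=MEMIT --dir_name=MEMIT_all --hparams_fname=aguila.json \
  --dataset_name=catalan_CF.json --language=catalan --language_eval=english \
  --num_edits=1000 --dataset_size_limit=7000 --use_cache \
  --load_delta="./deltas/catalan_1000"
```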