The following instructions set up the environment using conda:
conda create -n benchmark python=3.11
conda activate benchmark
pip3 install -r requirements.txt --no-cache-dir
Two configuration files need to be checked before running experiments: benchmark.yaml and models.yaml.
models.yaml specifies which models are used for inference. An example structure is shown below:
SmolVLM:
  model_id: HuggingFaceTB/SmolVLM-500M-Instruct
  type: Vision2Seq
  parameters:
    max_new_tokens: # left empty (None), so the default Hugging Face value is used
    repetition_penalty:
    temperature: 0.7
    top_k: 2
    top_p: 0.6
- model_name: name of the model you want to use (can be anything; SmolVLM in the example above)
- model_id: Hugging Face model id
- type: can be ImageText2Text or Vision2Seq; for custom types, check gemma3n.py and others in the ./vlm/models/* folder
- parameters: this field is optional, but if present, add at least one parameter (e.g. top_p, temperature, etc.)
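The field rules above can be sketched as a small check on an already parsed models.yaml. This is only an illustration, not the project's loader: the function names `check_model_entry` and `set_parameters` are hypothetical, the dictionary stands in for what a YAML parser would return for the example entry, and dropping blank parameters is an assumed way of letting the Hugging Face defaults apply.

```python
# Sketch only: the project's real loader may differ.
ALLOWED_TYPES = {"ImageText2Text", "Vision2Seq"}  # custom types are not covered here


def check_model_entry(name, entry):
    """Return a list of problems found in one models.yaml entry."""
    if not isinstance(entry, dict):
        return [f"{name}: entry must be a mapping"]
    problems = []
    if not entry.get("model_id"):
        problems.append(f"{name}: missing model_id")
    if entry.get("type") not in ALLOWED_TYPES:
        problems.append(f"{name}: type {entry.get('type')!r} is not a built-in type")
    # parameters is optional, but an empty block is reported
    if "parameters" in entry and not entry["parameters"]:
        problems.append(f"{name}: parameters is present but empty")
    return problems


def set_parameters(entry):
    """Keep only parameters that were given a value (blank YAML values parse as None)."""
    params = entry.get("parameters") or {}
    return {key: value for key, value in params.items() if value is not None}


# The example entry as it would look after parsing the YAML
models = {
    "SmolVLM": {
        "model_id": "HuggingFaceTB/SmolVLM-500M-Instruct",
        "type": "Vision2Seq",
        "parameters": {
            "max_new_tokens": None,
            "repetition_penalty": None,
            "temperature": 0.7,
            "top_k": 2,
            "top_p": 0.6,
        },
    }
}

for model_name, model_entry in models.items():
    print(model_name, check_model_entry(model_name, model_entry), set_parameters(model_entry))
```

Running a check like this before a long experiment catches a misspelled type or an empty parameters block early.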
benchmark.yaml specifies 2 things:
test: specifies which models are going to be used for predictions using the dataset for the task
- the models specified need to be present in the models.yaml file
- the
tasks: specifies the tasks and the system and user prompts to send to the models. It also specifies the dataset for each task.
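The requirement that every model named under test also exists in models.yaml can be checked with a few lines of Python. This is a minimal sketch: `missing_models` is a hypothetical helper, the two inputs stand in for the parsed files, and how model names are actually listed inside benchmark.yaml is an assumption here.

```python
def missing_models(requested, models_config):
    """Return the requested model names that have no entry in models.yaml, in order."""
    return [name for name in requested if name not in models_config]


# Stand-ins for the parsed files; "NotConfigured" is a made-up name for illustration
models_config = {
    "SmolVLM": {"model_id": "HuggingFaceTB/SmolVLM-500M-Instruct", "type": "Vision2Seq"}
}
requested = ["SmolVLM", "NotConfigured"]

unknown = missing_models(requested, models_config)
if unknown:
    print("Not found in models.yaml:", ", ".join(unknown))
```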