Toyota Motor Europe NV/SA and its affiliates retain all intellectual property and proprietary rights in and to this software, related documentation and any modifications thereto. Any use, reproduction, disclosure or distribution of this software and related documentation without an express license agreement from Toyota Motor Europe NV/SA is strictly prohibited.

Robotic Task Ambiguity Resolution via Natural Language Interaction

arXiv | website

This repository provides the source code for the paper:

Robotic Task Ambiguity Resolution via Natural Language Interaction
Eugenio Chisari, Jan Ole von Hartz, Fabien Despinoy, Abhinav Valada

Please cite the paper as follows:

@article{chisari2025robotic,
  title={Robotic Task Ambiguity Resolution via Natural Language Interaction},
  author={Chisari, Eugenio and von Hartz, Jan Ole and Despinoy, Fabien and Valada, Abhinav},
  journal={IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2025}
}

Installation

conda create --name ambres_env python=3.10
conda activate ambres_env
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
pip install -e .

SAM2

bash bash/download_weights.sh
pip install git+https://github.com/facebookresearch/sam2.git

Dataset download

Download the dataset from this link, then extract it under the folder ~/datasets/.
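For example, assuming the download is a single archive named ambres_dataset.tar.gz (the actual filename may differ):

mkdir -p ~/datasets
tar -xzf ambres_dataset.tar.gz -C ~/datasets/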

Pre-trained checkpoints

To download the pre-trained checkpoints, run:

bash bash/download_ckpts.sh

Evaluation

Note that inference requires a GPU with at least 20 GB of memory. To reproduce all results reported in Table I of the paper, run the following evaluations:

python scripts/evaluate_ambres.py --env real --model_type prompt
python scripts/evaluate_ambres.py --env sim --model_type prompt
python scripts/evaluate_ambres.py --env real --model_type finetune
python scripts/evaluate_ambres.py --env sim --model_type finetune
python scripts/evaluate_knowno.py --env real
python scripts/evaluate_knowno.py --env sim

Fine-tuning your own models

Note that training requires GPUs with at least 46 GB of memory each. To start a training run, use the following command:

deepspeed --include localhost:0,1 scripts/train.py --env real
  • The --include localhost:0,1 flag limits training to GPUs 0 and 1. Leave it out to use all available GPUs. See this doc for more information.

  • The --env flag accepts sim or real and determines which dataset the model is trained on: either the simplified simulated images or the real-world images.

  • If another distributed training run is already active on the same machine, you will need to choose a different master port:

    deepspeed --master_port 12344 --include localhost:0,1 scripts/train.py
  • Note that to evaluate your own models, you will have to change the checkpoint name in the CKPT class at ambiguity_resolution/ambres/__init__.py, as sketched below.
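A minimal sketch of the kind of edit involved, with hypothetical attribute names (the actual layout of the CKPT class may differ):

# ambiguity_resolution/ambres/__init__.py (attribute names are hypothetical)
class CKPT:
    # Checkpoint names resolved by the evaluation scripts;
    # replace the defaults with the names of your own training runs.
    finetune_real = "my_finetune_real_run"
    finetune_sim = "my_finetune_sim_run"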

License

See the LICENSE file for details about the license under which this code is made available.
