DiffVLA builds upon NAVSIM to enable
diffusion-driven end-to-end policy learning for language-conditioned autonomous driving.
This repository integrates pseudo-simulation from NAVSIM with diffusion-based trajectory generation.
We keep working on E2E models and will keep updating the latest work list here:
- [FlowDrive: Energy Flow Field for End-to-End Autonomous Driving]
- [IRL-VLA: Training an Vision-Language-Action Policy via Reward World Model]
The image backbone is from https://github.com/youngwanLEE/vovnet-detectron2. Please download the VoVNet-99 weights from that repository.
Please update the following parameters in
navsim/agents/diffvla/diffvla_config.py:
| Parameter | Description |
|---|---|
| pdm_pkl_path | Path to your PDM GT files. |
| num_voc | Number of trajectories in the vocabulary (vocabulary size). |
| plan_anchor_path | Path to the anchor files used for trajectory planning. |
| with_vlm | Whether to use meta-action commands from the VLM. |
Example:

```python
pdm_file_path = "diffvla_data_exp/pdm_scores_8192"
num_voc = 8192
plan_anchor_path = "diffvla_data_exp/planning_vb/trajectory_anchors_8192.npy"
with_vlm = True
```
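A quick sanity check is to confirm that the anchor file actually contains `num_voc` anchors. Below is a minimal sketch, assuming the `.npy` file stores one anchor trajectory per vocabulary entry along its first axis (the exact layout may differ in your setup):

```python
import numpy as np

# Hypothetical sanity check: plan_anchor_path should hold one anchor
# trajectory per vocabulary entry; the exact array layout may differ.
num_voc = 8192
anchors = np.load("diffvla_data_exp/planning_vb/trajectory_anchors_8192.npy")
assert anchors.shape[0] == num_voc, (
    f"anchor file holds {anchors.shape[0]} anchors, but num_voc is {num_voc}"
)
```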
In the new version, the trajectory head has been replaced with a reward-based transformer head.
Please ensure that:
The active trajectory head is switched to navsim/agents/diffvla/trajectory_head_reward.py
The model definition file navsim/agents/diffvla/transfuser_model.py has been updated to use the transformer-based trajectory head.
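For reference, the switch in transfuser_model.py might look roughly like the sketch below; the class name `TrajectoryHeadReward` and its constructor argument are assumptions and may not match the actual code.

```python
# Hypothetical sketch only: the class name and constructor argument are
# assumptions; check navsim/agents/diffvla/trajectory_head_reward.py for
# the real definitions.
from navsim.agents.diffvla.trajectory_head_reward import TrajectoryHeadReward

# Inside the model's __init__, replace the old diffusion-style head:
# self._trajectory_head = TrajectoryHead(config)
self._trajectory_head = TrajectoryHeadReward(config)  # reward-based Transformer head
```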
Make sure all paths in configuration files are absolute or relative to the project root.
Cached or generated training data should be excluded via .gitignore.
Run the following commands to set up the environment, cache the dataset, and start training:
```bash
bash diffvla_env.sh
bash cache_dataset_diffvla.sh
bash train_diffvla.sh
```
Edit train_diffvla.sh to specify the dataset cache path, GPUs, and hyperparameters. On 8 × H20 GPUs (a China-market variant of the H100 that is more than 2× slower), training takes roughly 45 min × 30 (about 22.5 hours) to finish.
To evaluate the agent, you must first create the metric cache. Run the cache script:
bash metric_cache.sh
You can choose the evaluation dataset, e.g. navtest or navhard two-stage; the default is navhard two-stage.
We offer a fast evaluation script because the original NAVSIM test scripts are too slow (roughly 30 minutes vs. 2 hours).
Run the evaluation script:
bash fast_test.sh
Edit fast_test.sh to set CACHE_PATH, CKPT, EXP_NAME, EXP_ID, and AGENT.
Evaluation results will be saved in the results/ directory by default.
Please note that EPDM scores from the fast test script differ slightly from the official evaluation script; the difference in the related scores is less than 0.25%.
The released version has some modifications compared to the paper on arXiv:
- The trajectory head has been updated from DiffusionDrive to a self-developed Transformer-based trajectory head.
- A reward loss derived from multiple EPDM sub-metrics has been introduced to guide training.
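As an illustration of the second point, below is a minimal sketch of how a reward target could be assembled from EPDM sub-metrics and used to supervise per-vocabulary scores. The sub-metric names, weights, and loss form are assumptions for illustration, not the exact implementation.

```python
import torch
import torch.nn.functional as F

def reward_loss(pred_logits: torch.Tensor, sub_metrics: dict) -> torch.Tensor:
    """Hypothetical reward loss over the trajectory vocabulary.

    pred_logits: (B, num_voc) raw scores from the reward-based trajectory head.
    sub_metrics: dict of (B, num_voc) tensors in [0, 1] with EPDM sub-metric
                 values per vocabulary trajectory; the names and weighting
                 below are illustrative only.
    """
    # Combine sub-metrics into a single reward target in [0, 1],
    # with hard penalties multiplied and soft metrics averaged.
    target = (
        sub_metrics["no_at_fault_collision"]
        * sub_metrics["drivable_area_compliance"]
        * (0.5 * sub_metrics["ego_progress"] + 0.5 * sub_metrics["time_to_collision"])
    )
    # Regress predicted scores toward the combined EPDM-style reward.
    return F.binary_cross_entropy_with_logits(pred_logits, target)
```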
The current DiffVLA training requires EPDM scores for the trajectory vocabulary defined in your model. We provide a Python script to generate EPDM scores and save them as .pkl files (named with tokens for each training sample).
Run: python navsim/misc/gen_multi_trajs_pdm_score_ours.py
Please update parameters inside navsim/misc/gen_multi_trajs_pdm_score_ours.py, such as metric cache paths, output directories, and vocabulary file paths.
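For orientation, the parameters to adjust might look like the following; the variable names and paths below are illustrative assumptions, and the actual script may expose them differently (for example via command-line arguments).

```python
# Illustrative parameter block; the actual variable names and paths in
# gen_multi_trajs_pdm_score_ours.py may differ.
metric_cache_path = "exp/metric_cache"            # cache produced by metric_cache.sh
output_dir = "diffvla_data_exp/pdm_scores_8192"   # per-token .pkl files are written here
vocab_path = "diffvla_data_exp/planning_vb/trajectory_anchors_8192.npy"  # trajectory vocabulary
num_workers = 16                                  # parallel workers for score generation
```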