valerio98-lab/WorldModel_CarRacing

WorldModel CarRacing implementation for Car Racing environment

Installation and Running:

Installation:

Clone the repository to your local machine:

git clone https://github.com/valerio98-lab/WorldModel_CarRacing.git

Install dependencies:

pip install -r requirements.txt

Running the Code:

The system can be run in two modes: training (-t) or evaluation (-e). By default, training covers both the VAE and the MDN layer and runs on CUDA if available.

Training mode:

python main.py -t <save_model_path> [--only_vae true|false]
  • -t or --train: Activates training mode.
  • --save_model_path: Optional; path where the trained model is saved (default is model.pt).
  • --only_vae: Optional; choose "true" to train only the VAE model (default is "false").

Example:

python main.py -t /home/model_output.pt --only_vae true 

Evaluation mode:

python main.py -e --save_model_path <save_model_path> --model_path <model_file> --model_type <vae|mdn> [--render]
  • -e or --evaluate: Activates evaluation mode.
  • --save_model_path: (Required) A placeholder path (required by the parser, even if not used for evaluation).
  • --model_path: (Required) Path to the model file to be loaded.
  • --model_type: (Required) Specify the type of model to evaluate; choices are "vae" or "mdn".
  • --render: Optional flag to render the environment during evaluation.

In evaluation mode you must specify the model type: use vae if the model was trained with the VAE only, or mdn if it was trained with the MDN as well.

Example:

python main.py -e dummy_path --model_path /home/valerio/Desktop/WorldModel_CarRacing/model_vae.pt --model_type vae --render

Disclaimer

When training starts, the system automatically generates a dataset of 300 episodes and trains both the VAE and the MDN. These 300 episodes have proven sufficient to obtain a reasonably well-trained agent. Note: these values are hardcoded (specifically in policy.py) and can be modified if needed (e.g. for a quick preliminary test).

[Images: Only VAE, Full model]

World Model Overview

  1. Dataset Generation

    • The CarRacing dataset was generated through episodic rollouts using random action sampling, with lazy loading to efficiently manage memory.
    • The original dataset (CarRacingDataset) was then transformed into a latent dataset (LatentDataset) for training the MDN-RNN.
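The rollout step above can be sketched as follows. This is a minimal illustration, not the repository's code: `ToyEnv` is a hypothetical stand-in for the real environment, the action layout (steering in [-1, 1], gas and brake in [0, 1]) is assumed to match the continuous CarRacing action space, and laziness is modeled with a generator that produces one episode at a time. The repository's CarRacingDataset may implement lazy loading differently, for example by reading stored episodes from disk on demand.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for the CarRacing environment (not the real API)."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self._t = 0

    def reset(self):
        self._t = 0
        return [0.0, 0.0]  # dummy observation

    def step(self, action):
        steer, gas, brake = action
        self._t += 1
        obs = [float(self._t), steer]
        reward = gas - brake
        done = self._t >= self.episode_length
        return obs, reward, done


def sample_random_action(rng):
    # Assumed layout: steering in [-1, 1], gas in [0, 1], brake in [0, 1].
    return (rng.uniform(-1.0, 1.0), rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0))


def generate_episodes(env, n_episodes, rng):
    """Yield episodes one at a time so only one is held in memory."""
    for _ in range(n_episodes):
        obs = env.reset()
        episode = []
        done = False
        while not done:
            action = sample_random_action(rng)
            next_obs, reward, done = env.step(action)
            episode.append((obs, action, reward))
            obs = next_obs
        yield episode
```

Because `generate_episodes` is a generator, no rollout is executed until the caller asks for the next episode, which keeps memory use bounded by the size of a single episode.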
  2. VAE Training

    • A Variational Autoencoder (VAE) was trained to compress environmental observations into a latent space using the reparameterization trick.
    • The loss function combined reconstruction loss and KL-divergence.
    • Key parameters: 10,000 episodes, 1,000 frames per episode, 10 epochs, and a learning rate of 1e-4.
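The two ingredients named above, the reparameterization trick and a loss made of a reconstruction term plus a KL term, can be written in a few lines. The sketch below uses plain Python lists instead of tensors so it runs without dependencies. The squared-error reconstruction term and the `beta` weight are assumptions made for illustration; the README does not say which reconstruction loss or weighting the repository uses.

```python
import math
import random


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), where sigma = exp(logvar / 2)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def kl_divergence(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


def reconstruction_loss(x, x_hat):
    # Assumption: summed squared error; the actual loss may differ.
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def vae_loss(x, x_hat, mu, logvar, beta=1.0):
    # beta is a hypothetical weighting factor, not taken from the repository.
    return reconstruction_loss(x, x_hat) + beta * kl_divergence(mu, logvar)
```

Sampling through `mu + sigma * eps` keeps the randomness in `eps`, so in a tensor framework the gradients can flow through `mu` and `logvar`.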
  3. MDN-RNN Training

    • Using latent vectors produced by the VAE, an MDN-RNN (LSTM combined with a Mixture Density Network) was trained to model the temporal dynamics within the latent space.
    • Training utilized teacher forcing, with 25 epochs, a batch size of 32, and a learning rate of 1e-4.
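To make the MDN objective concrete, here is a dependency-free sketch of the negative log-likelihood of a one-dimensional Gaussian mixture, together with the way teacher forcing pairs each ground-truth latent with the next one as its target. The function names and the scalar (one latent dimension) setup are assumptions made for illustration; the repository's MDN-RNN works on batched tensors and may parameterize the mixture differently.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def mdn_nll(target, logits, mus, log_sigmas):
    """Negative log-likelihood of `target` under a 1-D Gaussian mixture.

    logits are unnormalized mixture weights; they are normalized with a
    log-softmax before being combined with each component's log-density.
    """
    log_norm = log_sum_exp(logits)
    terms = []
    for logit, mu, log_sigma in zip(logits, mus, log_sigmas):
        log_pi = logit - log_norm
        z = (target - mu) / math.exp(log_sigma)
        log_gauss = -0.5 * z * z - log_sigma - 0.5 * math.log(2.0 * math.pi)
        terms.append(log_pi + log_gauss)
    return -log_sum_exp(terms)


def teacher_forcing_pairs(latents, actions):
    """Pair each true (z_t, a_t) with the true z_{t+1} as the prediction target."""
    return [((latents[t], actions[t]), latents[t + 1]) for t in range(len(latents) - 1)]
```

With teacher forcing, the network always receives the recorded latent at step t as input rather than its own previous prediction, so errors do not compound during training.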
  4. Controller Training

    • A simple linear layer was implemented as the controller, making decisions based on the latent state.
    • Two approaches were tested:
      • A VAE-only approach, using parallelized CMA-ES, which proved more stable and faster.
      • A full model approach (VAE + MDN) trained sequentially, which showed higher performance variability but occasional high rewards.
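The controller idea can be illustrated with a linear map from the latent state to actions whose parameters are tuned by a gradient-free search. The sketch below is deliberately not CMA-ES: it is a simple elitist evolution strategy with a fixed step size, no covariance adaptation, and no parallel evaluation, used here only as a stand-in to show the optimization loop. All names, the `tanh` squashing, and the hyperparameters are assumptions for illustration.

```python
import math
import random


def linear_controller(params, latent, n_actions):
    """Compute tanh(W z + b); params stores one row of weights plus a bias per action."""
    dim = len(latent)
    actions = []
    for k in range(n_actions):
        row = params[k * (dim + 1):(k + 1) * (dim + 1)]
        # zip stops after `dim` weights; the last entry of the row is the bias.
        pre = row[-1] + sum(w * z for w, z in zip(row, latent))
        actions.append(math.tanh(pre))
    return actions


def simple_es(fitness, dim, population=16, elite=4, sigma=0.5, generations=30, seed=0):
    """Elitist evolution strategy (a simplified stand-in, not CMA-ES)."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    best_params, best_fit = list(mean), fitness(mean)
    for _ in range(generations):
        candidates = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                      for _ in range(population)]
        ranked = sorted(candidates, key=fitness, reverse=True)
        elites = ranked[:elite]
        # Move the search mean to the average of the best candidates.
        mean = [sum(c[i] for c in elites) / elite for i in range(dim)]
        top_fit = fitness(elites[0])
        if top_fit > best_fit:
            best_params, best_fit = list(elites[0]), top_fit
    return best_params, best_fit
```

In the real setting, `fitness` would run one or more episodes with the candidate parameters and return the cumulative reward; since each candidate is evaluated independently, this is the step that lends itself to parallelization.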
  5. Final Notes

    • A more complete report on this work is available in the repository (world_modl_report).
