How to use

Steve edited this page Feb 8, 2024 · 2 revisions

Getting started

Two example scripts are provided: scripts/train.py runs an RL training pipeline, and scripts/play.py loads an existing policy and runs it. To try them out, run:

python scripts/train.py --task=mini_cheetah --headless

and once this has finished,

python scripts/play.py --task=mini_cheetah

You should be able to move the mini-cheetah robot around with the WASD keys; see the terminal output for instructions.
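If you want to chain the two steps, the sketch below launches training and then play from Python. It assumes it is run from the repository root; build_command and train_then_play are hypothetical helper names, not part of the repo, and only the --task and --headless flags shown above are used.

```python
import subprocess
import sys
from typing import List


def build_command(script: str, task: str, headless: bool = False) -> List[str]:
    """Build the argument list for one of the example scripts (hypothetical helper)."""
    cmd = [sys.executable, script, f"--task={task}"]
    if headless:
        cmd.append("--headless")
    return cmd


def train_then_play(task: str = "mini_cheetah") -> None:
    """Train headless, then start play only if training exited cleanly."""
    # check=True raises CalledProcessError on a non-zero exit code,
    # so play.py is not started after a failed training run.
    subprocess.run(build_command("scripts/train.py", task, headless=True), check=True)
    subprocess.run(build_command("scripts/play.py", task), check=True)
```

Calling train_then_play() is equivalent to running the two commands above one after the other.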

When training, the neural network model is saved in logs/<experiment_name>/<run_name>/, along with a snapshot of the entire repo. By default, the <experiment_name> is loaded from the task config file, and the <run_name> is generated with the date and time. When playing, the most recently created model matching the task and experiment is loaded.
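The directory convention described above can be illustrated with a short sketch. This is not the repo's implementation: make_run_dir and latest_run_dir are hypothetical helpers, the timestamp format is an assumption, and "most recent" is approximated here by directory modification time.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def make_run_dir(log_root: Path, experiment_name: str,
                 run_name: Optional[str] = None) -> Path:
    """Create <log_root>/<experiment_name>/<run_name>/ and return its path."""
    if run_name is None:
        # Assumed timestamp format; the repo's actual format may differ.
        run_name = datetime.now().strftime("%b%d_%H-%M-%S")
    run_dir = log_root / experiment_name / run_name
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir


def latest_run_dir(log_root: Path, experiment_name: str) -> Optional[Path]:
    """Return the most recently modified run directory, or None if there is none."""
    experiment_dir = log_root / experiment_name
    if not experiment_dir.is_dir():
        return None
    runs = [p for p in experiment_dir.iterdir() if p.is_dir()]
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)
```

For example, latest_run_dir(Path("logs"), "<experiment_name>") would pick the newest run folder for that experiment, which mirrors what play does by default.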

The --headless argument runs train (or play) without a visualization. You can also train with visualization, which is much slower, and pause the visualizer by pressing v.

Command line arguments

There are several other arguments. For a complete list, see gym/utils/helpers; the most relevant ones are:

  • --task: Specifies which robot/config to load and run.
  • --headless: Force display off at all times.
  • --resume: Resume training (loads a saved policy).
  • --load_run: Name of the run to load when --resume=True. Defaults to the most recent.
  • --experiment_name: Name of the experiment to run or load. Defaults to the most recent.
  • --record: Record the IsaacGym simulation at real-time speed. Note: play the recording back with VLC, especially on Ubuntu, since some media players don't handle it well.
  • --original_cfg: When loading a policy, use the original config file used to train it instead of the current one.
  • --checkpoint: Saved model checkpoint number. Defaults to the most recent.
  • --run_name: Assign a custom name to the run (used for the save folder name and WandB tracking).
  • --num_envs: Override number of environments to create.
  • --seed: Set the random seed.
  • --max_iterations: Set the number of training iterations.
  • --wandb_project: Enter the name of your project for WandB tracking.
  • --wandb_entity: Enter your WandB entity (username) to track the experiment on your account.
  • --wandb_sweep_id: Enter a WandB sweep ID to continue an existing sweep.
  • --wandb_sweep_config: Enter the name of a JSON config for the WandB sweep.
  • --disable_wandb: Disable WandB logging for debugging.
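To make the shape of these options concrete, here is a minimal argparse sketch covering a subset of the flags listed above. The actual definitions live in gym/utils/helpers and may differ; the types, defaults, and the choice to model --headless and --resume as boolean switches are assumptions made for illustration, and build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser for a subset of the options (types and defaults assumed)."""
    parser = argparse.ArgumentParser(description="Sketch of the train/play options")
    parser.add_argument("--task", type=str, help="Robot/config to load and run")
    parser.add_argument("--headless", action="store_true", help="Force display off")
    parser.add_argument("--resume", action="store_true", help="Load a saved policy")
    parser.add_argument("--load_run", type=str, default=None,
                        help="Run to load when resuming")
    parser.add_argument("--checkpoint", type=int, default=None,
                        help="Saved model checkpoint number")
    parser.add_argument("--num_envs", type=int, default=None,
                        help="Override the number of environments")
    parser.add_argument("--seed", type=int, default=None, help="Random seed")
    parser.add_argument("--max_iterations", type=int, default=None,
                        help="Number of training iterations")
    return parser


args = build_parser().parse_args(["--task=mini_cheetah", "--headless", "--seed", "7"])
print(args.task, args.headless, args.resume, args.seed)
```

In this sketch, None stands in for "not given on the command line", so the caller can fall back to the value in the task config or to the most recent run.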