phase-based Observations, Rewards, Coupling Ablation


aileenliao03/ORCAgym_aileen

 
 


ORCAgym

This repository is a bare-bones version of gpuGym for our ICRA 2024 paper Learning Emergent Gaits with Decentralized Phase Oscillators: on the role of Observations, Rewards, and Feedback. gpuGym is itself a port of legged_gym from the good folk over at RSL (code, website, paper).
A more up-to-date repo is available at pkGym. It includes the oscillator implementation but will eventually diverge from the paper results, and it does not include some of the analysis scripts. We recommend that repo for new projects. Compared to RSL's legged_gym, this fork has been substantially refactored, which makes exploring different implementations easier.
Feel free to open issues both about the code and the paper.

Reading the Code

The overall code is organized in a similar fashion to legged_gym, in case you're familiar with it. We recommend first reading the config file, gym/envs/mini_cheetah/mini_cheetah_osc_config.py. In particular, search for:

  • control to see/change PD gains and control frequency
  • osc to see/change oscillator parameters
  • policy to see
    • neural network details
    • actor and critic observations
    • actions
    • weights for reward weights
  • algorithm for PPO hyperparameters

Next, see the file gym/envs/mini_cheetah/mini_cheetah_osc.py for implementation details on the rewards (each defined as def _reward_XXX(), where XXX matches the name of the corresponding weight in the config file). Oscillator-related code is in the functions compute_osc_slope() (see equation 5 in the paper) and _step_oscillators(). Gait-similarity rewards were used for internal evaluation but not in the paper.
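The link between reward weights and `_reward_XXX()` methods can be illustrated with a minimal sketch. Only the `_reward_` prefix convention comes from the text above; the class, the weight names, and the reward bodies are hypothetical and are not taken from mini_cheetah_osc.py.

```python
class ToyEnv:
    """Hypothetical environment; only the `_reward_` prefix mirrors the repo."""

    # Hypothetical weights; in the repo these live in the config file.
    reward_weights = {"tracking": 2.0, "torques": -0.5}

    def __init__(self, velocity_error, torque):
        self.velocity_error = velocity_error
        self.torque = torque

    def _reward_tracking(self):
        # Illustrative only: largest when the velocity error is zero.
        return 1.0 - abs(self.velocity_error)

    def _reward_torques(self):
        # Illustrative only: squared torque, turned into a penalty
        # by its negative weight.
        return self.torque ** 2

    def compute_reward(self):
        # Look up each reward method by the name of its weight.
        total = 0.0
        for name, weight in self.reward_weights.items():
            reward_fn = getattr(self, "_reward_" + name)
            total += weight * reward_fn()
        return total


env = ToyEnv(velocity_error=0.25, torque=2.0)
print(env.compute_reward())
```

With a lookup like this, adding a reward term means adding one weight entry and one method with the matching name. A weight without a matching method raises `AttributeError`.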

Reproducing Paper Results

Evaluating trained ORC policies from the paper

  • All scripts are in "gym/scripts"
  • You can either download the pre-trained policies here, or train them locally by running python train_ORC_all.py (in "gym/scripts"), which trains every ORC combination once. The full set of policies used in the paper comes from 10 such runs. To reproduce it, change line 48 to all_toggles = 10*['000', '010', '011', '100', '101', '110', '111']. This takes a long time, roughly overnight on an RTX 4090.
    • If you download the policies, make a directory "logs" inside ORCAgym, and copy all subfolders ("ORC_xxx_FullSend") into it
  • cd into the ORCAgym/gym/scripts folder
  • python play_ORC.py --task=mini_cheetah_osc --ORC_toggle=<xxx>
    • By default, the loaded policy is the last model of the last run in the experiment folder corresponding to the ORC_toggle
    • Other runs or model iterations can be selected by setting --load_run and --checkpoint.
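Before running play_ORC.py, it can help to check that the "logs" directory contains a folder for every toggle. The sketch below assumes that "xxx" in "ORC_xxx_FullSend" is the toggle string passed to --ORC_toggle; the helper names are hypothetical and not part of the repo.

```python
import tempfile
from pathlib import Path

# One repetition of the toggle list quoted above for train_ORC_all.py.
ALL_TOGGLES = ["000", "010", "011", "100", "101", "110", "111"]


def expected_folder(toggle):
    # Assumption: the folder name embeds the toggle as ORC_<toggle>_FullSend.
    return f"ORC_{toggle}_FullSend"


def missing_policy_folders(logs_dir, toggles=ALL_TOGGLES):
    """Return the toggles whose experiment folder is absent from logs_dir."""
    logs_dir = Path(logs_dir)
    return [t for t in toggles if not (logs_dir / expected_folder(t)).is_dir()]


# Demonstration on a temporary directory standing in for ORCAgym/logs.
with tempfile.TemporaryDirectory() as tmp:
    for toggle in ["000", "111"]:
        (Path(tmp) / expected_folder(toggle)).mkdir()
    demo_missing = missing_policy_folders(tmp)

print(demo_missing)
```

Against a real checkout, the call would be `missing_policy_folders("ORCAgym/logs")`, with the path adjusted to wherever the repo was cloned.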

Running disturbance rejection experiments (Section IV.C)

  • To generate the data used to evaluate disturbance rejection, run python ORC_protocol_pushball.py.
    • You may want to reduce the number of robots. By default we simulate 1800 robots (as in the paper), which is quite slow.
  • To evaluate the results, run python ORC_pushball_analysis.py

Training

python gym/scripts/train.py --task=mini_cheetah_osc

  • To run on CPU, add the following arguments: --sim_device=cpu, --rl_device=cpu (running the sim on CPU and RL on GPU is also possible).
  • To run headless (no rendering), add --headless.
  • Important: when not running headless, press v once training starts to stop the rendering and improve performance. You can re-enable it later to check the progress.
  • The trained policy is saved in <gpuGym>/logs/<experiment_name>/<date_time>_<run_name>/model_<iteration>.pt, where <experiment_name> and <run_name> are defined in the train config.
  • The following command line arguments override the values set in the config files:
    • --task TASK: Task name, in our case mini_cheetah_osc.
    • --resume: Resume training from a checkpoint.
    • --experiment_name EXPERIMENT_NAME: Name of the experiment to run or load.
    • --run_name RUN_NAME: Name of the run.
    • --load_run LOAD_RUN: Name of the run to load when resume=True. If -1: will load the last run.
    • --checkpoint CHECKPOINT: Saved model checkpoint number. If -1: will load the last checkpoint.
    • --num_envs NUM_ENVS: Number of environments to create.
    • --seed SEED: Random seed.
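The meaning of -1 for --load_run and --checkpoint can be sketched as a small path resolver over the log layout described above. This is an illustrative reimplementation, not the repo's loading code. It assumes that run folder names sort chronologically, and the demo folder names are made up.

```python
import re
import tempfile
from pathlib import Path


def resolve_checkpoint(experiment_dir, load_run=-1, checkpoint=-1):
    """Return the path to model_<iteration>.pt for the requested run."""
    experiment_dir = Path(experiment_dir)
    if load_run == -1:
        # Assumption: run folders begin with a sortable date_time prefix,
        # so the lexicographically last folder is the most recent run.
        runs = sorted(p for p in experiment_dir.iterdir() if p.is_dir())
        if not runs:
            raise FileNotFoundError(f"no runs in {experiment_dir}")
        run_dir = runs[-1]
    else:
        run_dir = experiment_dir / str(load_run)

    if checkpoint == -1:
        # Compare iterations numerically so model_1000.pt beats model_200.pt.
        iterations = [
            int(m.group(1))
            for p in run_dir.glob("model_*.pt")
            if (m := re.fullmatch(r"model_(\d+)\.pt", p.name))
        ]
        if not iterations:
            raise FileNotFoundError(f"no checkpoints in {run_dir}")
        checkpoint = max(iterations)
    return run_dir / f"model_{checkpoint}.pt"


# Demonstration with made-up run names in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    exp = Path(tmp) / "demo_experiment"
    demo_runs = {
        "2024-01-01_10-00-00_a": [0, 200],
        "2024-01-02_09-30-00_b": [200, 1000],
    }
    for run, iterations in demo_runs.items():
        (exp / run).mkdir(parents=True)
        for it in iterations:
            (exp / run / f"model_{it}.pt").touch()
    latest = resolve_checkpoint(exp)
    latest_name = (latest.parent.name, latest.name)
    pinned_name = resolve_checkpoint(exp, "2024-01-01_10-00-00_a", 0).name

print(latest_name, pinned_name)
```

The numeric comparison matters because a plain string sort would place model_200.pt after model_1000.pt.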

Installation

  1. Create a new Python virtual env with Python 3.8
  2. Clone and initialize this repo
    • clone gpu_gym
  3. Install the GPU Gym requirements:
    pip install -r requirements.txt
  4. Install Isaac Gym
    • Download and install Isaac Gym Preview 4 (Preview 3 should still work) from https://developer.nvidia.com/isaac-gym
      • Extract the zip package
      • Copy the isaacgym folder, and place it in a new location
    • Install the isaacgym/python requirements
    cd <isaacgym_location>/python
    pip install -e .
  5. Run an example to validate
    • Run the following command from within isaacgym
    cd <isaacgym_location>/python/examples
    python 1080_balls_of_solitude.py
    • For troubleshooting, check the docs at isaacgym/docs/index.html
  6. Install gpuGym
    pip install -e .
  7. Use WandB for experiment tracking - follow this guide
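After these steps, a quick way to see whether the main packages are visible to the active virtual env is to look them up without importing them. The module list below is an assumption: isaacgym and wandb come from the steps above, and torch is presumed to be installed through requirements.txt.

```python
import importlib.util


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# Assumed module names; adjust to match your requirements.txt.
REQUIRED = ["isaacgym", "torch", "wandb"]

missing = find_missing(REQUIRED)
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("All listed packages were found.")
```

The lookup uses `find_spec` and not `import` because Isaac Gym expects to be imported before torch, and a lookup avoids any import-order side effects.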

CUDA Installation for Ubuntu 22.04 and above

Inspired by: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html

  1. Ensure kernel headers and dev packages are installed
    sudo apt-get install linux-headers-$(uname -r)
  2. Install nvidia-cuda-toolkit
    sudo apt install nvidia-cuda-toolkit
  3. Remove the outdated signing key
    sudo apt-key del 7fa2af80
  4. Install CUDA
    # ubuntu2004 or ubuntu2204 or newer.
    DISTRO=ubuntu2204
    # Likely what you want, but check if you need others
    ARCH=x86_64
    wget https://developer.download.nvidia.com/compute/cuda/repos/$DISTRO/$ARCH/cuda-keyring_1.0-1_all.deb
    sudo dpkg -i cuda-keyring_1.0-1_all.deb
    sudo apt-get update
    sudo apt-get install cuda
  5. Reboot, and you're good.
    sudo reboot
  6. Use these commands to check your installation
    nvidia-smi
    nvcc --version

Troubleshooting docs: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#ubuntu-installation
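The final check can also be scripted from Python, for example inside the training virtual env. This is a minimal sketch that only tests whether the two tools named above are on PATH; the helper name is hypothetical.

```python
import shutil
import subprocess


def tool_status(tools=("nvidia-smi", "nvcc")):
    """Map each tool name to its resolved path, or None if not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


status = tool_status()
for tool, path in status.items():
    print(f"{tool}: {path or 'not found on PATH'}")

# Print the compiler version only when nvcc is actually available.
if status["nvcc"]:
    result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
    print(result.stdout.strip())
```

If a tool is reported as missing after the reboot, the troubleshooting docs linked above are the place to start.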

