This repository is a port of legged_gym from the RSL research group. It contains the code and results of the paper "Benchmarking Potential Based Rewards for Learning Humanoid Locomotion", located here: https://ieeexplore.ieee.org/abstract/document/10160885/
- Create a new Python virtual env with Python 3.6, 3.7 or 3.8 (3.8 recommended)
- Install PyTorch 1.10 with CUDA 11.3:
pip install torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
- Install Isaac Gym
- Download and install Isaac Gym Preview 3 (Preview 2 will not work!) from https://developer.nvidia.com/isaac-gym (extract the zip package and copy the isaacgym folder within the package wherever you want it to live; I prefer the directory with my virtual environment)
- Install the requirements:
cd isaacgym_lib/python && pip install -e .
- Try running an example (you need to execute the examples from the examples directory):
cd examples && python 1080_balls_of_solitude.py
- For troubleshooting, check the docs: isaacgym/docs/index.html
- Clone and initialize this repo
- Clone gpu_gym, then init the submodules:
cd gpu_gym && git submodule init && git submodule update
- Install gpu_rl (PPO implementation)
cd gpu_rl && pip install -e .
- Install gpuGym
cd .. && pip install -e .
- Install WandB for experiment tracking - follow this guide
pip install wandb
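After the installation steps above, a quick sanity check can catch a wrong interpreter or a missing package early. The following is a minimal sketch and not part of the repository: the helper names `python_supported` and `installed_packages` are hypothetical, and the import names `gpu_rl` and `gpugym` are assumptions based on the folder names used above, so adjust them if the packages install under different names.

```python
import importlib.util
import sys

# Python versions listed in the installation steps above.
SUPPORTED_PYTHON = {(3, 6), (3, 7), (3, 8)}

# Import names are assumptions derived from the install steps.
PACKAGES = ("torch", "isaacgym", "gpu_rl", "gpugym", "wandb")


def python_supported(version=None):
    """Return True if (major, minor) is one of the supported versions."""
    version = sys.version_info if version is None else version
    return (version[0], version[1]) in SUPPORTED_PYTHON


def installed_packages(names=PACKAGES):
    """Map each top-level package name to whether it can be located.

    find_spec only locates the package; it does not import it.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Python supported:", python_supported())
    for name, found in installed_packages().items():
        print(f"{name:10s} {'found' if found else 'MISSING'}")
```

Run it inside the activated virtual env; any package reported as MISSING points back to the corresponding install step.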
- Each environment is defined by an env file (legged_robot.py) and a config file (legged_robot_config.py). The config file contains two classes: one containing all the environment parameters (LeggedRobotCfg) and one for the training parameters (LeggedRobotCfgPPo).
- Both env and config classes use inheritance.
- Each non-zero reward scale specified in cfg will add a function with a corresponding name to the list of elements which will be summed to get the total reward.
- Tasks must be registered using task_registry.register(name, EnvClass, EnvConfig, TrainConfig). This is done in envs/__init__.py, but can also be done from outside of this repository.
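The three ideas above (config inheritance, reward terms selected by non-zero scales, and task registration) can be pictured with a toy model. This is a sketch under stated assumptions, not the repository's code: every class here (`ToyRobotCfg`, `ToyHumanoidCfg`, `ToyTrainCfg`, `ToyRobot`, `ToyTaskRegistry`) is hypothetical, the `_reward_<name>` method naming is an assumed convention, and the reward values are placeholders.

```python
# Toy illustration only: none of these classes are the repository's real code.


class ToyRobotCfg:
    """Base environment config (stand-in for LeggedRobotCfg)."""

    class rewards:
        class scales:
            tracking_lin_vel = 1.0
            torques = -0.0001
            collision = 0.0  # zero scale: this reward term is skipped


class ToyHumanoidCfg(ToyRobotCfg):
    """Derived config: override only what differs from the base."""

    class rewards(ToyRobotCfg.rewards):
        class scales(ToyRobotCfg.rewards.scales):
            torques = -0.001


class ToyTrainCfg:
    """Placeholder for the training parameters class."""

    max_iterations = 1000


def class_attributes(cls):
    """Collect public class attributes, including inherited ones."""
    return {k: getattr(cls, k) for k in dir(cls) if not k.startswith("_")}


class ToyRobot:
    """Env that sums every reward term whose scale is non-zero."""

    def __init__(self, cfg):
        self.cfg = cfg
        scales = class_attributes(cfg.rewards.scales)
        # Keep only non-zero scales and look up the method with the
        # corresponding name (assumed convention: "_reward_<name>").
        self.reward_terms = [
            (name, scale, getattr(self, "_reward_" + name))
            for name, scale in scales.items()
            if scale != 0
        ]

    def compute_reward(self):
        return sum(scale * fn() for _, scale, fn in self.reward_terms)

    # Placeholder reward functions returning fixed values.
    def _reward_tracking_lin_vel(self):
        return 0.5

    def _reward_torques(self):
        return 100.0

    def _reward_collision(self):
        return 1.0


class ToyTaskRegistry:
    """Maps a task name to its env class and configs."""

    def __init__(self):
        self._tasks = {}

    def register(self, name, env_class, env_cfg, train_cfg):
        self._tasks[name] = (env_class, env_cfg, train_cfg)

    def make_env(self, name):
        env_class, env_cfg, _ = self._tasks[name]
        return env_class(env_cfg)


registry = ToyTaskRegistry()
registry.register("toy_humanoid", ToyRobot, ToyHumanoidCfg(), ToyTrainCfg())
env = registry.make_env("toy_humanoid")
print([name for name, _, _ in env.reward_terms])
```

In this toy, the derived config changes a single scale and inherits the rest, and the `collision` term never enters the sum because its scale is zero.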
- Train:
python gpugym/scripts/train.py --task=pbrs:humanoid
- To run on CPU add the following arguments: --sim_device=cpu, --rl_device=cpu (sim on CPU and rl on GPU is possible).
- To run headless (no rendering) add --headless.
- Important: To improve performance, once the training starts press v to stop the rendering. You can then enable it later to check the progress.
- The trained policy is saved in gpugym/logs/<experiment_name>/<date_time>_<run_name>/model_<iteration>.pt, where <experiment_name> and <run_name> are defined in the train config.
- The following command line arguments override the values set in the config files:
- --task TASK: Task name.
- --resume: Resume training from a checkpoint
- --experiment_name EXPERIMENT_NAME: Name of the experiment to run or load.
- --run_name RUN_NAME: Name of the run.
- --load_run LOAD_RUN: Name of the run to load when resume=True. If -1: will load the last run.
- --checkpoint CHECKPOINT: Saved model checkpoint number. If -1: will load the last checkpoint.
- --num_envs NUM_ENVS: Number of environments to create.
- --seed SEED: Random seed.
- --max_iterations MAX_ITERATIONS: Maximum number of training iterations.
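The override behaviour described above can be pictured with a small argparse sketch. It is illustrative only and does not reproduce the repository's argument handling: `ToyTrainCfg`, `parse_args` and `apply_overrides` are hypothetical names, the default values are made up, and only four of the listed arguments are included.

```python
import argparse


class ToyTrainCfg:
    """Hypothetical train config with a few of the fields named above."""

    experiment_name = "humanoid"
    run_name = ""
    max_iterations = 1000
    seed = 1


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # default=None lets us tell "flag not given" apart from a real value.
    parser.add_argument("--experiment_name", type=str, default=None)
    parser.add_argument("--run_name", type=str, default=None)
    parser.add_argument("--max_iterations", type=int, default=None)
    parser.add_argument("--seed", type=int, default=None)
    return parser.parse_args(argv)


def apply_overrides(cfg, args):
    """Copy every argument given on the command line into cfg."""
    for name, value in vars(args).items():
        if value is not None:
            setattr(cfg, name, value)
    return cfg


cfg = apply_overrides(ToyTrainCfg(), parse_args(["--seed", "7", "--run_name", "test"]))
print(cfg.seed, cfg.run_name, cfg.max_iterations)  # prints: 7 test 1000
```

Arguments that are not passed keep their config file value, which is why `max_iterations` stays at the config default in this example.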
- Play a trained policy:
python gpugym/scripts/play.py --task=pbrs:humanoid
- By default the loaded policy is the last model of the last run of the experiment folder.
- Other runs/model iterations can be selected by setting load_run and checkpoint in the train config.
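To make the "last model of the last run" default concrete, here is a minimal sketch of how such a lookup could work on the log layout described earlier. It is not the repository's loader: `resolve_checkpoint` is a hypothetical helper, and it assumes run folder names begin with a timestamp that sorts chronologically.

```python
import re
from pathlib import Path


def resolve_checkpoint(experiment_dir, load_run=-1, checkpoint=-1):
    """Return the path of the model file selected by load_run/checkpoint.

    -1 means "the last one" for either argument.
    """
    experiment_dir = Path(experiment_dir)
    if load_run == -1:
        # Run folders start with <date_time>, so name order is taken as
        # time order (assumes a sortable timestamp format).
        runs = sorted(
            (p for p in experiment_dir.iterdir() if p.is_dir()),
            key=lambda p: p.name,
        )
        if not runs:
            raise FileNotFoundError(f"no runs in {experiment_dir}")
        run_dir = runs[-1]
    else:
        run_dir = experiment_dir / str(load_run)

    if checkpoint == -1:
        # Compare iteration numbers as integers so that
        # model_1000.pt ranks after model_100.pt.
        pattern = re.compile(r"model_(\d+)\.pt")
        matches = (pattern.fullmatch(p.name) for p in run_dir.iterdir())
        iterations = [int(m.group(1)) for m in matches if m]
        if not iterations:
            raise FileNotFoundError(f"no checkpoints in {run_dir}")
        checkpoint = max(iterations)
    return run_dir / f"model_{checkpoint}.pt"
```

For example, `resolve_checkpoint("gpugym/logs/<experiment_name>")` would return the highest-numbered model file in the most recent run folder, with `<experiment_name>` replaced by the value from the train config.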
- If you get the following error:
ImportError: libpython3.8m.so.1.0: cannot open shared object file: No such file or directory
then run:
sudo apt install libpython3.8
- If you get the following error:
RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error?
then try restarting your computer.
- The contact forces reported by net_contact_force_tensor are unreliable when simulating on GPU with a triangle mesh terrain. A workaround is to use force sensors, but the forces are propagated through the sensors of consecutive bodies, resulting in undesirable behaviour. However, for a legged robot it is possible to add sensors to the feet/end effectors only and get the expected results. When using the force sensors, make sure to exclude gravity from the reported forces with sensor_options.enable_forward_dynamics_forces. Example:
from isaacgym import gymapi, gymtorch

# During asset setup: attach one force sensor to each foot body.
sensor_pose = gymapi.Transform()
for name in feet_names:
    sensor_options = gymapi.ForceSensorProperties()
    sensor_options.enable_forward_dynamics_forces = False  # for example gravity
    sensor_options.enable_constraint_solver_forces = True  # for example contacts
    sensor_options.use_world_frame = True  # report forces in world frame (easier to get vertical components)
    index = self.gym.find_asset_rigid_body_index(robot_asset, name)
    self.gym.create_asset_force_sensor(robot_asset, index, sensor_pose, sensor_options)
(...)
# After the sim is created: wrap the sensor tensor once.
sensor_tensor = self.gym.acquire_force_sensor_tensor(self.sim)
self.gym.refresh_force_sensor_tensor(self.sim)
force_sensor_readings = gymtorch.wrap_tensor(sensor_tensor)
# 4 sensors per env, 6 values per sensor (force xyz, torque xyz); keep the forces.
self.sensor_forces = force_sensor_readings.view(self.num_envs, 4, 6)[..., :3]
(...)
# Every step: refresh, then threshold the vertical (z) force to detect contact.
self.gym.refresh_force_sensor_tensor(self.sim)
contact = self.sensor_forces[:, :, 2] > 1.