Exercises of the reinforcement learning course from Hugging Face

chavicoski/Deep-RL-course

This repo contains my implementations of the exercises from the Deep Reinforcement Learning Course by Hugging Face.

MY IMPLEMENTATION

The original course material is implemented in notebooks for Google Colab. Since I am lucky enough to have a good computer and prefer to write my experiments as scripts, I have implemented the exercises locally using Docker and Python scripts.

REQUIREMENTS

As I am using a Docker image to handle the dependencies of each unit/exercise, the only requirements are:

  • Docker

  • Docker Compose

Note: Docker Compose is not a strict requirement, since you could run the containers with Docker alone, but it is a convenient tool for handling the container setup.

UNITS

You can find the exercise for each unit in its respective folder. Here is a brief summary of each one:

  • Unit 1: A general introduction to Reinforcement Learning, where you can learn the basic concepts. In the exercise you can train an agent controlling a simple spaceship to land on the moon.

  • Unit 1 Bonus: In this extra unit you can train Huggy (the dog) to fetch a stick in a Unity environment using the ML-Agents toolkit.

  • Unit 2: Q-Learning model explanation. In the exercise you can train a Q-Learning agent (implemented from scratch) to play in two different environments: Frozen Lake v1 and Taxi v3.

  • Unit 3: Deep Q-Learning model explanation. In the exercise you can train a Deep Q-Learning agent to play Atari games using the game frames as input to a Convolutional Neural Network (CNN).

  • Unit 4: A review of Policy-based methods. In the exercise you have to implement the Policy Gradient algorithm using PyTorch to play in two different environments.

  • Unit 5: Introduction to the fundamentals of the ML-Agents toolkit. In the exercise you have to train agents for two Unity environments: SnowballTarget (created at Hugging Face) and Pyramids (created by the Unity team).

  • Unit 6: This unit explains a new algorithm called Actor-Critic, which combines Value-Based and Policy-Based methods. In the exercise you can train agents for two robotics-based environments.

  • Unit 7: An introduction to Multi-Agent Reinforcement Learning (MARL). In the exercise you have to train a MARL system to play soccer in a 2vs2 match. The environment is a Unity environment and the model is trained using ML-Agents.

  • Unit 8 part 1: Explanation of the Proximal Policy Optimization (PPO) algorithm. In the exercise you implement a PPO agent from scratch using PyTorch to play the Lunar Lander environment.

  • Unit 8 part 2: Application of the Proximal Policy Optimization (PPO) algorithm in a VizDoom environment. The exercise uses the Health Gathering Supreme environment from VizDoom, and uses the Sample Factory library (focused on efficiency) to train the models with a high-throughput pipeline.
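
To give a flavour of what "implemented from scratch" means for the Q-Learning agent of Unit 2, here is a minimal sketch of tabular Q-Learning with an epsilon-greedy policy. It is not the code from the unit folder: the one-dimensional corridor environment, the function names and the hyperparameter values are all hypothetical stand-ins chosen so the example runs with only the Python standard library (the actual exercise uses Frozen Lake v1 and Taxi v3).

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties at random so an all-zero table does not favour action 0.
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def q_update(q, state, action, reward, next_state, done, lr=0.1, gamma=0.95):
    """Apply one tabular Q-Learning update to the table `q`, in place."""
    # The TD target bootstraps from the best next action unless the episode ended.
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += lr * (target - q[state][action])


def corridor_step(state, action, n_states):
    """Hypothetical toy environment: action 0 moves left, action 1 moves right.

    Reaching the rightmost cell ends the episode with reward 1.0.
    """
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def train(n_states=5, episodes=500, epsilon=0.2, max_steps=100, seed=0):
    """Train a Q-table on the toy corridor and return it."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = epsilon_greedy(q[state], epsilon, rng)
            next_state, reward, done = corridor_step(state, action, n_states)
            q_update(q, state, action, reward, next_state, done)
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    for s, row in enumerate(train()):
        print(s, [round(v, 3) for v in row])
```

After training, the "right" action should carry the higher value in every non-terminal cell, which is the greedy policy you would expect for this corridor.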

HOW TO RUN

In each unit folder you can find a README with full instructions for running the code. Each unit includes the code to train the models and push them to the Hugging Face Hub.
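
As a rough idea of what pushing a trained model to the Hub can look like, here is a generic sketch based on the `huggingface_hub` client. It is an assumption for illustration, not the code used in the unit folders (the course libraries often ship their own push helpers): the function name `push_model_folder`, the repo id and the folder path are hypothetical placeholders.

```python
from pathlib import Path


def push_model_folder(api, repo_id, folder):
    """Upload a local folder of model files to a model repo on the Hub.

    `api` is expected to be a `huggingface_hub.HfApi` instance (or any
    object exposing the same `create_repo` and `upload_folder` methods).
    """
    if not Path(folder).is_dir():
        raise FileNotFoundError(f"Model folder not found: {folder}")
    # Create the repo if it does not exist yet; do nothing if it does.
    api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)
    # Upload every file in the folder as a single commit.
    api.upload_folder(folder_path=str(folder), repo_id=repo_id, repo_type="model")


# Example usage (requires `pip install huggingface_hub` and a prior login;
# the repo id and folder below are placeholders):
#
#     from huggingface_hub import HfApi
#     push_model_folder(HfApi(), "your-username/your-model", "./trained_model")
```

Passing the client in as an argument keeps the helper easy to try with a stub object before touching the real Hub.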
