I created this repo as a study aid for Deep Reinforcement Learning, following my enrollment in the corresponding Udacity Nanodegree.
My goal is to write a series of Jupyter Notebooks as if they were blog entries, each containing code for solving a particular task in the OpenAI Gym environments, detailed explanations of how the code works, and short 'take-home' notes on the underlying Reinforcement Learning concepts.
Documenting and explaining my code as if others were going to read it has been instrumental in structuring and clarifying my thinking. I have also tried to use the notebooks to briefly recap the underlying RL concepts and to remind myself of the context and motivation for each approach, rather than focusing exclusively on implementation. I feel this exercise has given me a deeper understanding of RL concepts by forcing me to take a big-picture view of how everything I've learnt so far fits together. If you're enrolled in the same (or a similar) program, I would encourage you to avoid passively consuming course content, however high its quality, and to adopt this proactive approach instead. Though slower, it has proved more rewarding in my experience, because it leads to a deeper and more consolidated understanding.
See below for a table of contents of the environments and concepts tackled so far. I will keep it updated as I add notebooks.
Environment | Concepts | Date (DD/MM/YYYY) |
---|---|---|
FrozenLake | TD methods (Q-learning) | 21/02/2020 |
Blackjack | Monte Carlo Control | 23/02/2020 |
LunarLander/Box2D in general | Deep Q Networks | 25/02/2020 |
Unity Banana | Deep Q Networks, with detailed walkthrough | 03/03/2020 |
CartPole | Policy Networks, Hill-Climbing, Steepest Ascent | 10/03/2020 |
MountainCar | 2-layer Policy Networks, Cross-Entropy Method | 13/03/2020 |
Pong! from pixels | Policy gradients, future rewards, REINFORCE | 24/03/2020 |
Reacher | Policy gradients, Experience Replay, DDPG | 01/04/2020 |
Tennis | Multi-agent RL, Policy Gradients, Experience Replay, DDPG | 07/04/2020 |