Homework and report for the Reinforcement Learning course at Sorbonne University, taught by Sylvain Lamprier
We tested several algorithms on a range of environments. One example is the environment "LunarLanderContinuous-v2", in which the agent learns to land a spacecraft. The environment is modeled after the video game Lunar Lander. The following picture shows a screenshot of this environment.
We tested the Soft Actor-Critic (SAC) agent on this environment and obtained the following reward curve:
The reward curve indicates that the algorithm works very well on this environment.
More details on this and the other algorithms can be found in the report.
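Per-episode rewards are usually noisy, so a reward curve is often easier to read after smoothing. The following is a minimal sketch of a moving average over episode returns. The function name `moving_average`, the window size, and the sample values are assumptions made for illustration; they are not taken from the report or from measured results.

```python
from collections import deque


def moving_average(returns, window=100):
    """Return the mean of the last `window` values at each episode."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # holds at most `window` latest returns
    running_sum = 0.0
    smoothed = []
    for value in returns:
        # When the buffer is full, the oldest value is about to be dropped.
        if len(recent) == window:
            running_sum -= recent[0]
        recent.append(value)
        running_sum += value
        smoothed.append(running_sum / len(recent))
    return smoothed


# Hypothetical per-episode returns, for illustration only (not measured data).
episode_returns = [-200.0, -100.0, 0.0, 100.0, 200.0]
print(moving_average(episode_returns, window=2))
# prints [-200.0, -150.0, -50.0, 50.0, 150.0]
```

A larger window gives a smoother curve but reacts more slowly to changes in the agent's performance.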