v0.0.5, new examples, many bug fixes.

@zafarali zafarali released this 31 Jan 07:37
· 7 commits to master since this release
  1. Add the Cakeworld MDP from the action gap paper: Bellemare et al. 2015 #16
  2. Add the two-circle MDP for off-policy learning from Zhang et al. 2019 #16
  3. Change plotting functions so that the MDP is plotted in the same orientation as in the character matrix file #17
  4. Fix the transition matrix builder: if the agent was manually placed in a wall state, there was a non-zero probability of it escaping. This was fixed in #17, along with associated tests. Walls now behave as absorbing states. As a side effect, you should also see that value functions for wall states are zero.
  5. Fix a really nasty bug in utils.convert_one_hot_to_int that caused an action to be downcast. This function now returns a plain int rather than an np.int8: 59c2064
  6. All tests are now passing again.
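
To illustrate item 4, here is a minimal sketch of why an absorbing wall state ends up with a zero value. It does not use this library's transition matrix builder. The three-state MDP, the reward vector, and the discount factor are all made up for the example, and it assumes that no reward is given in the wall state.

```python
import numpy as np

# Hypothetical 3-state chain under a fixed policy: states 0 and 1 are free
# cells, state 2 is a wall. These numbers are illustrative only.
P = np.array([
    [0.0, 1.0, 0.0],  # state 0 always moves to state 1
    [1.0, 0.0, 0.0],  # state 1 always moves back to state 0
    [0.0, 0.0, 1.0],  # wall: absorbing, all probability mass stays put
])
r = np.array([1.0, 0.0, 0.0])  # assumption: zero reward in the wall state
gamma = 0.9

# Each row must be a probability distribution.
assert np.allclose(P.sum(axis=1), 1.0)

# Policy evaluation in closed form: V = (I - gamma * P)^{-1} r
V = np.linalg.solve(np.eye(3) - gamma * P, r)
print(V)
```

Because the wall row places all of its mass on itself and collects no reward, its value satisfies V = 0 + gamma * V, which only holds for V = 0. If a wall row leaked probability to a neighbouring cell, the wall would inherit a non-zero value from that neighbour.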
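
For item 5, the sketch below shows one way a one-hot vector can be converted to a plain Python int. The function `one_hot_to_int` is a hypothetical stand-in written for this note, not the actual body of utils.convert_one_hot_to_int, and the commit may have fixed the issue differently.

```python
import numpy as np

def one_hot_to_int(one_hot):
    """Hypothetical stand-in: index of the hot entry, as a plain int."""
    one_hot = np.asarray(one_hot)
    # np.argmax returns a NumPy integer scalar; int() turns it into a
    # built-in Python int, so no narrow NumPy dtype reaches the caller.
    return int(np.argmax(one_hot))

# Even when the one-hot vector itself is stored as int8, the result is a
# built-in int.
action = one_hot_to_int(np.array([0, 0, 0, 1], dtype=np.int8))
print(action, type(action))

# For reference, np.int8 can only represent values up to this maximum.
int8_max = int(np.iinfo(np.int8).max)
print(int8_max)
```

Returning a built-in int keeps later index arithmetic, such as combining a state index with an action index, out of the narrow range of np.int8.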