Trying out different RL algorithms on a simple windy grid-world for practice.
---------------------
| | | | | | | | | | |
---------------------
| | | | | | | | | | |
---------------------
| | | | | | | | | | |
---------------------
|*| | | | | | |g| | | ---> (*) -> Current Position, (g) -> Goal Position
---------------------
| | | | | | | | | | |
---------------------
| | | | | | | | | | |
---------------------
| | | | | | | | | | |
---------------------
0 0 0 1 1 1 2 2 1 0 ---> Wind value for each column, applied in the positive y direction
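The wind displaces the agent along the y axis on every step, on top of the chosen action. The library's internal dynamics are not reproduced here, so the snippet below is only a sketch of how that displacement could work, assuming "positive y" means toward row 0 and that positions are clipped at the grid boundary:

# Hypothetical sketch of the wind dynamics, not the library's actual code.
wind = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]  # one value per column, as above

def apply_wind(row, col):
    # The wind in the agent's current column pushes it toward row 0,
    # clipped so the agent never leaves the grid.
    return max(0, row - wind[col]), col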
Clone this repository and import the environment as follows (a complete example appears after the parameter list):

from gridworld import gridworld_agent
env = gridworld_agent(r, c, wind_vector, start, goal, action_mode)
r             Grid rows (int)
c             Grid columns (int)
wind_vector   Wind values, Python list of length c
start         Start position, Python list [m, n] with 0 <= m < r and 0 <= n < c
goal          Goal position, Python list [p, q] with 0 <= p < r and 0 <= q < c
action_mode   Either "king" (8-direction movement) or "std" (4-direction movement)
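For example, the environment drawn above (7 rows, 10 columns, start at row 3 column 0, goal at row 3 column 7, counting rows from the top) can be created like this, where positions are given as [row, column]:

from gridworld import gridworld_agent

wind = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]
env = gridworld_agent(7, 10, wind, [3, 0], [3, 7], "std")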
env.reset()
Resets the environment to its initial state for a new episode.
env.printenv()
Renders the environment for visualisation.
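A quick way to check the setup is to reset and render once; this assumes printenv draws the ASCII grid shown above:

env.reset()
env.printenv()  # shows * at the start position and g at the goal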
state, reward, done = env.step(action)
Takes one step. Valid action values range from 0-3 for "std" and 0-7 for "king". Returns (and updates) the current position, the reward obtained for taking that action, and done, which is True if the goal is reached and False otherwise.
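Putting it all together, the sketch below trains a tabular SARSA agent on the environment. The hyperparameters, the hardcoded start state, and the treatment of the returned position as a hashable [row, column] pair are illustrative assumptions, not part of the library:

import random
from collections import defaultdict
from gridworld import gridworld_agent

wind = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]
env = gridworld_agent(7, 10, wind, [3, 0], [3, 7], "std")

n_actions = 4                    # actions 0-3 in "std" mode
alpha, gamma, eps = 0.5, 1.0, 0.1
Q = defaultdict(float)           # Q-table keyed by (state, action)

def choose(state):
    # Epsilon-greedy action selection over the tabular Q-values.
    if random.random() < eps:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[(state, a)])

for episode in range(200):
    env.reset()
    state = (3, 0)               # the start position passed to the constructor
    action = choose(state)
    done = False
    while not done:
        next_state, reward, done = env.step(action)
        next_state = tuple(next_state)   # make the position hashable
        next_action = choose(next_state)
        # SARSA update: bootstrap from the action actually taken next.
        target = reward + (0 if done else gamma * Q[(next_state, next_action)])
        Q[(state, action)] += alpha * (target - Q[(state, action)])
        state, action = next_state, next_action

Calling env.printenv() inside the loop makes it easy to watch the learned policy improve across episodes.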