simple_dqn_tf2.py Doesn't allow for multiple return actions #56

MrDomoArigato · 2023-02-08T06:12:39Z

If you change the n_actions parameter, the model fails when it tries to learn:

164/164 [==============================] - 0s 998us/step
164/164 [==============================] - 0s 887us/step
[[[nan nan nan ... nan nan nan]]

 [[nan nan nan ... nan nan nan]]

 [[nan nan nan ... nan nan nan]]

 ...

 [[nan nan nan ... nan nan nan]]

 [[nan nan nan ... nan nan nan]]

 [[nan nan nan ... nan nan nan]]] [   0    1    2 ... 5245 5246 5247] [list([2, 2, 5]) list([2, 1, 6]) list([3, 0, 6]) ... list([3, 0, 7])
 list([3, 8, 5]) list([3, 0, 3])]
Traceback (most recent call last):
  File "main.py", line 30, in <module>
    agent.learn()
  File "simple_dqn_tf2.py", line 95, in learn
    self.gamma * np.max(q_next, axis=1)*dones
ValueError: operands could not be broadcast together with shapes (5248,82) (5248,)

This definitely has to do with the shape of the stored action. I'm just not sure how to fix it.

5248 = n_actions * batch_size
82 = n_actions
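
For anyone who wants to reproduce the broadcast error in isolation, here is a minimal NumPy sketch. It assumes q_next comes back with shape (N, 1, n_actions), which is what the printed array and the (5248,82) in the error message suggest but which I have not confirmed against the repo code. The helper max_q_per_sample and the small sizes are hypothetical, made up for illustration. It only shows one way to get a single max Q-value per row; it does not explain the NaNs or why the batch dimension grew to 5248.

```python
import numpy as np

def max_q_per_sample(q_next):
    """Hypothetical helper: one max Q-value per row, ignoring extra axes."""
    q_next = np.asarray(q_next)
    # Collapse everything after the first axis so the max runs over all
    # action values that belong to a row: (N, 1, A) -> (N, A) -> (N,)
    flat = q_next.reshape(q_next.shape[0], -1)
    return flat.max(axis=1)

# Small stand-in sizes (assumed); the issue has 5248 rows and 82 actions.
n_rows, n_actions = 4, 3
q_next = np.arange(n_rows * n_actions, dtype=float).reshape(n_rows, 1, n_actions)
dones = np.array([1.0, 1.0, 0.0, 1.0])
rewards = np.ones(n_rows)
gamma = 0.99

# Same expression as in the traceback: max over axis=1 only removes the
# singleton axis, leaving (4, 3), which cannot broadcast against (4,).
try:
    np.max(q_next, axis=1) * dones
    naive_failed = False
except ValueError as err:
    naive_failed = True
    print("naive version fails:", err)

# With one value per row, the shapes line up: (4,) * (4,)
q_target = rewards + gamma * max_q_per_sample(q_next) * dones
print(q_target.shape)
```

If the extra axis comes from how the states or actions are stored in the replay buffer, fixing it there is probably cleaner than reshaping inside learn().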
