A2C

This project is an implementation of A2C, the synchronous counterpart of the asynchronous algorithm A3C. Its improvement over single-agent policy-gradient algorithms such as actor-critic or REINFORCE is that multiple agents (instead of one) act on copies of the environment, each in its own way: they follow a similar policy but gather different experiences, which gives a better estimate of the policy's value. This reduces the variance of the updates; on the other hand, the algorithm requires more computation power.
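The variance argument can be illustrated with a toy experiment. The sketch below is not taken from this repository: the noisy "environment" and the names `rollout_return`, `batch_estimate`, and `estimate_variance` are assumptions made up for illustration. It compares how much an estimate of the expected return fluctuates when it comes from one environment copy versus the average over 16 independent copies.

```python
import random
import statistics


def rollout_return(rng, steps=20):
    """Return of one episode in a hypothetical noisy environment.

    Each step pays a reward of 1 plus Gaussian noise (an assumption for this toy).
    """
    return sum(1.0 + rng.gauss(0.0, 1.0) for _ in range(steps))


def batch_estimate(rng, n_envs):
    """Average the returns collected by n_envs independent environment copies."""
    return statistics.fmean([rollout_return(rng) for _ in range(n_envs)])


def estimate_variance(n_envs, trials=2000, seed=0):
    """Variance of the averaged return estimate across many repeated trials."""
    rng = random.Random(seed)
    estimates = [batch_estimate(rng, n_envs) for _ in range(trials)]
    return statistics.pvariance(estimates)


var_single = estimate_variance(n_envs=1)
var_multi = estimate_variance(n_envs=16)
print(f"variance with 1 env:   {var_single:.2f}")
print(f"variance with 16 envs: {var_multi:.2f}")
```

With independent copies, the variance of the averaged estimate shrinks roughly in proportion to the number of copies, while the cost grows with it, which is the trade-off described above.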

The A3C paper is available at https://arxiv.org/pdf/1602.01783.pdf. A2C is the same algorithm, but instead of doing everything asynchronously, we take the experiences from all agents, put them into one batch, and do one update (which experiments have shown to have little to no effect on results compared to A3C).
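A minimal sketch of the batching step, in plain Python, might look like the following. It is not the code from `Runner.py`: the rollout dictionary layout and the names `discounted_returns` and `build_batch` are hypothetical, chosen only to show how per-agent experiences could be merged into one batch of bootstrapped returns and advantages before a single update.

```python
def discounted_returns(rewards, dones, bootstrap_value, gamma=0.99):
    """Bootstrapped discounted returns for one agent's rollout.

    dones[t] is True when the episode ended after step t, in which case
    nothing from later steps is propagated back into step t.
    """
    returns = []
    running = bootstrap_value  # critic's value estimate for the state after the rollout
    for reward, done in zip(reversed(rewards), reversed(dones)):
        running = reward + gamma * running * (0.0 if done else 1.0)
        returns.append(running)
    returns.reverse()
    return returns


def build_batch(rollouts, gamma=0.99):
    """Merge the rollouts of all agents into one flat batch (hypothetical layout)."""
    batch = {"states": [], "actions": [], "returns": [], "advantages": []}
    for rollout in rollouts:
        rets = discounted_returns(
            rollout["rewards"], rollout["dones"], rollout["bootstrap_value"], gamma
        )
        batch["states"].extend(rollout["states"])
        batch["actions"].extend(rollout["actions"])
        batch["returns"].extend(rets)
        # Advantage = return minus the critic's value estimate for that state.
        batch["advantages"].extend(r - v for r, v in zip(rets, rollout["values"]))
    return batch


# Two agents, two steps each; gamma=0.5 keeps the arithmetic easy to follow.
rollouts = [
    {"states": ["a0", "a1"], "actions": [0, 1], "rewards": [1.0, 1.0],
     "dones": [False, False], "values": [0.0, 0.0], "bootstrap_value": 4.0},
    {"states": ["b0", "b1"], "actions": [1, 0], "rewards": [0.0, 2.0],
     "dones": [True, False], "values": [1.0, 1.0], "bootstrap_value": 2.0},
]
batch = build_batch(rollouts, gamma=0.5)
print(batch["returns"])     # [2.5, 3.0, 0.0, 3.0]
print(batch["advantages"])  # [2.5, 3.0, -1.0, 2.0]
```

The single synchronous update would then be one gradient step of the actor and critic losses computed over this whole batch, rather than one asynchronous step per agent as in A3C.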

This project was done in the past, therefore the library versions are unknown (I will add all the version requirements after some experimentation).


DanielKarasek/A2C
