Shield_MARL

This repository contains the code for the paper "Safe Multi-Agent Reinforcement Learning via Shielding". Some of the shield files are not provided; they must first be synthesized using Slugs.

Prerequisites:

Python 3.6+
gym
matplotlib 3.0.0
multi-agent gridworld for gridworld experiments
The particle environment for the deep MARL experiments in the paper (modified to be discretized; that code is missing). The CM3 Cooperative Navigation scenario I used can be found here, as can the config files for Cross (with 0.2 instead of 0.15) and Antipodal.
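
Before running any experiment, it can help to confirm that the interpreter and the two packages named above are available. The following is a minimal sketch, not part of the repository: the function name check_prerequisites and the module tuple are assumptions made for illustration, and the check only tests that the modules are importable, not that matplotlib is at version 3.0.0.

```python
import importlib.util
import sys

# Module names assumed from the prerequisites list; gym and matplotlib are the
# only two packages the list names explicitly.
REQUIRED_MODULES = ("gym", "matplotlib")


def check_prerequisites(modules=REQUIRED_MODULES):
    """Return a dict mapping each requirement to True (satisfied) or False."""
    status = {"python>=3.6": sys.version_info >= (3, 6)}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        status[name] = importlib.util.find_spec(name) is not None
    return status


if __name__ == "__main__":
    for requirement, ok in check_prerequisites().items():
        print("{}: {}".format(requirement, "ok" if ok else "MISSING"))
```

The multi-agent gridworld and particle environments are not covered here, since their import names are not stated in this README.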

Code Structure:

CQLearning.py: contains an implementation of CQ-Learning (non-deep) following the Game Theory and Multi-agent Reinforcement Learning book.
GridShield.py: contains the implementation of the composed shielding method. It is currently restricted to 2 agents per shield, but the code can be modified to accommodate more.
QLearning.py: contains an implementation of Q-Learning (also non-deep).
Shield.py: contains the implementation of the centralized shield method.
parsing.py: contains the options for running the code.
smoothing.py and plotting.py: for smoothing traces and plotting the accumulated rewards.
run_exp_CQ.py and run_exp_QL.py: run experiments with CQLearning or QLearning, respectively. For example, python run_exp_CQ.py -n 2 -p 1 -g 1 -i 10 -h 7 -q 0.12 -a 0.8 -w 1 -r 20 -t 1200 -e MIT_test_1 runs a CQLearning experiment with 2 agents (-n 2), shielding active (-p 1), composed shielding (-g 1), 10 iterations (-i 10), 7 runs saved as traces (-h 7), a CQ test threshold of 0.12 (-q 0.12), a learning rate of 0.8 (-a 0.8), a discount of 1 (-w 1), a collision cost of 20 (-r 20), and 1200 episodes (-t 1200). The files it produces carry the experiment name MIT_test_1 (-e MIT_test_1).
/shields: contains the centralized shield files produced by the Slugs tool. The subfolder /shields/grid_shields contains the shields needed by the composed shield method for each map.
/shield_synthesis: contains the files relevant for shield synthesis:
- ControlParser.py (I am not the author of this code): converts Slugs output files to a JSON shield file.
- gen_shield_grid.py: creates composed-shield description files based on the map and the number of agents.
- compile_all_grid.sh: shell script that takes a map name and a number of agents and creates the shield files by calling the relevant code for composed shields.
- grid_world.py and grid_preprocessing.py (based on my co-author Suda Bharadwaj's code): code used by the gen_shield_grid.py script.
- maps.zip: contains the map files (mpe stands for multiagent particle environment).
/maps: contains the map information for each grid map.
/graph_data: contains some traces that were generated when running experiments.
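
The flags in the example command for run_exp_CQ.py can be illustrated with a small parser. This is only a sketch of how such options could be read, not the contents of parsing.py: the long option names (--agents, --traces, and so on) are hypothetical, and the meaning attached to each short flag is inferred from the order of the description in the example. One detail worth noting is that argparse reserves -h for help by default, so a parser that uses -h for something else has to disable the built-in help option.

```python
import argparse


def build_parser():
    # add_help=False frees "-h", which argparse would otherwise reserve for
    # --help; the example command uses -h for the number of saved traces.
    parser = argparse.ArgumentParser(add_help=False)
    # Long option names below are hypothetical, chosen for readability.
    parser.add_argument("-n", "--agents", type=int)           # number of agents
    parser.add_argument("-p", "--shielding", type=int)        # 1 = shielding active (as in the example)
    parser.add_argument("-g", "--composed", type=int)         # 1 = composed shielding (as in the example)
    parser.add_argument("-i", "--iterations", type=int)       # number of iterations
    parser.add_argument("-h", "--traces", type=int)           # runs saved as traces
    parser.add_argument("-q", "--threshold", type=float)      # CQ test threshold
    parser.add_argument("-a", "--alpha", type=float)          # learning rate
    parser.add_argument("-w", "--discount", type=float)       # discount
    parser.add_argument("-r", "--collision-cost", type=float) # collision cost
    parser.add_argument("-t", "--episodes", type=int)         # number of episodes
    parser.add_argument("-e", "--name", type=str)             # experiment name
    return parser


# The argument string from the example command in this README.
EXAMPLE = "-n 2 -p 1 -g 1 -i 10 -h 7 -q 0.12 -a 0.8 -w 1 -r 20 -t 1200 -e MIT_test_1"
args = build_parser().parse_args(EXAMPLE.split())
for option, value in sorted(vars(args).items()):
    print("{} = {}".format(option, value))
```

For the options the scripts actually accept, and their defaults, parsing.py remains the reference.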

Notes:

The code is provided as is and is not actively maintained at the moment. However, I am happy to answer questions.

