This library converts StarCraft II replays into a sparse unit count table.
It helps convert replay data into a format usable for data analysis
or for training reinforcement learning bots.
It parses a replay into a series of events and then builds a table in which you can
find the count of every unit, building or resource of each player at each tick of
the game.
Available functionality:
- Extract general replay data, such as the map name and player nicknames, using the ReplayData class.
- Create a dict of lists representing each player's build order over the course of the game.
Limitations to consider:
- The only available game mode is 1v1.
- Made for game version 5.0.11
- Python <= 3.9 (the latest sc2replay library is available in Python version 3.9)
- Packages listed in requirements.txt
- Clone the repository by running
git clone https://github.com/dvarkless/starcraft2_replay_parse.git
- Create a python virtual environment:
cd starcraft2_replay_parse
python -m venv venv
- If you are using Linux or Mac:
source ./venv/bin/activate
- If you are using Windows:
./venv/Scripts/activate.ps1
- Install packages:
pip install -r requirements.txt
- Prepare a dataset, split it into training data, evaluation input and evaluation answers:
The example code for running sample replays is provided in the process_replay.py file.
Here is the minimum code needed to use the library in the intended way.
from replay_tools import BuildOrderData, ReplayData
path = 'replay_name.SC2Replay'
replay_data = ReplayData().parse_replay(path).as_dict()
print(replay_data['map_name'])
>>> Gresvan
GAME_INFO_PATH = "./data/game_info.csv"
MAX_REPLAY_LEN = 30 * 60 * 16 # 30 minutes
STEP_LEN = 16 * 2 # write game state every ingame 2 seconds
replay_transformer = BuildOrderData(MAX_REPLAY_LEN, STEP_LEN, GAME_INFO_PATH)
for player_dict in replay_transformer.yield_unit_counts(replay_data):
    print(player_dict)  # There are two players in this game,
                        # so the method yields two dicts
>>> {
>>> 'Zergling': [0, 0, 0, 0, 0 ....],
>>> 'Drone': [12, 12, 13, 13, 13 ....],
>>> ...
>>> }
>>> ...
out = ReplayData.as_dict()
print(out)
out = {
"processed_on": datetime.timestamp,
"replay_name": str,
"expansion": str, # ['WoL', 'HotS', 'Lotv']
"frames": int, # Number of ticks the game has
"mode": str, # '1v1'
"map": str, # Hash value of the map
"map_name": str, # Map name (prefix and suffix excluded)
"matchup": str, # ZvT, ZvP, etc...
"winners": List[str], # Nickname of the winner
"losers": List[str], # Nickname of the loser(s)
"stats_names": str, # Players_data dict keys
"players": str, # Player nicknames
"players_hash": str, # Hash of two players nicknames,
# helps find identical replays
# with different names
"players_data": dict{ # Players info
'id': int,
'full_name':str, # name as in stats_names
'race': str,
'league': int, # 0-8, 0-unranked, 8-GM
'url': str, # link to battle.net account
'is_winner': bool,
},
"stats": dict { ... }, # Events
"league": int, # Min players league: 0-8, 0-unranked, 8-GM
}
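As an illustration of how the metadata fields above might be used, the following sketch filters a list of replay dicts by matchup and minimum league. The `select_replays` helper and the sample records are hypothetical and not part of the library; only the `mode`, `matchup` and `league` keys come from the structure above. It also assumes that `matchup` strings are written in a consistent order (for example always `ZvT`, never `TvZ`), which this README does not confirm.

```python
from typing import Any, Dict, List


def select_replays(
    replays: List[Dict[str, Any]], matchup: str, min_league: int = 0
) -> List[Dict[str, Any]]:
    """Keep 1v1 replays of one matchup whose lowest player league is high enough.

    Hypothetical helper: it only reads the "mode", "matchup" and "league" keys.
    """
    return [
        r
        for r in replays
        if r["mode"] == "1v1"
        and r["matchup"] == matchup
        and r["league"] >= min_league
    ]


# Hand-written sample records (not real parser output), reduced to the keys used above.
sample = [
    {"replay_name": "a.SC2Replay", "mode": "1v1", "matchup": "ZvT", "league": 6},
    {"replay_name": "b.SC2Replay", "mode": "1v1", "matchup": "ZvT", "league": 2},
    {"replay_name": "c.SC2Replay", "mode": "1v1", "matchup": "PvP", "league": 7},
]

selected = select_replays(sample, "ZvT", min_league=5)
print([r["replay_name"] for r in selected])
```

In real use, the records would come from `ReplayData().parse_replay(path).as_dict()` rather than from hand-written dicts.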
out = BuildOrderData.yield_unit_counts(replay_data)
print(out)
# out = Generator[dict[...]]
player_1_dict = next(out)
The dict represents a sparse table whose columns are defined in the game_info.csv
file. It contains units and buildings of every race, regardless of the players' races.
Additionally, there are counters for minerals and vespene.
The length of each list is either the game length in ticks // parsing_step
or the maximum game length, whichever is smaller.
Columns:
- Units
- Buildings
- Upgrades
- Resources available
Rows:
- Each position holds the count of entities of the given type at the given tick.
- Each list in the dict has the same length.
- Type: int >= 0
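To make the column and row layout concrete, here is a minimal sketch that turns one player's dict of lists into one record per recorded step. The `iter_rows` helper is hypothetical and not part of the library. It assumes that position `i` in each list corresponds to tick `i * step_len`, and it uses 16 ticks per in-game second, as implied by the `30 * 60 * 16 # 30 minutes` constant in the usage example.

```python
from typing import Dict, Iterator, List

TICKS_PER_SECOND = 16  # assumption taken from "30 * 60 * 16 # 30 minutes"


def iter_rows(
    player_dict: Dict[str, List[int]], step_len: int
) -> Iterator[Dict[str, float]]:
    """Yield one dict per recorded step, with an added "game_seconds" key.

    Hypothetical helper; assumes position i was recorded at tick i * step_len.
    """
    lengths = {len(counts) for counts in player_dict.values()}
    if len(lengths) > 1:
        raise ValueError("all columns are expected to have the same length")
    n_rows = lengths.pop() if lengths else 0
    for i in range(n_rows):
        row: Dict[str, float] = {"game_seconds": i * step_len / TICKS_PER_SECOND}
        for name, counts in player_dict.items():
            row[name] = counts[i]
        yield row


# Hand-written sample columns, not real parser output.
example = {"Drone": [12, 12, 13], "Zergling": [0, 0, 2]}
for row in iter_rows(example, step_len=32):
    print(row)
```

A row-oriented view like this can be convenient when writing the table to CSV or feeding it to a model one time step at a time.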
player_1_dict = {
terran_unit: list[int],
...,
zerg_upgrade: list[int],
...,
protoss_building: list[int],
...,
minerals_available: list[int],
vespene_available: list[int],
}
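Because the table covers every race, most columns for a given player are expected to stay at zero. The sketch below shows one possible way to reduce such a dict to the step index at which each column first becomes non-zero, which is a rough view of the build order. The `first_appearance` helper and the sample data are hypothetical and not part of the library.

```python
from typing import Dict, List


def first_appearance(player_dict: Dict[str, List[int]]) -> Dict[str, int]:
    """Map each column name to the first list index with a count above zero.

    Columns that are zero for the whole game are omitted from the result.
    """
    result: Dict[str, int] = {}
    for name, counts in player_dict.items():
        for index, count in enumerate(counts):
            if count > 0:
                result[name] = index
                break
    return result


# Hand-written sample columns, not real parser output.
example_player = {
    "Drone": [12, 12, 13],
    "Zergling": [0, 0, 2],
    "Marine": [0, 0, 0],
    "minerals_available": [50, 0, 75],
}
print(first_appearance(example_player))
```

The returned indices are positions in the lists, not ticks; multiplying by the parsing step would give ticks only if the first position corresponds to tick 0, which is an assumption.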
This project uses the SC2Reader tool to parse replays.
Distributed under the MIT License. See LICENSE.txt for more information.