Preparing Game Data for StarCraft 2 (2026)

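    from pysc2.env import sc2_env
    from pysc2.agents import random_agent
    from pysc2.lib import features

    env = sc2_env.SC2Env(
        map_name="AbyssalReef",
        players=[sc2_env.Agent(sc2_env.Race.random)],
        # Assumption: recent pysc2 releases require an agent_interface_format;
        # the 84/64 resolutions are illustrative values.
        agent_interface_format=features.AgentInterfaceFormat(
            feature_dimensions=features.Dimensions(screen=84, minimap=64)),
        step_mul=8,  # game steps advanced per agent step
    )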

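Goal: Predict race & opening from first 3 minutes.

Extraction Code

    import sc2reader
    import pandas as pd

    # load_replay is sc2reader's loader for a single replay file
    replay = sc2reader.load_replay("replay.SC2Replay")

    # Cumulative Zergling count at each checkpoint (seconds of game time).
    # The checkpoints run to 300 s; stop at 180 for a strict 3-minute window.
    build_order_vector = []
    for second in [60, 120, 180, 240, 300]:
        units_at_time = [
            e for e in replay.events
            if e.second <= second and e.name == 'UnitBornEvent'
        ]
        build_order_vector.append(
            len([u for u in units_at_time if 'Zergling' in u.unit_type_name])
        )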
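The counting loop above can be generalized to several unit types at once. The following is a minimal sketch, not sc2reader's API: `BornEvent` is a hypothetical stand-in that carries only the `second` and `unit_type_name` fields used above, and the unit list and checkpoints are illustrative assumptions. It matches unit names exactly, whereas the loop above uses a substring test.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence


@dataclass
class BornEvent:
    """Hypothetical stand-in for a unit-born event (not an sc2reader class)."""
    second: int
    unit_type_name: str


def opening_vector(
    events: Iterable[BornEvent],
    unit_types: Sequence[str],
    checkpoints: Sequence[int] = (60, 120, 180),
) -> List[int]:
    """Cumulative count of each unit type at each checkpoint, flattened.

    Layout: [type0@t0, type0@t1, ..., type1@t0, ...].
    """
    events = list(events)
    vector = []
    for unit_type in unit_types:
        for t in checkpoints:
            vector.append(
                sum(1 for e in events
                    if e.second <= t and e.unit_type_name == unit_type)
            )
    return vector


# Synthetic events for illustration only.
sample = [
    BornEvent(12, "Drone"),
    BornEvent(45, "Drone"),
    BornEvent(95, "Zergling"),
    BornEvent(95, "Zergling"),
    BornEvent(170, "Zergling"),
    BornEvent(200, "Zergling"),  # after the 180 s cutoff, so never counted
]
features = opening_vector(sample, ["Drone", "Zergling"])
print(features)
```

With real replays, the same function could be fed the `UnitBornEvent` objects filtered from `replay.events`, since it only reads those two attributes.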


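Here’s a comprehensive, step-by-step guide to preparing StarCraft 2 game data for machine learning, replay analysis, or build order mining.

1. Understanding SC2 Data Sources

There are three primary sources of game data: replay files, live game state through the API, and match history from the Blizzard API.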

    import pandas as pd

    actions = []
    for event in replay.events:
        # Skip events that carry no timestamp
        if hasattr(event, 'second'):
            actions.append({
                'time_sec': event.second,
                'event_type': event.name,
                'player': getattr(event, 'player', None),
                'unit_type': getattr(event, 'unit_type_name', None),
                'position': getattr(event, 'location', None),
            })

    df = pd.DataFrame(actions)

Create a time-aligned representation: every 5 seconds, record game state (supply, workers, army, buildings, resources).
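The time-aligned representation can be sketched as follows. This is a minimal illustration under stated assumptions: the input is a list of `(second, field, delta)` tuples, a hypothetical intermediate format rather than anything sc2reader emits, and the tracked fields and 20-second horizon are chosen only to keep the example short.

```python
from typing import Dict, Iterable, List, Tuple

# (second, field, delta): hypothetical change records, e.g. (7, "workers", 1)
Change = Tuple[int, str, int]


def snapshot_every(
    changes: Iterable[Change],
    fields: List[str],
    step: int = 5,
    end: int = 20,
) -> List[Dict[str, int]]:
    """Return one cumulative state row per `step` seconds, from 0 through `end`."""
    ordered = sorted(changes, key=lambda c: c[0])
    state = {name: 0 for name in fields}
    rows = []
    i = 0
    for t in range(0, end + 1, step):
        # Apply every change that happened at or before this tick.
        while i < len(ordered) and ordered[i][0] <= t:
            _, name, delta = ordered[i]
            state[name] += delta  # assumes every changed field is listed in `fields`
            i += 1
        rows.append({"time_sec": t, **state})
    return rows


# Synthetic changes for illustration only.
changes = [
    (0, "workers", 12),
    (0, "supply", 12),
    (7, "workers", 1),
    (7, "supply", 1),
    (13, "buildings", 1),
    (19, "workers", 1),
    (19, "supply", 1),
]
rows = snapshot_every(changes, ["supply", "workers", "army", "buildings"])
for row in rows:
    print(row)
```

Each row is a plain dictionary, so the list can be passed to `pd.DataFrame(rows)` to get one row per 5-second tick.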

Example save:

    df.to_parquet('sc2_actions.parquet', compression='snappy')

If you control the game (bot development), live state can be read through the pysc2 environment instead of from replay files.

| Source | Format | Use Case |
|--------|--------|----------|
| Replay files (`.SC2Replay`) | Binary / MPQ archive | Full game state reconstruction, player actions, timings |
| Live game state (via API) | Protobuf messages (via the SC2 API) | Real-time bot development, decision-making models |
| Match history (Blizzard API) | JSON | Win rates, map stats, ladder ranking |
