This blog post introduces the StarCraft 2 Learning Environment (SC2LE) and is written for the StarCraft AI Workshop at the IT University of Copenhagen, January 20th – 21st 2018. This guide relies heavily on existing guides, documentation and papers, and simply points you in good directions.
First of all, if you are new to the game read the introduction on TeamLiquid and play the game yourself.
1. Overview of the SC2LE
SC2LE consists of a number of components. Newer versions of StarCraft II come with a built-in API (also called the StarCraft II API) that can be used to communicate with the game. PySC2 is a client that allows Python code to communicate with this API, and it will be the main focus of this guide. PySC2 is a machine learning framework and does not expose all of the underlying features of the StarCraft II API. Through it, you can get spatial features from the mini-map and the screen as well as non-spatial features such as resources, available_actions, unit positions etc., and you can issue actions that mimic how humans interact with the game.
The figure below shows examples of how agent and human actions differ and how they influence which actions are available in the next frame.
2. Download StarCraft and Replays
First, you need to download and install the game. Also, download the replay packages if you are going to use them.
Getting the Game
On Mac and Windows, you should download the game from Blizzard’s Battle.net. If you have purchased it, make sure it’s downloaded. You can also install their free version of the game. Install the game in the default location. On Windows, that will be C:/Program Files (x86)/StarCraft II/.
On Linux, download the Linux packages instead. Version 3.16.1 seems to be a good choice at the moment. Also, download the map packs.
Getting the Replays
You can download the replay packs here: https://github.com/Blizzard/s2client-proto#linux-packages. The first pack is smaller (2.61 GB zipped, 7.5 GB extracted) and is a good start for small experiments. The second pack is much larger (44.2 GB zipped). Go ahead and grab one or both. I never managed to download pack 2, as the download stalled several times. Place the files in /StarCraftII/Replays.
3. Set Up Your Developer Environment
Make sure you have Python 2 or 3 installed on your machine. You can download Python here https://www.python.org/downloads/.
Make sure that pip is installed. It’s used to install Python packages. Type “pip” (or “pip3” if you use Python 3) in your terminal to check. Otherwise, get it here https://pip.pypa.io/en/stable/installing/. You might need to run “pip” with sudo/administrative rights to install things.
Make sure you have a git client installed.
4. Set Up StarCraft and PySC2
To install PySC2, simply use pip:
sudo pip install pysc2
And to verify that everything is working:
python -m pysc2.bin.agent --map Simple64
If the map could not be found, make sure that the map Simple64 is in StarCraftII/Maps/ somewhere. The maps can be downloaded here.
5. Create a Scripted Bot using PySC2
The random agent in PySC2 is a good starting point. You can launch a game with the --agent argument like this:
python -m pysc2.bin.agent --map Simple64 --agent pysc2.agents.RandomAgent
The RandomAgent is also the default agent.
Notice how observations are retrieved:
obs.observation["available_actions"]
where "available_actions" is the name of the observation being requested. Other examples of observations are:
obs.observation["screen"]["height_map"] # Shows the terrain levels.
obs.observation["screen"]["player_id"] # Who owns the units, with absolute ids.
obs.observation["minimap"]["visibility"] # Which parts of the map are hidden, have been seen, or are currently visible.
obs.observation["minimap"]["camera"] # Which part of the map is visible in the screen layers.
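In PySC2, obs.observation behaves like a dictionary of named (mostly numpy) arrays. As a sketch of how you might pull out the pieces above, here is a small helper that only relies on dictionary-style access, so you can try it with a plain dict as well (the helper name and the layer counts in the example are made up):

```python
def summarize_observation(observation):
    """Summarize a few commonly used entries of a PySC2-style observation.

    `observation` is anything supporting dict-style access, e.g. the
    obs.observation object PySC2 passes to an agent's step() method.
    """
    return {
        "available_actions": list(observation["available_actions"]),
        "screen_layers": len(observation["screen"]),
        "minimap_layers": len(observation["minimap"]),
    }
```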
Actions are performed by returning a function call from the agent:
return actions.FunctionCall(<function_id>, <args>)
E.g. the function id for move_camera is 1, and that function takes a single array argument containing two integers (mini-map coordinates).
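The check-then-call pattern, verifying that a function id is in available_actions before issuing it, can be sketched without PySC2 installed. To keep the sketch self-contained it returns a plain (function_id, args) tuple; in a real agent you would return actions.FunctionCall(function_id, args) instead (pick_action is a hypothetical helper, not part of PySC2):

```python
_NO_OP = 0        # function id for no_op
_MOVE_CAMERA = 1  # function id for move_camera

def pick_action(available_actions, target=(32, 32)):
    """Pick a (function_id, args) pair, falling back to no_op."""
    if _MOVE_CAMERA in available_actions:
        # move_camera takes a single [x, y] mini-map coordinate argument.
        return (_MOVE_CAMERA, [list(target)])
    return (_NO_OP, [])
```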
You can read more about observations and actions in the documentation.
I recommend following Steven Brown’s tutorial Building a Basic PySC2 Agent to get started with a Terran bot in Python that collects resources, builds marines and attacks the opponent. Later in his tutorials, you’ll learn to extend it to use tabular Q-learning.
Better Alternatives for Scripted Bots
The PySC2 interface can be a bit tricky to work with if you are creating a scripted bot, as it is intended for machine learning. Here are some alternatives that give you access to everything in the StarCraft II API.
Python-sc2 (requires Python 3.6+) allows you to use the ‘raw API’, which allows for more abstract actions. It has documentation and a few good examples of some rushing strategies. For some reason, it crashed on my machine, and I have created an issue in the repository. The author claims that it works, so go ahead and see if it works for you (and please comment on this post if it does).
Blizzard’s C++ API is a good option if you want to write a scripted bot. This tutorial (Windows only) by Don Thompson walks you through how to use it with some sample code. If you don’t want to start from scratch, you can expand on the CommandCenter bot created by David Churchill. The Overseer project can be used to analyze the map and do region decomposition.
.NET (C#, F#, Visual Basic)
The Starcraft 2 Client API for .NET allows you to write your bot in C#, F#, or Visual Basic. Similar projects may exist for other languages.
When your bot works, you can upload it to the online ladder. I would suggest that you do a little bit more than what’s in the tutorial before you submit it.
6. Extract Data from Replays
Replay files can be replayed using the following command:
python -m pysc2.bin.play --replay <path-to-replay>
If it doesn’t work, make sure that the StarCraft version in your /StarCraftII/ folder matches the version of the replay files. You also need to have the maps and replays correctly placed in the StarCraftII folder.
pysc2-replay is a simple tool that allows you to extract the game state while playing a replay file. The link points to my own fork of pysc2-replays because I’ve made some fixes.
MSC is a processed data set extracted from the StarCraft 2 replay packages. The data set contains state-action pairs with macromanagement decisions such as when buildings, units, and upgrades were produced. A build order planner could be learned from this dataset using the approach in one of my own papers.
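As a toy illustration of what a build order recovered from such state-action pairs might look like, here is a sketch that keeps only production actions, ordered by frame (the action names and the build_order helper are invented for illustration; MSC's actual encoding differs):

```python
# Actions we treat as production decisions; these names are invented.
PRODUCTION = {"Build_SupplyDepot", "Build_Barracks", "Train_Marine"}

def build_order(events):
    """Extract an ordered build order from (frame, action_name) pairs."""
    return [name for frame, name in sorted(events) if name in PRODUCTION]
```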
7. Debug and Inspect the Observation Object
When you have pysc2-replay running, it is a good opportunity to explore the observation object.
- Open the pysc2-replay project in your IDE (the steps below assume a debugger like PyCharm’s)
- Select the file transform_replay.py
- Set the Script parameters to: --replay "<replay-file>" --agent "ObserverAgent.ObserverAgent" and apply.
- Right click on transform_replay.py and click “Debug ‘transform_replay.py'”
- Open ObserverAgent.py and set a breakpoint on line 5
- Mark ‘time_step.observation’, right click and click on ‘Add to watches’
- Now you can investigate the observation object more closely. Look at the documentation while you do this, so you know what you are looking at.
8. Supervised Learning with Keras
Supervised Learning (SL) is the task of learning to map examples from a data set to classes (classification) or real values (regression). We could thus take an SL approach, learning from examples in the replay data set to imitate human players.
A popular SL approach is to use a multi-layered neural network as the mapping function we are trying to learn. The network can be trained using backpropagation and gradient descent. Training deep (many-layered) neural networks like this is also called Deep Learning.
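As a minimal, self-contained illustration of this kind of training loop, here is logistic regression, essentially a one-layer network, trained with gradient descent in plain numpy (the function name and data are made up for the example):

```python
import numpy as np

def train_logreg(X, y, lr=0.5, epochs=500):
    """Train logistic regression (a one-layer 'network') by gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # forward pass (sigmoid)
        grad = p - y                            # gradient of the log loss
        w -= lr * (X.T @ grad) / len(y)         # gradient descent step on weights
        b -= lr * grad.mean()                   # ... and on the bias
    return w, b
```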
If you are new to deep learning, SL is a good start. I would recommend using Keras as it is a simple high-level framework (you avoid writing a lot of code) in Python. The installation guide and introduction are on their homepage, and they also have good tutorials and examples. This tutorial shows you how to load data from a .csv file. When that works, you can generate .csv files from data in replays.
You could try to predict the winner of a game or the player’s next action.
You might want to use convolutional layers to learn from the spatial features available in pysc2. They are also available in Keras.
9. Mini-games
SC2LE comes with a number of mini-games, which are nothing more than a .map file and a map-config file. These mini-games can be useful for reinforcement learning, as they are much simpler to learn than the full game.
You must download the mini-game maps in order to use them. Place them somewhere within /StarCraftII/Maps/.
You can run a mini-game with the following command:
python -m pysc2.bin.play --map CollectMineralShards
10. (Deep) Reinforcement Learning
If Reinforcement Learning (RL) and/or deep learning is new to you, but you still want to play around with RL, I recommend going back to the tutorials in step 5 and proceeding to the Q-learning part. That tutorial will teach you how to implement a build order strategy that uses tabular Q-learning.
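A minimal sketch of the tabular Q-learning that the tutorial builds on; the state and action names in the usage example are placeholders, not the tutorial's exact ones:

```python
import random
from collections import defaultdict

class QTable:
    """Tabular Q-learning with an epsilon-greedy policy."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)  # (state, action) -> value, defaults to 0
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.actions)  # explore
        return max(self.actions, key=lambda a: self.q[(state, a)])  # exploit

    def learn(self, state, action, reward, next_state):
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```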
If you would rather apply deep reinforcement learning, you can try to:
- Train an agent to play one of the mini-games using raw actions.
- Train an agent to play the entire game with abstract actions similar to the tutorial in step 5, but use a neural network instead of a Q-table.
- Train an agent to play the entire game using raw actions. This is extremely challenging. I think DeepMind has some preliminary results in their paper that you can be inspired by. I recommend using the Simple64 map!
If you are looking for existing implementations of RL algorithms, I recommend this repository (and tutorial), which contains A2C and DQN already hooked up to PySC2. Remember to install python-dev (or python3.x-dev matching your Python version, e.g. python3.6-dev) first; otherwise it will not install correctly.
Jonas Busk has implemented a few RL algorithms (DQN, Policy Gradient and Actor-Critic) as Jupyter notebooks with some good explanations.
Below are some ideas on how to extend or improve on the already implemented RL algorithms.
Reward Shaping
It is hard for RL agents to learn in environments with sparse rewards, such as the win/loss feedback after a game. Reward shaping is when you use domain knowledge to create a more ‘helpful’ reward function that can smoothly guide the agent in the right direction. You can e.g. reward the agent for producing buildings, units, and upgrades, and for killing enemy units.
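A minimal sketch of such a shaped reward, computed as a weighted sum of per-frame increments (the counter names and weights below are made up; PySC2 exposes similar cumulative statistics in obs.observation["score_cumulative"] that you could diff between frames):

```python
# Invented counter names and weights for illustration.
DEFAULT_WEIGHTS = {"units_built": 0.1, "buildings_built": 0.2, "enemy_killed": 0.5}

def shaped_reward(prev_counts, cur_counts, weights=DEFAULT_WEIGHTS):
    """Weighted sum of how much each counter increased since the last frame."""
    return sum(w * (cur_counts[k] - prev_counts[k]) for k, w in weights.items())
```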
Curriculum Learning
Learning to play StarCraft from scratch may be too ambitious. Instead, you can use curriculum learning to train the agent on increasingly hard tasks, e.g. first to mine minerals, then to produce marines, then to attack and win. You can also train it on small maps first and then on larger maps, or against simple agents first and harder agents later.
Hierarchical Reinforcement Learning
Humans learn how to solve small simple tasks, such as turning on the oven, and also how to combine the simple tasks to solve more complex tasks, such as preparing a dinner. A hierarchical reinforcement learning (HRL) agent learns a number of sub-policies to solve individual tasks as well as another (meta) policy that learns when to follow each sub-policy. This idea could be useful in a complex game like StarCraft.
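A toy sketch of this control loop: a meta-policy picks a sub-policy and commits to it for a fixed horizon, in the spirit of MLSH (all names here are illustrative; in a real agent the policies would be learned networks rather than plain functions):

```python
def run_hierarchy(meta_policy, sub_policies, env_step, state, steps, horizon):
    """Let a meta-policy pick a sub-policy every `horizon` steps."""
    trajectory = []
    active = None
    for t in range(steps):
        if t % horizon == 0:
            active = meta_policy(state)       # meta decision: which skill to run
        action = sub_policies[active](state)  # the chosen sub-policy acts
        state = env_step(state, action)
        trajectory.append((active, action))
    return trajectory, state
```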
One example of an HRL algorithm is meta-learning shared hierarchies (MLSH), which is visualized in the figure below. OpenAI have published a blog post, a paper, and their code, which I recommend you follow if you want to learn more.
That was all! Feel free to comment 🙂