This report summarizes the StarCraft AI Workshop held at the IT University of Copenhagen, January 20-21 2018. The purpose of this workshop was to bring together students and researchers with interests in AI and Machine Learning in StarCraft, and together explore the new StarCraft 2 Learning Environment (SC2LE). This report contains an overview of the event and descriptions of some of the projects made at the workshop.
We would like to give huge thanks to our sponsors for providing food, beverages, snacks, and prizes for the workshop!
Apologies for the video being in Danish; it should be subtitled soon.
Applications & Participants
Participants had to send in applications and be accepted for the workshop. We did this to ensure that the participants met the prerequisites and that we had enough space at the venue. Initially, we could only host 50 participants, but a few days before the deadline we found room to accept all 113 applications. Due to the last-minute acceptances, several applicants had already made other plans, so roughly 75 showed up.
The workshop spanned two days during a weekend with the following program:
Day 1 – Saturday, January 20th:
08:45: Doors open
09:00: Welcome by Niels Justesen & Sebastian Risi
09:15: Invited talk – Mike Preuss
09:45: Invited talk – Jonas Busk
10:15: Short break
10:30: Introduction to SC2LE by Niels Justesen
12:00: Lunch sponsored by IT Minds
13:00: Group creation and group work
~20:00: End of day 1
Day 2 – Sunday, January 21st:
09:00: Doors open – Group work
15.00: Presentations, competitions, and prizes
17.00: End of day 2
We had two invited talks on Day 1.
Lessons learned: StarCraft I in competitions and research since 2010
Mike Preuss, Research Associate at the University of Muenster.
For several years, StarCraft I has been the most prominent test case for research in real-time strategy (RTS) games. We report on the early developments and how bots evolved during the first years, including the problems that are still largely considered unsolved. We also provide some insight into strategy selection and build order optimization attempts, before developing some ideas for future directions in RTS research.
Deep Learning Primer … with a touch of reinforcement learning
Jonas Busk, Ph.D. Student at the Technical University of Denmark.
Jonas Busk gave a great introduction to Deep Learning covering topics such as feedforward neural networks, activation functions, backpropagation, overfitting, regularization, architectures, and Deep Q-Networks (DQN).
Workshop Material & SC2LE Introduction
After the invited talks, I gave a walkthrough of the material that I had prepared for the workshop. The material contains ten steps, going from an overview of SC2LE and how to set it up, to writing a scripted bot, parsing replays, and applying reinforcement learning to mini-games.
Group Presentations & Prizes
By the end of Day 2, seven groups presented their preliminary results. The first two groups won prizes for the best projects. The other five projects are listed in the order in which they presented. I didn't take many notes, so the summaries are based on my own memory and the resources that some of the participants sent me.
End-to-end Weight Sharing for StarCraft II (1st place)
The prize for the best project went to Tony Beltramelli. He designed a network architecture that was trained to imitate a scripted agent in the Move to Beacon mini-game. First, the scripted agent played a large number of games to generate data, and then the network was trained with supervised learning. The network architecture had some interesting details. It had x and y coordinate outputs instead of the spatial convolutional output layer used by DeepMind. The network also had two streams of fully-connected layers, which could perhaps be combined. It is also interesting that the network receives the available actions as input. This is very unusual in reinforcement learning models, but possibly very useful when the set of available actions changes from frame to frame.
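The mask-as-input idea is easy to sketch. Below is a minimal numpy illustration (my own, not Tony's actual architecture; all sizes and weights are made up): the availability mask is concatenated onto the feature input, and an (x, y) head regresses coordinates directly instead of producing a spatial output layer.

```python
import numpy as np

rng = np.random.default_rng(0)

N_ACTIONS = 10     # size of the full action space (illustrative)
FEATURE_DIM = 64   # flattened observation features (illustrative)
HIDDEN = 32

# Random weights stand in for trained parameters.
W1 = rng.normal(0, 0.1, (FEATURE_DIM + N_ACTIONS, HIDDEN))
W_xy = rng.normal(0, 0.1, (HIDDEN, 2))           # head regressing (x, y) directly
W_act = rng.normal(0, 0.1, (HIDDEN, N_ACTIONS))  # head scoring actions

def forward(features, available_mask):
    """The availability mask is fed as *input* and also masks the output."""
    x = np.concatenate([features, available_mask])
    h = np.tanh(x @ W1)
    xy = 1 / (1 + np.exp(-h @ W_xy)) * 59    # coordinates in [0, 59] for a 60x60 screen
    logits = h @ W_act
    logits[available_mask == 0] = -np.inf    # never pick an unavailable action
    return xy, int(np.argmax(logits))

features = rng.normal(size=FEATURE_DIM)
mask = np.zeros(N_ACTIONS)
mask[[0, 3, 7]] = 1    # only three actions available this frame
xy, action = forward(features, mask)
assert action in (0, 3, 7)
```

Masking the logits with `-inf` guarantees the argmax is always a currently available action, which is the practical payoff of carrying the mask through the network.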
The agent learned to navigate to the beacon, except when the beacon was in a corner, where it would stop just in front of it.
The code is available on GitHub.
The 1st place prize was a Google Home with Chromecast Ultra (4K).
GosuNet: Gamestate Evaluation and prediction (2nd place)
The prize for the second best project went to Lars Boe, Mathias Kirk Bonde, Kristian Jensen, Mike Preuss, Vanessa Volz.
“Predicting the outcome of a game based on an observable game state representation is an important first step towards AI players as it allows an evaluation of available actions at a given moment. Besides potential applications in-game analysis and betting, such a prediction model can also be used to evaluate a game or procedurally generated content, e.g. in terms of drama, and adapt it dynamically. We thus trained and tested several models (some of them incorporating expert knowledge) to predict game outcomes using two datasets randomly sampled from the replays released by Blizzard. As features, we used aggregated observations available in the pysc2 library, which include such values as food used, mineral collection rate and spent minerals. Given the game state at the last tick, we were able to achieve prediction accuracies on the test set between 80-90%. However, for game states earlier in the game, the accuracy drops to between 50-60%, which is similar to results obtained in previous work. Nevertheless, we are optimistic that fine-tuning our models and increasing the amount of collected data can improve this result.”
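As an illustration of this kind of prediction setup, here is a hedged numpy sketch: plain logistic regression on aggregated per-state features. The data below is synthetic stand-in data, not the Blizzard replays, and the model is my own simplification, not necessarily what the group used.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for aggregated pysc2 observations: each row is one game
# state (food used, collection rate, spent minerals, ...); label = did player 1 win.
n, d = 2000, 6
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)                           # hidden "ground truth"
y = (X @ true_w + 0.5 * rng.normal(size=n) > 0).astype(float)

# Plain logistic regression trained by gradient descent (no expert knowledge).
w = np.zeros(d)
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w)))                    # predicted win probability
    w -= 0.1 * (X.T @ (p - y)) / n                    # gradient of the log loss

acc = float(((1 / (1 + np.exp(-(X @ w))) > 0.5) == y).mean())
```

On this toy data the accuracy lands well above chance; on real replays the signal is much weaker early in the game, which matches the 50-60% the group reported for early game states.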
Code on Bitbucket.
The 2nd place prize was a Kingston HyperX Cloud Revolver Gaming Headset.
Clojure library for using SC2 AI API
Baruch Berger presented his library that implements a Clojure API as a wrapper around the StarCraft II API. He is working on this project because Clojure is a “nice and elegant” programming language, I think he said. The nice thing about the StarCraft II API is that developers like Baruch can create wrappers in any language, so we should expect to see even more in the future.
During the workshop, Baruch finished the documentation in Jupyter notebook style with rendered videos of the agent written in Clojure.
The library is available on GitHub.
Reinforcement Learning (DDPG) Move to Beacon (DTU + Microsoft)
This rather large group of students from DTU and employees at Microsoft applied the Deep Deterministic Policy Gradient (DDPG) reinforcement learning algorithm to the Move to Beacon mini-game. The learned behavior was close to random, perhaps due to the short training time.
StarCraft Bois A.K.A. Hast Hacking Hombres
This group created a custom mini-game with one marine and one or two zerglings (I think), and attempted to train an agent to do the stutter-step (or kiting) micro-technique. They used the OpenAI Baselines implementations but needed more time to train their model.
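For reference, the stutter-step behavior the group tried to learn can be written down by hand in a few lines. This toy sketch is my own (a 1-D world with made-up rules); it just encodes "shoot when the weapon is off cooldown, otherwise open distance":

```python
def stutter_step(weapon_cooldown, marine_x, zergling_x):
    """Next action for a marine kiting a zergling on a 1-D line.

    weapon_cooldown: frames until the marine can shoot again (0 = ready).
    """
    if weapon_cooldown == 0:
        return "attack"
    # While on cooldown, retreat away from the (faster, melee) zergling.
    return "move_right" if marine_x >= zergling_x else "move_left"

assert stutter_step(0, 10, 8) == "attack"       # weapon ready: shoot
assert stutter_step(2, 10, 8) == "move_right"   # on cooldown: retreat right
assert stutter_step(1, 5, 8) == "move_left"     # on cooldown: retreat left
```

The point of learning this instead of scripting it is that the retreat direction and timing should generalize to terrain and unit counts that a hand-written rule does not cover.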
Game State Win Prediction from Non-spatial Features
A fully-connected neural network with 3 hidden layers of 800 units each was trained to predict the winner of a game. As far as I remember, the network was trained on non-spatial features such as unit counts. After ~12-14 minutes into a game, the network could predict the winner with 100% accuracy, which sounds almost too good to be true. We discussed how they achieved these results, but could not figure out why it reached 100%.
Maybe the game states were extracted, then scrambled, and finally split into a training and a test set. This is not the right approach, as states in the test set would be correlated with states in the training set: two states sampled from the same game are nearly identical, so the network can effectively memorize its test games. The correct approach is to either extract only one state from each game, or to split the games into training and test sets first and then scramble the states within each set.
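The game-level split is simple to implement. A stdlib-only sketch, assuming a hypothetical data layout of `(game_id, features, label)` tuples:

```python
import random

def split_by_game(states, test_fraction=0.2, seed=0):
    """Split (game_id, features, label) tuples so that all states from one
    game land in the same partition, never mixed across train and test."""
    game_ids = sorted({gid for gid, _, _ in states})
    random.Random(seed).shuffle(game_ids)
    n_test = max(1, int(len(game_ids) * test_fraction))
    test_ids = set(game_ids[:n_test])
    train = [s for s in states if s[0] not in test_ids]
    test = [s for s in states if s[0] in test_ids]
    return train, test

# Ten games with five states each; no game may straddle the split.
states = [(g, [g, t], g % 2) for g in range(10) for t in range(5)]
train, test = split_by_game(states)
assert {g for g, _, _ in train}.isdisjoint({g for g, _, _ in test})
```

Shuffling game ids (not individual states) before taking the test slice is what prevents near-duplicate states from leaking between the partitions.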
Goal-based Build Order Optimization with Q-table
A build-order planner was trained with tabular Q-learning to optimize the production of Roaches for the Zerg player. Some participants suggested comparing the results with brute-force approaches. I don't remember the name of the StarCraft II software that can do this; something like the Build Order Search System (BOSS), but for StarCraft II.
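A toy version of tabular Q-learning on a build-order problem might look like this. The MDP here is invented for illustration (a Roach Warren must be built before Roaches give reward); it is not the group's actual state or action space:

```python
import random

# Toy build-order MDP: you need a Roach Warren before Roaches give reward.
ACTIONS = ["build_warren", "build_roach"]
EPISODE_LEN = 5

def step(state, action):
    """State is 0/1 = whether a warren exists. Returns (next_state, reward)."""
    has_warren = state
    if action == "build_warren":
        return 1, 0.0                      # warren built, no immediate reward
    if action == "build_roach" and has_warren:
        return has_warren, 1.0             # +1 per roach produced
    return has_warren, 0.0                 # roach without warren: wasted step

q = {}  # Q-table keyed by (state, action)
rng = random.Random(0)
for _ in range(2000):
    state = 0
    for _t in range(EPISODE_LEN):
        # Epsilon-greedy action selection (epsilon = 0.2).
        if rng.random() < 0.2:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        nxt, reward = step(state, action)
        best_next = max(q.get((nxt, a), 0.0) for a in ACTIONS)
        old = q.get((state, action), 0.0)
        # Standard Q-learning update: lr = 0.1, discount = 0.9.
        q[(state, action)] = old + 0.1 * (reward + 0.9 * best_next - old)
        state = nxt

# The learned policy builds the warren first, then roaches every step after.
assert q[(0, "build_warren")] > q[(0, "build_roach")]
assert q[(1, "build_roach")] > q[(1, "build_warren")]
```

Even in this two-state toy, the table correctly learns the tech-before-production ordering; the group's planner applied the same update rule to a much larger state space.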
My own project
I tried to learn compressions of the feature layers using an autoencoder similar to the approach in this paper. An autoencoder is trained to reproduce its input but is constrained by a small hidden layer, so it must compress the data. I managed to get some preliminary results, but they have some defects, possibly due to the short training time, small dataset, and simple model used. The input has a shape of [7, 60, 60] and is compressed to [8, 8, 8] (about 2% of the input size).
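The setup can be sketched with a linear autoencoder in numpy. The shapes below are downscaled so the example runs quickly (the comments note the actual [7, 60, 60] → [8, 8, 8] shapes), and random input stands in for real feature layers; the model I used at the workshop was not identical to this.

```python
import numpy as np

rng = np.random.default_rng(0)

# Downscaled linear autoencoder sketch. The workshop model compressed
# [7, 60, 60] feature layers (25200 values) to [8, 8, 8] (512 values),
# i.e. 512 / 25200, about 2% of the input size.
IN_SHAPE, CODE_SHAPE = (7, 12, 12), (2, 4, 4)
d_in, d_code = int(np.prod(IN_SHAPE)), int(np.prod(CODE_SHAPE))

W_enc = rng.normal(0, 0.05, (d_in, d_code))   # encoder: input -> bottleneck
W_dec = rng.normal(0, 0.05, (d_code, d_in))   # decoder: bottleneck -> input

X = rng.normal(size=(256, d_in))  # random stand-in for flattened feature layers

def loss(X):
    """Mean squared reconstruction error."""
    return float(((X @ W_enc @ W_dec - X) ** 2).mean())

before = loss(X)
for _ in range(200):                       # plain gradient descent on the MSE
    code = X @ W_enc
    err = code @ W_dec - X                 # (batch, d_in) reconstruction error
    g_dec = code.T @ err / len(X)
    g_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= 0.01 * g_dec
    W_enc -= 0.01 * g_enc
after = loss(X)
assert after < before                      # reconstruction improves with training
```

In practice the encoder and decoder would use convolutions and nonlinearities rather than single weight matrices, but the objective, reconstructing the input through a small bottleneck, is the same.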
Other groups that didn’t present
A wrapper on top of PySC2 for generalized building and unit training. The code is on GitHub.
Another project with some generalized code on top of PySC2. Their code is also available on GitHub.
We would like to thank everyone who participated in the workshop as well as our sponsors. Personally, I think we need more events where students and researchers interested in a specific topic come together to discuss, share, and collaborate on projects. We hope to follow up with a similar event next year.
Feel free to comment if you have questions, feedback or found a mistake somewhere.