## Join us at the Advanced Scientific Programming in Python summer school in Scotland

If you're interested in learning more about scientific programming in Python, and want to compete in a Pacman capture-the-flag tournament, join us for the next Advanced Scientific Programming in Python summer school in St. Andrew, UK!

The faculty line-up includes developers of well-known scientific libraries for Python (e.g., Francesc Alted of PyTables fame). The program covers advanced topics in numerical programming (advanced NumPy, Cython, parallel applications, data serialization, ...) and modern techniques for writing robust, efficient code (agile programming, test driven development, version control, optimization techniques, ...). Most of all, the school is a great opportunity to meet like-minded people and have fun writing Python code together! Participation is free of charge, and you can apply online.

You can see a few picture from previous summer schools on our Facebook group.

## Tracking down the enemy (2)

I never got the chance to show a working agent based on the Bayesian estimator for the enemy position in the PacMan capture-the-flag game. In the previous PacMan post, I wrote about merging a model of agent movements with the noisy measurements returned by the game to track the enemy agents across the maze. Clearly, this information can give you an edge when planning an attack (to avoid ghosts) or when defending (to intercept the PacMen).

For the traditional faculty-vs-students tournament at the G-Node scientific programming summer school this year, I wrote a PacMan team made by a simple attacker, and a more sophisticated defender that tries to intercept and devour enemy agents.

Both agents plan their movements using a shortest-path algorithm on a weighted graph: before the start of the game, the maze is transformed in a graph, where nodes are the maze tiles, and edges connect adjacent tiles. Weights along the edges are adjusted according to the estimated position of the agents:

- Weights on edges close to an enemy ghost are increased (starting value is proportional to the probability of the enemy being there, and falls off exponentially with distance)
- Weights on edges close to an enemy PacMan are decreased
- Weights on edges close to a friendly agent are increased

An agent navigating on such a maze will tend to avoid ghosts, chase PacMen, and cover parts of the maze far from other friendly agents. My attacker does little else than updating the weights of the graph at every turn, and move toward the closest food dot.

On the other hand, defending is quite difficult in this game, so I needed a more sophisticated strategy. While the enemy is still a ghost in its own part of the maze, the defender moves toward the closest enemy agent (its estimated position, that is). When the enemy is a PacMan in the friendly half, the chase is on! Since ghosts and PacMen move at the same speed, it would be pointless to just follow it around, one needs to catch them from the front... Once more, the solution was to modify the weights of the maze graph, making weights behind the enemy (i.e., opposite to its direction of motion) very high, and lowering the edges in front of it.

The combination of estimator and the weighted graph strategy can be quite entertaining:

Sometimes the defender only needs to guard the border to scare the opponent shitless:

Another useful thing to keep in mind for the future: it is better to base strategies on soft constraints (weighted graphs, probabilities). Setting hard, deterministic rules tends to get you stuck in loops. Soft constraints and some randomness give you more flexibility when you need it but are otherwise just as good.

## Tracking down the enemy

As another scientific Python course is approaching, I've been brushing up my PacMan skills. I decided to give a try to a strategy I had been thinking on, which relies upon having a good estimate of the enemy's position. I should remind the reader that in the PacMan capture-the-flag game, one team does not know the exact position of the other agents unless they are within 5 squares of one's own agents. The game does, however, return a rough estimate of the opponent's distance. Our agents-tracker will thus have to blindly make its best guess, and keep a probability distribution over possible positions.

To estimate the position of the opponent agents we need to apply some probability theory:

P(x(t)) = sum_{x(t-1)} P( x(t-1) ) P( x(t) | x(t-1) )

or, in other words, the probability that the agent is at position x(t) at time t is equal to the sum of the probability of it being at x(t-1) at time t-1, times the probability of transitioning from x(t-1) to x(t). The first term is given simply by the previous estimate, while the second term is our model of the behavior of our opponent (*).

For example, a very conservative model could assume that the opponent could take any legal move at random:

P( x(t) | x(t-1) ) = 1/N

if x(t-1)->x(t) is a legal move, where N is the total number of legal moves from x(t-1), and

P( x(t) | x(t-1) ) = 0

otherwise. This video shows how such a model performs when the opponent behaves exactly as assumed; the red agent, in the bottom left corner, is estimating the position of the blue agent in the opposite corner; the area of the red squares is proportional to the probability P(x(t)):

The tracker is doing a good job in this case, but fails miserably for a more realistic opponent:

We clearly need to improve the opponent's model... luckily another simple model results in a large improvement: we can safely assume that the opponent tends to explore new parts of the maze in search of food. We can formalize this as

P( x(t) | x(t-1) ) = 1/Z exp(-beta * v(x(t))

if x(t-1)->x(t) is a legal move, and 0 otherwise. v(x) is the number of times the agent visited x in the past, and beta is a constant that controls the how exploratory the opponent is. When beta=0, the model is equivalent to the previous random model. Z is a normalizing constant such that P( x(t) | x(t-1) ) sums to 1.

Let's see how this model does in practice (in the video, beta = 10):

Much better, isn't it? We can do even better by using two other sources of information: first, the game gives us a noisy estimation of the distance of the opponent (actual distance +/- 6); second, we know that if the opponent is not visible, it must be at least 5 squares away. We can take this information into account by setting P(x(t)) to 0 for squares that lie outside the noisy distance range, and for those inside the visibility range.

The last video shows the complete tracker at work. The blue lines show the area in which the agent might be according to the noisy distance, and the green line shows the visibility range:

The biggest area for improvement here is the agent's model P(x(t)|x(t-1)). One possibility could be to simulate several common strategies, and to use the transition statistics for the simulated agents to estimate that probability...

Now, can we use the estimated position to program better AI agents? I'll give it a try, and report back soon!

(*) Strictly speaking, we are doing "filtering" here, i.e., we're estimating the current position assuming the past inferences are fixed. The alternative is to do "smoothing", where the full joint probability P(x(t), ..., x(1)) is estimated at each step. The information coming from the new observation is propagated back and forth at each to improve the past inferences. For example, knowing that the agent is at a given position at time t might exclude another position at time t-2 because of too large a distance, which in turn could improve the estimate at time t.

## PacMan capture-the-flag: a fun game for artificial intelligence development and education

At the beginning of September I've been invited to teach at a summer school about scientific programming. The whole experience has been really rewarding, but it was the student's project that got me going: we had the students write artificial intelligence algorithms for the agents of a PacMan-like game, and organized a tournament for them to compete against each other.

The **PacMan capture-the-flag game** has been written originally by John DeNero, and has been used to teach an artificial intelligence course by him at Berkley and by Hal Daume III at University of Utah. Very often, this kind of games have a single strategy that dominates all others, and once you find it the interest fizzles out. In this case, I was impressed by how rich this game is. The game offers a lot of opportunities to develop and test complex learning and planning algorithms, including cooperation strategies for games with multiple agents.

The rules of the game are quite simple: the board is a PacMan maze, divided in a red and a blue half. The two halves belong to two teams of agents, who are controlled by computer programs to eat the opponent's food and protect their own. When in the opponent's half, the agents are **PacMan** (PacMen?), while in their own half, the agents are **ghosts** and can kill the opponent's PacMan agents, in which case these are returned to their initial position. The players get one point for each food dot they eat; no points are assigned for eating the other team's agents. The game ends when one of the two teams eats all of the opponent's food, or after 3000 moves; the team with the highest score wins.

To make the game more interesting, one can only observe the position of the other team's agents when they are very close to one'w own agents (5 squares away); otherwise, one can only obtain a noisy estimate of their distance.

The game is written in Python, my programming language of choice, which allows to write rapidly even sophisticated algorithms. I recommend the game to anyone wanting to organize an artificial intelligence course, or simply have fun writing AI agents. I plan to dedicate a couple of posts to the basic strategies to write successful agents in this game.

Here's a video of the best students' agents (red team) playing against the best tutors' agents (blue team). The tutors won, saving our reputation!

**Update:** The authors of the PacMan capture-the-flag game decided to keep the game close-source, and in particular would prefer not to publish the code of agents playing their game, fearing that it might interfere with their course. It's a shame because I was planning to write some Genetic Programming agents for the game, but of course I respect their decision. I guess there will be no series of posts re:PacMan...