At the last APSP python school, I showed the entry about the "X for Y developers" book data to Stefan van der Walt, and he immediately went "we can predict the programming language of the fourth millennium". Now, that's what I call thinking!
The idea is to interpret the numbers on each column of the X for Y table as the probability for a developer to transition from programming language Y to programming language X, and then project into the future to find the final, stationary distribution of programming languages. Those are the languages we're going to use on Mars!
We need to take care of three things:
1) Remove the diagonal entries ("X for X developers"), which are entries for books like "Enterprise Java for Java developers", and are just noise for the purpose of this analysis;
2) Replace "dangling nodes", i.e., dead ends in the transition graph. There are none in this case, but in general, if there is no entry to transition from language Y to any other language, we add a constant value to all entries of the column. This means that developers of that language have a small probability of choosing another language at random;
3) Give a constant probability for each entry on the diagonal, corresponding to the probability of a developer to continue using the same language. Here I used P=0.9, but the final result does not depend on this parameter.
It turns out this is the same algorithm used by Google to rank web pages, a.k.a the PageRank algorithm.
So what is the Language of the Next Millennium?
By reversing the order of the edges in the transition graph we can answer another questions: from which language Y did the developers for X come from? This is equivalent of transposing the table, and run the PageRank algorithm again.
Let the PageRank time machine take us back to languages used at The Origins:
Ah, the good old times... Somebody hand be a punch card!
[As usual, the code for this post is available on the github repository.]
Given the proliferation of "X for Y developers" books, I decided to collect some information to help me find a niche for my own book!
This table shows the number of combined Google hits for "X for Y developers" and "X for Y programmers", with a selection of Xs and Ys (e.g., the first rows shows the hits for "BASIC for BASIC developers", "BASIC for C++ developers", etc.):
It looks like we're missing a good "Cobol for BASIC programmers" book: I'll start typing right away!
Note that the matrix is highly asymmetric: apparently, Perl programmers are more interested to switch to Python than vice versa. We can turn the entries of the matrix into weights in a directed graph, that can be loosely interpreted as the relative number of people that would like to switch from a programming language to another. Below you can see the graphs for six of the languages (the size of the arrows correspond to one of 6 bins of equal size on a logarithmic scale; no arrow means zero results):
In case you were wondering: yes, there really is a successful "Java for Cobol programmers" book!
Update 1 Apr 2012: Added values for Fortran to the table.
The concept of "open-ended evolution" in artificial life is mostly misunderstood. In many cases, "open-ended" is taken to mean that the final solution can take any shape, however complicated. Instead, a stronger definition of "open-ended" would require that the fitness function itself is changing. For example, Genetic Programming would be defined as open-ended in the first sense, since the evolved programs can solve and computable problem (provided that the instruction set is Turing-complete), but is not in the second sense, as the programs are selected according to one, immutable fitness function. This is more akin to adaptation than evolution.
In my opinion, an important but neglected goal of ALife should be that of identifying a minimal set of conditions that is able to sustain this stronger kind of open-ended evolution.
What would it take for an artificial system to display "real" open-ended evolution? It is possible that there may be multiple answer to this question, but co-evolution of agents seems to be a good candidate. A minimal example might involve populations of agents trying to predict each other's output. The output of a population should be tied to the process of predicting the response of the others. To initiate a runaway evolution process, predicting a simple output should require a simple algorithm, which however would produce a slightly more complex output by the laws of the artificial world. In such a scenario, the fitness function is given by the combination of prediction algorithms in the population, and each evolutionary step changes it in a way that requires increasingly complex behaviors.
It's that time of the year again... the word spreads, I download it, and my nights fly by. It's time for the new Google AI Challenge: http://aichallenge.org/ ! This time we will be writing multi-agent systems, as we program cyber-ant-colonies fighting against each other for bread crumbs.
I haven't tried the game on Safari or Opera, please let me know if it works!
Some computer games offer the possibility to extend their Artificial Intelligence with external scripts, or are explicitly designed to be played by bots. Such games are a great resource to develop, test, and teach AI algorithms.
I have been looking for a list of this kind of games, but could find very little information, often outdated. This is my own, annotated list; let me know if I missed something, and I'll make amend. I included only decently active projects:
Games for bots:
- Robocode (Java, .NET): AI bots control tanks equipped with cannon and radar on an square arena. A game with a long tradition, descending from the classic AI games RobotWar (1981) and crobots (1985). This is a very active project, with tournaments organized regularly all over the world. The strategical possibilities are quite limited, though.
- Tron challenge (any language): This is the first AI competition organized by the University of Waterloo Computer Science Club and sponsored by Google. The AIs control the "snakes" in a variant of the popular game Snake (or Nibbles, of QBasic fame) on a number of boards with symmetric walls. Although the competition is closed, the code is open source and perfectly usable. A nice game, but there seems to be one dominating strategy: using minimax. Entries in the competition differ in their cleverness in coming up with various approximations to prune down the huge tree of possible moves, and in handling the endgame.
- Planet Wars challenge (any language): Second AI competition, again organized by the University of Waterloo CS Club. The inspiration is the galactic conquer game Galcon. Bots controls fleets of spaceships to control planets for resources. Unlike the first competition, the optimal strategy is far from obvious, and the game offers a lot of space for experimenting with different approaches. Not as visually fun as other games, though.
- PacMan capture the flag (Python): The CS department at Berkeley offers a very popular Artificial Intelligence course that requires students to code up classical AI algorithms to control agents in a PacMan-like environment. The course concludes with a tournament in which PacMan agents compete to eat each other's dots. The game is extremely fun as I wrote about in previous posts. The code is available online (although somewhat messy). Unfortunately (but understandably, since the game is used in a course), the authors don't want the code of developed agents to leak online.
- Grid-Soccer (Mono, .NET): This is a multi-agent soccer-like game played on a grid board. I haven't tried this one yet... According to the authors, it was originally conceived as a testbed for multi-agent reinforcement learning algorithms.
Scriptable games (games that expose an interface for external clients):
- Starcraft Brood War (any language): This is your one chance at writing AI bots for a commercial game! Starcraft has been hacked to allow to control any aspect of the game through a user-written client (building construction, units movement, the full monty). Since 2010 there have been a few AI competitions open to the public (two of them still accepting entries!). Here's a video from last year's AIIDE tournament:
- XBlast (potentially any): A free version of Bomberman. The game has not been used to develop AI agents... yet. It is built with a server-client architecture and it's begging to get bots running on it.
Natural selection is the algorithm Nature uses to maximize the probability of reproduction of organisms. With Genetic Algorithms (GAs), engineers imitate this process in order to optimize a set of parameters in an engineering problem... or a dune buggy! The great application at boxcar2d.com uses simulated physics to evolve 2D cars that are optimally fast and stable.
GAs tie the ability to solve a problem to the likelihood of reproduction of a set of parameters: first, one creates a population of possible solutions to a problem, evaluates the solutions, and then forms a new generation by allowing the most successful ones to pass their parameters to the next generation, after small mutations and parameters swapping.
My general feeling about GAs is that in most cases other optimization algorithms will give similar or better results with less evaluations of the fitness function (which is typically the bottleneck). In Chapter 30.2 of his classic book on information theory, David McKay gives a very interesting interpretation of GAs as a Monte Carlo sampling in the space of parameters, and discusses the relation with efficient sampling methods, of which we have a better formal understanding.
Perhaps inspired by the victory of the artificial brain Watson against humanity, the New York Times is offering today an interactive feature that allows to play a series of Rock-Paper-Scissors games against the computer.
The prediction algorithm seems to be a simple Bayesian estimator like the one I implemented for the Karate A.I. game. The interesting twist is that in "Veteran mode" the AI will make use its interaction with previous users to define a prior over possible move sequences. It surely works against me...
If you're interested in learning more about scientific programming in Python, and want to compete in a Pacman capture-the-flag tournament, join us for the next Advanced Scientific Programming in Python summer school in St. Andrew, UK!
The faculty line-up includes developers of well-known scientific libraries for Python (e.g., Francesc Alted of PyTables fame). The program covers advanced topics in numerical programming (advanced NumPy, Cython, parallel applications, data serialization, ...) and modern techniques for writing robust, efficient code (agile programming, test driven development, version control, optimization techniques, ...). Most of all, the school is a great opportunity to meet like-minded people and have fun writing Python code together! Participation is free of charge, and you can apply online.
You can see a few picture from previous summer schools on our Facebook group.
I never got the chance to show a working agent based on the Bayesian estimator for the enemy position in the PacMan capture-the-flag game. In the previous PacMan post, I wrote about merging a model of agent movements with the noisy measurements returned by the game to track the enemy agents across the maze. Clearly, this information can give you an edge when planning an attack (to avoid ghosts) or when defending (to intercept the PacMen).
For the traditional faculty-vs-students tournament at the G-Node scientific programming summer school this year, I wrote a PacMan team made by a simple attacker, and a more sophisticated defender that tries to intercept and devour enemy agents.
Both agents plan their movements using a shortest-path algorithm on a weighted graph: before the start of the game, the maze is transformed in a graph, where nodes are the maze tiles, and edges connect adjacent tiles. Weights along the edges are adjusted according to the estimated position of the agents:
- Weights on edges close to an enemy ghost are increased (starting value is proportional to the probability of the enemy being there, and falls off exponentially with distance)
- Weights on edges close to an enemy PacMan are decreased
- Weights on edges close to a friendly agent are increased
An agent navigating on such a maze will tend to avoid ghosts, chase PacMen, and cover parts of the maze far from other friendly agents. My attacker does little else than updating the weights of the graph at every turn, and move toward the closest food dot.
On the other hand, defending is quite difficult in this game, so I needed a more sophisticated strategy. While the enemy is still a ghost in its own part of the maze, the defender moves toward the closest enemy agent (its estimated position, that is). When the enemy is a PacMan in the friendly half, the chase is on! Since ghosts and PacMen move at the same speed, it would be pointless to just follow it around, one needs to catch them from the front... Once more, the solution was to modify the weights of the maze graph, making weights behind the enemy (i.e., opposite to its direction of motion) very high, and lowering the edges in front of it.
The combination of estimator and the weighted graph strategy can be quite entertaining:
Sometimes the defender only needs to guard the border to scare the opponent shitless:
Another useful thing to keep in mind for the future: it is better to base strategies on soft constraints (weighted graphs, probabilities). Setting hard, deterministic rules tends to get you stuck in loops. Soft constraints and some randomness give you more flexibility when you need it but are otherwise just as good.