At the last APSP Python school, I showed the entry about the "X for Y developers" book data to Stefan van der Walt, and he immediately went "we can predict the programming language of the fourth millennium". Now, that's what I call thinking!
The idea is to interpret the numbers in each column of the X for Y table as the probability that a developer transitions from programming language Y to programming language X, and then project into the future to find the final, stationary distribution of programming languages. Those are the languages we're going to use on Mars!
We need to take care of three things:
1) Remove the diagonal entries ("X for X developers"), which are entries for books like "Enterprise Java for Java developers", and are just noise for the purpose of this analysis;
2) Replace "dangling nodes", i.e., dead ends in the transition graph. There are none in this case, but in general, if there is no entry to transition from language Y to any other language, we add a constant value to all entries of the column. This means that developers of that language have a small probability of choosing another language at random;
3) Give a constant probability for each entry on the diagonal, corresponding to the probability that a developer continues using the same language. Here I used P=0.9, but the final result does not depend on this parameter: it only changes how quickly the distribution settles.
It turns out this is the same algorithm used by Google to rank web pages, a.k.a. the PageRank algorithm.
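The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the repository: the function names, the toy table, and the convention that counts[i][j] holds the hits for "language i for language j developers" are all assumptions, and it uses plain lists so that it runs with the standard library alone. It also assumes at least two languages.

```python
def transition_matrix(counts, stay_prob=0.9):
    """Build a column-stochastic matrix from a table of hit counts.

    Assumed layout: counts[i][j] = hits for "language i for language j
    developers", so column j describes where developers of j move to.
    """
    n = len(counts)
    # Step 1: copy the table and drop the "X for X developers" diagonal.
    T = [[0.0 if i == j else float(counts[i][j]) for j in range(n)]
         for i in range(n)]
    for j in range(n):
        col_sum = sum(T[i][j] for i in range(n))
        if col_sum == 0:
            # Step 2: dangling column, switch to any other language at random.
            for i in range(n):
                if i != j:
                    T[i][j] = 1.0
            col_sum = n - 1
        for i in range(n):
            # Off-diagonal entries share the total probability of switching.
            T[i][j] *= (1.0 - stay_prob) / col_sum
        # Step 3: constant probability of staying with the same language.
        T[j][j] = stay_prob
    return T


def stationary_distribution(T, tol=1e-12, max_iter=100_000):
    """Power iteration: apply T repeatedly until the distribution stops moving."""
    n = len(T)
    p = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(max_iter):
        p_next = [sum(T[i][j] * p[j] for j in range(n)) for i in range(n)]
        if sum(abs(a - b) for a, b in zip(p_next, p)) < tol:
            return p_next
        p = p_next
    return p


# Hypothetical 3-language table; the last column is a dangling node
# once its diagonal entry is removed.
toy_counts = [[5, 2, 0],
              [1, 7, 0],
              [3, 4, 9]]
ranking = stationary_distribution(transition_matrix(toy_counts))
```

Running the same toy table with a different stay_prob should give the same ranking, only after a different number of iterations.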
So what is the Language of the Next Millennium?
By reversing the direction of the edges in the transition graph we can answer another question: from which language Y did the developers of X come? This is equivalent to transposing the table and running the PageRank algorithm again.
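Reversing the edges really is just a transpose. A small sketch, with a made-up table and an assumed layout where counts[i][j] holds the hits for "language i for language j developers":

```python
# Hypothetical toy table, not the real data.
languages = ["Python", "Perl", "Java"]
counts = [
    [0, 8, 3],
    [1, 0, 2],
    [4, 5, 0],
]


def transpose(table):
    """Swap rows and columns, i.e. reverse every edge of the graph."""
    return [list(row) for row in zip(*table)]


# Column j of the reversed table now lists where the developers of
# language j came from, rather than where they are going.
reversed_counts = transpose(counts)
```

The reversed table can then go through the same three preparation steps as the original one.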
Let the PageRank time machine take us back to languages used at The Origins:
Ah, the good old times... Somebody hand me a punch card!
[As usual, the code for this post is available on the GitHub repository.]
Given the proliferation of "X for Y developers" books, I decided to collect some information to help me find a niche for my own book!
This table shows the number of combined Google hits for "X for Y developers" and "X for Y programmers", with a selection of Xs and Ys (e.g., the first row shows the hits for "BASIC for BASIC developers", "BASIC for C++ developers", etc.):
It looks like we're missing a good "Cobol for BASIC programmers" book: I'll start typing right away!
Note that the matrix is highly asymmetric: apparently, Perl programmers are more interested in switching to Python than vice versa. We can turn the entries of the matrix into weights in a directed graph, which can be loosely interpreted as the relative number of people who would like to switch from one programming language to another. Below you can see the graphs for six of the languages (the size of each arrow corresponds to one of 6 bins of equal size on a logarithmic scale; no arrow means zero results):
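For the curious, here is one way the arrow sizes could be binned. It is only a sketch: the function name is made up, and it assumes the six bins span the range from 1 hit to the largest count in the table, which may not be the range used for the actual figures.

```python
import math


def arrow_bin(hits, max_hits, n_bins=6):
    """Return 0 for "no arrow", otherwise a size class from 1 to n_bins.

    Assumption: bins have equal width in log(hits), from 1 hit up to max_hits.
    """
    if hits <= 0:
        return 0  # zero results: draw no arrow
    if max_hits <= 1:
        return 1  # degenerate table where every non-zero entry is 1
    # Position of this count on the log scale, from 0.0 (1 hit) to 1.0 (max_hits).
    fraction = math.log(hits) / math.log(max_hits)
    # Map the position to a bin, clamping the top edge into the last bin.
    return min(n_bins, 1 + int(fraction * n_bins))
```

With this scheme a count of 30 in a table whose largest entry is 1000 lands in the third size class, while the largest entry always gets the thickest arrow.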
In case you were wondering: yes, there really is a successful "Java for Cobol programmers" book!
Update 1 Apr 2012: Added values for Fortran to the table.