Nati Shalom's post, Why most large-scale web sites not written in Java, created a bit of a stir, to say the least, with a raging debate in the comments on the blog and further debates on TheServerSide and Artima.
It also got referenced and commented on here, here, and here, and in several other places.
Some people interpreted Nati's question as Java-bashing, and that created a silly PHP/Ruby vs. Java flame war. As I had something to do with this post, as Nati mentioned, I thought a couple of clarifications were in order.
As anybody who knows Nati and GigaSpaces can attest (and as Guy Nirpaz pointed out), we're Java people at heart. Our product is written in Java. So there was no intention to bad-mouth it in any way.
I agree that the title of the blog was worded a bit provocatively, implying that most large-scale web apps are not written in Java. That was merely done to make things more interesting, and it was based on an arbitrary analysis by Pingdom, which in turn used information they simply aggregated from High-Scalability. So obviously, no, it's not a statistically significant sample, and the word "most" is problematic.
Then, of course, some people rightly asked what the definition of "large-scale" is. Again, it's a legitimate question.
But I want to explain what the motivation for this post was. We had an internal discussion on the relevance of GigaSpaces to the world of Web applications. At GigaSpaces (and yes, some of our competitors too - no need to shout) we've been pounding the scale-out drums, the partitioned data drums, the shared-nothing drums (and many other kinds of percussion instruments) for quite some time. In the last year or two it seems that a lot of people -- especially from the Web world -- are joining that funky beat. Google, eBay, Amazon, MySpace, Flickr, YouTube, Wikipedia and many others (and yes, I know some of them use Java) have published their architectures and spoken about them.
In other words, the architectures these folks use to make their Web apps highly scalable, performant and reliable rest on principles very similar to the ones GigaSpaces has been advocating.
That said, we noticed that many of these folks are going with the LAMP stack. The question was whether this is the trend and, consequently, what it means for us. If the trend is that Web 2.0 is moving to LAMP, but the enterprise market is still going to be a Java world, we need to make some decisions. If the whole world is moving to LAMP, we need to make some other decisions. And if Java is still dominant (or at least major) in the Web world, it means yet other decisions and implications.
Although we don't have clear statistics, my intuition is that the trend for Web apps coming from start-ups or large pure web players (as opposed to web apps from airlines, banks, etc.) is definitely towards the LAMP stack. However, Java will remain strong for a while, especially for Web apps that have to deal with more complex processes in the back-end.
[BTW, I have a theory that 2004 was a watershed year in which many trends changed. It is the year that Spring Framework came out, that the term Web 2.0 was officially coined, that Ruby on Rails was released to the public, and that several other interesting events took place. It would be interesting to see an analysis that compares what pre-2004 and post-2004 Web companies use for their infrastructure.]
Another interesting aspect, which Nati raises in his latest post, is that there is a possible convergence between Java and LAMP/RoR in various shapes and forms -- so it is not necessarily a zero-sum game.