July 29, 2010

Datastage vs Pentaho popularity

Over the last few months we've been working hard on a rip-and-replace project to migrate a customer from IBM Datastage to Pentaho Data Integration. It was hard work, but it was interesting to see that it can be done, even with a business case that shows payback within a year. (More about that later.)

Anyhow, having found a customer that wants to leave behind Datastage (a solid tool that I've used on multiple projects in the past) and switch to an open source alternative such as Pentaho Data Integration (which continues to gain "followers"), I started to wonder how the two tools compare in popularity. Roland Bouman wrote a blog post a few days ago comparing Oracle to MySQL (as well as a few other databases) using Google Trends. I did the same thing, and it turned up these results.

It would seem that Pentaho Data Integration (or rather Pentaho as a whole, because PDI isn't really marketed separately) overtook Datastage in search volume somewhere between late 2007 and early 2008. That is actually much earlier than I would have thought.

Since the result surprised me, I went back to Roland's blog and checked out the comments. Many people suggested that there were better statistics.

So I checked Google Insights for Search with the following query:

The results were even more pronounced:

Finally, I checked out StackExchange (this time even including Informatica), again with striking results on the popularity of Pentaho.

I guess kJube has been on track with the new trends, doing all those Pentaho projects over the last 5 years.