February 19, 2010

Just when I got used to things being open and free

Just recently I needed an ETL tool for a customer to do a quick proof of concept. And when I need an ETL tool at zero cost that I can install in less than 5 minutes, Pentaho Data Integration (PDI) is surely my choice. Honesty, of course, bids me to say that if I had half a million euro to spare as well as some months of deployment time, Pentaho Data Integration would still be the ETL tool of my choice. I was converted over 5 years ago and am still waiting for better tools to come around.

So I have gotten used to the nice fact that PDI ships with a load of JDBC drivers, allowing you to connect to whatever database you might or might not want to run into:
  • IBM DB2 AS400
  • Apache Derby
  • Borland Interbase
  • ExtenDB
  • Firebird SQL
  • Greenplum
  • Gupta SQL
  • H2
  • Hypersonic
  • IBM DB2
  • Infobright
  • Informix
  • Ingres
  • Intersystems Cache
  • Kingbase ES
  • Lucid DB
  • MS Access
  • MS SQL Server
  • Max DB
  • MonetDB
  • MySQL
  • Neoview
  • Netezza
  • Oracle
  • Oracle RDB
  • PALO MOLAP server
  • PostgreSQL
  • Remedy 
  • SAP R/3
  • SQLite
  • Sybase
  • SybaseIQ
  • Teradata
  • UniVerse DB
  • Vertica
  • dBase

For those of you not too familiar with the ETL market, I just want to point out that the commercial ETL vendors have made 'connectivity' their personal cash cow over the last decade. Looking at Informatica standard edition, for instance, you might notice the little footnote reference next to the words 'Get access'. Yep, that cute little floating 1, which translates at the bottom of the page into 'When complemented by Informatica PowerExchange'. How many customers do they have that buy Informatica PowerCenter standard edition, for a figure with five zeros, without the need to 'Get access' to data? Euh, does not compute ...

But back to the topic. So, I'm a happy data warehouser, able to get all the connectivity I want, for free. That is, I was until now. I unzipped Kettle and wanted to connect to DB2 on z/OS, and the following horrible message showed up:

The version of the IBM Universal JDBC driver in use is not licensed for connectivity to QDB2/ databases. To connect to this DB2 server, please obtain a licensed copy of the IBM DB2 Universal Driver for JDBC and SQLJ.

Bummer. It seems that the JDBC drivers delivered by IBM require a license file.

As of DB2 UDB v8.1.2 the Universal JDBC driver requires a license JAR file to be in the CLASSPATH along with the db2jcc.jar file. Here are the names of the required license JAR files:
  • For Cloudscape™ Network Server V5.1: db2jcc_license_c.jar
  • For DB2 UDB V8 for Linux, UNIX, and Windows servers: db2jcc_license_cu.jar
  • For DB2 UDB for iSeries® and z/OS servers (provided with DB2 Connect and DB2 Enterprise Server Edition): db2jcc_license_cisuz.jar
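A quick way to see which of these files a given Java process can actually reach is to scan its classpath for the JAR names listed above. The following is only a minimal sketch: the class name Db2LicenseCheck and its methods are made up for illustration, and it only looks at explicit entries in java.class.path, so it will miss JARs pulled in through wildcard entries or a custom class loader.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDI or the IBM driver.
public class Db2LicenseCheck {
    // License JAR names exactly as given in the list above.
    static final List<String> LICENSE_JARS = Arrays.asList(
        "db2jcc_license_c.jar",
        "db2jcc_license_cu.jar",
        "db2jcc_license_cisuz.jar");

    // Returns the bare file name of every entry in a classpath string.
    static List<String> jarNames(String classpath, String separator) {
        List<String> names = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            if (!entry.isEmpty()) {
                names.add(new File(entry).getName());
            }
        }
        return names;
    }

    // True if the driver JAR itself is listed on the classpath.
    static boolean hasDriver(String classpath, String separator) {
        return jarNames(classpath, separator).contains("db2jcc.jar");
    }

    // Returns the license JARs from LICENSE_JARS that appear on the classpath.
    static List<String> licensesFound(String classpath, String separator) {
        List<String> found = new ArrayList<>(LICENSE_JARS);
        found.retainAll(jarNames(classpath, separator));
        return found;
    }

    public static void main(String[] args) {
        String cp = System.getProperty("java.class.path");
        String sep = File.pathSeparator;
        System.out.println("db2jcc.jar present: " + hasDriver(cp, sep));
        System.out.println("license JARs found: " + licensesFound(cp, sep));
    }
}
```

If db2jcc.jar shows up but db2jcc_license_cisuz.jar does not, that matches the situation described here: the driver loads fine, but it is not licensed to talk to DB2 on z/OS.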

Of course I cannot blame PDI for the behavior of IBM's JDBC drivers. Obviously Big Blue has its cash cows too, in this case the expensive mainframe platform that many a customer cannot get rid of. And so these customers can cough up the $$$ and I can forget about using PDI for a quick proof of concept.

Just when I got used to everything, just everything, being available when I needed it.