<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-22251486</id><updated>2012-02-16T20:13:56.089+01:00</updated><category term='Database development'/><category term='Data Modelling'/><category term='Book'/><category term='Business Intelligence - Pentaho'/><category term='Data Integration - Kettle'/><category term='Fun and fail'/><category term='Data Integration'/><category term='Business Intelligence'/><category term='Data Integration - KFF'/><title type='text'>kJube - fair and square</title><subtitle type='html'>Loose thoughts on business intelligence and data integration, with a special thought for open source.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><generator version='7.00' 
uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>75</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-22251486.post-9001849345558830572</id><published>2012-02-07T11:42:00.001+01:00</published><updated>2012-02-07T13:23:56.786+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence - Pentaho'/><title type='text'>Pentaho Report Designer with dynamic queries</title><content type='html'>&lt;div style="text-align: left;"&gt;Already a year old, but still a cool tutorial.&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&amp;nbsp;&lt;object data="http://www.techievideos.com/p/" height="450" id="techievideos-player" name="techievideos-player" type="application/x-shockwave-flash" width="600"&gt;                 &lt;param name="movie" value="http://www.techievideos.com/p/" /&gt;                 &lt;param name="allowfullscreen" value="true" /&gt;&lt;param name="allowscriptaccess" value="always" /&gt;                 &lt;param name="flashvars" value="config=http://www.techievideos.com/p/c/y3zQ7J5azGzYGI6mJjSU/" /&gt;                 &lt;/object&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-9001849345558830572?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/9001849345558830572/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2012/02/already-year-old-but-still-cool.html#comment-form' title='0 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/9001849345558830572'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/9001849345558830572'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2012/02/already-year-old-but-still-cool.html' title='Pentaho Report Designer with dynamic queries'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-4633809013293558229</id><published>2011-09-24T10:52:00.001+02:00</published><updated>2011-09-26T12:22:32.139+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence - Pentaho'/><title type='text'>Pentaho Community Gathering (Live)</title><content type='html'>This year PCG11 is happening in Frascati, just outside Rome. No &lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TJ3OBbbQbYI/AAAAAAAAAbg/mLaWVrO-MUI/s1600/IMAG0645.jpg"&gt;sea-side&lt;/a&gt; this year, but the famous hills of Rome. As last year, I'll try to do a &lt;span id="goog_1589414989"&gt;&lt;/span&gt;live write-out, as things happen. 
(The first two presentations will be posted with some delay as the hotel wireless was down in the morning.)&lt;br /&gt;&lt;br /&gt;As in previous years, we have a full house of Pentaho enthusiasts ...&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-9YY6V6IQMi8/Tn2ZQXyrrfI/AAAAAAAAA9E/7QtvaJtZtlE/s1600/IMAG1359.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://1.bp.blogspot.com/-9YY6V6IQMi8/Tn2ZQXyrrfI/AAAAAAAAA9E/7QtvaJtZtlE/s400/IMAG1359.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-dBlIjPI5VT8/Tn3aXaoJzJI/AAAAAAAAA-g/62bn00-HBq0/s1600/Screen+Shot+2011-09-24+at+15.25.13.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-dBlIjPI5VT8/Tn3aXaoJzJI/AAAAAAAAA-g/62bn00-HBq0/s1600/Screen+Shot+2011-09-24+at+15.25.13.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-UftTjFcyW8M/Tn2ZZcb6kGI/AAAAAAAAA9I/xhYrWo_SbMw/s1600/IMAG1363.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://3.bp.blogspot.com/-UftTjFcyW8M/Tn2ZZcb6kGI/AAAAAAAAA9I/xhYrWo_SbMw/s400/IMAG1363.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;9h45 Doug Moran&lt;/b&gt;&lt;br /&gt;As usual, Doug kicked off with a short introduction of the event. 
No pictures of that as I was still complaining to the hotel staff about the failing wireless at that time.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-BdTJYzE3EmY/Tn3bB6HRtNI/AAAAAAAAA-o/U2yOK2uYCdY/s1600/Screen+Shot+2011-09-24+at+15.28.14.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-BdTJYzE3EmY/Tn3bB6HRtNI/AAAAAAAAA-o/U2yOK2uYCdY/s1600/Screen+Shot+2011-09-24+at+15.28.14.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;10h Matt Casters&lt;/b&gt;&lt;br /&gt;Matt gives an update of some of the new kettle stuff:&lt;br /&gt;- DataCleaner profiling integration&lt;br /&gt;- kettle JDBC driver ported over to Kettle trunk, in use in the DataCleaner integration&lt;br /&gt;- KFF into PDI 4.3: first overview of the plans to move technical parts of KFF into Kettle trunk&lt;br /&gt;- An overview of what happened with dynamic ETL Metadata injection&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-xgPAyLtXFF4/Tn2Y4Ppbo3I/AAAAAAAAA9A/umaMES20STs/s1600/IMAG1360.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://1.bp.blogspot.com/-xgPAyLtXFF4/Tn2Y4Ppbo3I/AAAAAAAAA9A/umaMES20STs/s400/IMAG1360.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;He does remind the audience and speakers that there is an unwritten PCG rule, that all presentations should be done without the use of Powerpoint, something that last year wasn't really respected. 
However, this will not stop me from putting up any Power Points people should have made!&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-Vf7l-idz64w/Tn4Ci4r1adI/AAAAAAAAA_s/RGhfBAi1JBg/s1600/Screen+Shot+2011-09-24+at+18.16.40.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-Vf7l-idz64w/Tn4Ci4r1adI/AAAAAAAAA_s/RGhfBAi1JBg/s1600/Screen+Shot+2011-09-24+at+18.16.40.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;10h 15 Pedro Alves&lt;/b&gt;&lt;br /&gt;Pedro gives a live demo of how CDF dashboard parts can easily be added on a Pentaho report. This is done through custom plugins for PRD/BI server that allow directly accessing CDA and CDF components.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-T-s1n4I9OBM/Tn2aUC65RNI/AAAAAAAAA9M/40z_X3DUCEM/s1600/IMAG1365.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://3.bp.blogspot.com/-T-s1n4I9OBM/Tn2aUC65RNI/AAAAAAAAA9M/40z_X3DUCEM/s400/IMAG1365.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;10h40 Alain Debecker&lt;/b&gt;&lt;br /&gt;Alain explains how he joined linalis (Geneva) to come back to Pentaho and why he is talking about security. He started the development of a UI to manage Mondrian role grants of a complex customer. He also explained how Kettle can be used to load a set of users from an employee database. Finally, he promises a white paper soon to be published on this subject.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;11h00 Coffee break&lt;/b&gt;&lt;br /&gt;First presentations are over and the crowd (mostly with serious hangovers from the day before) are craving for coffee. 
And while the Italian organization hasn't proved the best at setting up the wireless, the coffee table looks quite good.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-E2Mv_Q4iYnw/Tn2iG9p72rI/AAAAAAAAA9Q/9jM5ztH6bIQ/s1600/IMAG1372.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://1.bp.blogspot.com/-E2Mv_Q4iYnw/Tn2iG9p72rI/AAAAAAAAA9Q/9jM5ztH6bIQ/s400/IMAG1372.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-y_KEQ6tyEj8/Tn2iSZhOdeI/AAAAAAAAA9U/smBSstCJG-E/s1600/IMAG1370.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://3.bp.blogspot.com/-y_KEQ6tyEj8/Tn2iSZhOdeI/AAAAAAAAA9U/smBSstCJG-E/s400/IMAG1370.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-sFffuBP8NlI/Tn2iZ5aR8kI/AAAAAAAAA9Y/K8bY5AM9QJY/s1600/IMAG1374.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://2.bp.blogspot.com/-sFffuBP8NlI/Tn2iZ5aR8kI/AAAAAAAAA9Y/K8bY5AM9QJY/s400/IMAG1374.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-LwLdLwMdcQw/Tn3ZcVCeZ0I/AAAAAAAAA-Q/W4tgyu9yW58/s1600/Screen+Shot+2011-09-24+at+15.21.21.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-LwLdLwMdcQw/Tn3ZcVCeZ0I/AAAAAAAAA-Q/W4tgyu9yW58/s1600/Screen+Shot+2011-09-24+at+15.21.21.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br 
/&gt;&lt;b&gt;11h15&amp;nbsp;Roland Bouman&lt;/b&gt; (&lt;a href="https://docs.google.com/leaf?id=0B7pEch_luF0xYzY2NjA5NWItYWQ5Zi00MjhlLTg5NGEtNTdlYmIzNzA2ZDI5&amp;amp;hl=en_GB"&gt;presentation&lt;/a&gt;)&lt;br /&gt;Roland walks us through the concepts of&amp;nbsp;&lt;b&gt;&lt;a href="http://code.google.com/p/xmla4js/"&gt;XMLA4JS&lt;/a&gt;&lt;/b&gt;, his&amp;nbsp;wrapper around XMLA, which allows accessing XMLA through JavaScript. Since PCG11 he's been working with people from Palo to extend this. He has also collaborated with Andy Grohe from Pentaho to put an object model on top of XMLA4JS that would make usage of XMLA4JS easier.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-NwMElw48Ap0/Tn3ai2UOHUI/AAAAAAAAA-k/7LRD1RCbHho/s1600/Screen+Shot+2011-09-24+at+15.26.13.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-NwMElw48Ap0/Tn3ai2UOHUI/AAAAAAAAA-k/7LRD1RCbHho/s1600/Screen+Shot+2011-09-24+at+15.26.13.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Roland explains the progress that has been made on&amp;nbsp;&lt;b&gt;&lt;a href="http://code.google.com/p/kettle-cookbook/"&gt;kettle Cookbook&lt;/a&gt;&lt;/b&gt;&amp;nbsp;since last PCG (and demos it):&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Nicer diagrams (thanks to Slawo),&amp;nbsp;&lt;/li&gt;&lt;li&gt;syntax highlighting,&amp;nbsp;&lt;/li&gt;&lt;li&gt;TOC,&amp;nbsp;&lt;/li&gt;&lt;li&gt;Incremental generation,&amp;nbsp;&lt;/li&gt;&lt;li&gt;Saxon configurable,&amp;nbsp;&lt;/li&gt;&lt;li&gt;xaction documentation and&lt;/li&gt;&lt;li&gt;Mondrian schema documentation.&lt;/li&gt;&lt;/ul&gt;The Mondrian schema documentation in particular, with its overview of (shared) dimensions and facts and a visualization of the star schema, looks very nice.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://4.bp.blogspot.com/-yQEvllBFMEg/Tn2k9_oaEgI/AAAAAAAAA9c/iV4kk3P5EKw/s1600/IMAG1379.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://4.bp.blogspot.com/-yQEvllBFMEg/Tn2k9_oaEgI/AAAAAAAAA9c/iV4kk3P5EKw/s400/IMAG1379.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Ongoing work on Cookbook:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;kettle repository support&lt;/li&gt;&lt;li&gt;KREX (Kettle repository export)&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Ideas for the future:&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Using the kettle auto documentation step?&lt;/li&gt;&lt;li&gt;Data lineage?&lt;/li&gt;&lt;li&gt;xaction flowcharts&lt;/li&gt;&lt;li&gt;prpt report documentation&lt;/li&gt;&lt;li&gt;diagrams for SQL queries&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;&lt;a href="http://code.google.com/p/pendular/"&gt;Pendular&lt;/a&gt;&lt;/b&gt; is Roland's latest hobby project. The idea behind Pendular is to provide a Pentaho interface that is better suited for mobile devices. Here too, we get a short demo. For those who are curious, here's a &lt;a href="http://youtu.be/apElEaHZM2g"&gt;link to a YouTube video&lt;/a&gt; Matt made of Pendular when Roland released it.&lt;br /&gt;&lt;br /&gt;And to end, Roland announces to the crowd that he'll be joining the dark side soon. Another Pentaho rock-star joining Pentaho. 
But don't worry, not everyone in the community will join Pentaho :-)&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-808eEzn5MzI/Tn3Z4U_aisI/AAAAAAAAA-Y/Py_sh75WL-U/s1600/Screen+Shot+2011-09-24+at+15.23.07.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-808eEzn5MzI/Tn3Z4U_aisI/AAAAAAAAA-Y/Py_sh75WL-U/s1600/Screen+Shot+2011-09-24+at+15.23.07.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;11h30&amp;nbsp;Gretchen Moran &lt;/b&gt;(&lt;a href="https://docs.google.com/viewer?a=v&amp;amp;pid=explorer&amp;amp;chrome=true&amp;amp;srcid=0B7pEch_luF0xODIzYjE2NTgtMWVhZS00ZGVlLTk0NDMtZjM2YWI5MmI4OWE0&amp;amp;hl=en_GB"&gt;presentation&lt;/a&gt;)&lt;br /&gt;And while Roland is joining Pentaho, Gretchen announces to PCG that she is leaving Pentaho. She'll be working with OpenMRS in the future and she'll&lt;br /&gt;&lt;br /&gt;&lt;a href="http://openmrs.org/"&gt;OpenMRS&lt;/a&gt; was created as a response to HIV. There was limited infrastructure and what was in place was totally overloaded. While HIV can be treated, tracking treatments for millions of patients requires a serious information management system. That's what OpenMRS (Open Medical Record System) was created for.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-ovFx4F6STjQ/Tn2qit9VD1I/AAAAAAAAA9g/fY5W-yemQPM/s1600/IMAG1380.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://3.bp.blogspot.com/-ovFx4F6STjQ/Tn2qit9VD1I/AAAAAAAAA9g/fY5W-yemQPM/s400/IMAG1380.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;There's no "commercial side" to OpenMRS. OpenMRS has a great and large community. 
Gretchen has dived into that community in order to help them set up Pentaho BI with OpenMRS, since they currently have little or no insight into their data. She has done a quick proof of concept for OpenMRS management, who were extremely enthusiastic.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-DZb44IuSYm4/Tn3aI3KbdgI/AAAAAAAAA-c/sBWgDPzi5es/s1600/Screen+Shot+2011-09-24+at+15.24.23.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-DZb44IuSYm4/Tn3aI3KbdgI/AAAAAAAAA-c/sBWgDPzi5es/s1600/Screen+Shot+2011-09-24+at+15.24.23.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;So this presentation really is a call to the community to help set up a 100% open source BI implementation for OpenMRS, using Saiku, C*tools, and all other funky community resources.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;11h45&amp;nbsp;Jos van Dongen &amp;amp; Aly Hollander &lt;/b&gt;(&lt;a href="https://docs.google.com/viewer?a=v&amp;amp;pid=explorer&amp;amp;chrome=true&amp;amp;srcid=0B7pEch_luF0xNTc2MjhhNDItZTkyNi00YmI0LThmMGUtYTQ2YjJhYTU5MWVi&amp;amp;hl=en_GB"&gt;presentation&lt;/a&gt;)&lt;br /&gt;And to continue in the medical field, Jos and Aly present the Pentaho BI implementation at St. Antonius hospitals. St. Antonius has its own home-built system for the administration of the hospital. About 2 years back St. Antonius started deploying Pentaho BI on top of that system. Aly gives an overview of how their systems were developed and how the Pentaho BI project was run. 
Next Jos shows a great set of C*tools dashboards from the St Antonius implementation.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-n6DVNQoZZWw/Tn2z1aLZZdI/AAAAAAAAA9o/0I6jFrYGc-o/s1600/IMAG1386.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://4.bp.blogspot.com/-n6DVNQoZZWw/Tn2z1aLZZdI/AAAAAAAAA9o/0I6jFrYGc-o/s400/IMAG1386.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;12h15&amp;nbsp;Luc and Julian&lt;/b&gt; (&lt;a href="https://docs.google.com/viewer?a=v&amp;amp;pid=explorer&amp;amp;chrome=true&amp;amp;srcid=0B7pEch_luF0xOWU4MzMwMDctYjdjZS00NmEyLTg4MjctMTc4YTFjNjM0OGYx&amp;amp;hl=en_GB"&gt;presentation&lt;/a&gt;)&lt;br /&gt;Julian quickly explains how roles for Mondrian development are divided between himself and Luc. After that&amp;nbsp;Luc gives an overview of all the Mondrian 3.3 features.&amp;nbsp;All the new features are well documented in the presentation, so no need to fully replicate that here.&lt;br /&gt;&lt;br /&gt;Julian quickly updates us on the following topics:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Hyde Son 2.0, some time off for the kids which allows Mondrian coding time&lt;/li&gt;&lt;li&gt;His new MacBook air and when he'll start wearing designer clothes&lt;/li&gt;&lt;li&gt;OLAP4J 1.0&lt;/li&gt;&lt;li&gt;Mondrian Enterprise Cache&lt;/li&gt;&lt;li&gt;Mondrian 4.0&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-TwB5V-ytrk8/Tn2zxiljcGI/AAAAAAAAA9k/uQyL05UORzY/s1600/IMAG1392.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://3.bp.blogspot.com/-TwB5V-ytrk8/Tn2zxiljcGI/AAAAAAAAA9k/uQyL05UORzY/s400/IMAG1392.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: 
center;"&gt;&lt;a href="http://1.bp.blogspot.com/-cXKXdrbO2vU/Tn3ZoRHyMkI/AAAAAAAAA-U/vpvSah-X5IE/s1600/Screen+Shot+2011-09-24+at+15.22.16.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/-cXKXdrbO2vU/Tn3ZoRHyMkI/AAAAAAAAA-U/vpvSah-X5IE/s1600/Screen+Shot+2011-09-24+at+15.22.16.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Until now we've managed almost to stick to the talk schedule. Tom's presentation is the only one that will be shifted to the afternoon.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;13h00 Lunch&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-xFHYRlkptIQ/Tn3ZOWL0hjI/AAAAAAAAA-M/adGQ883k6hM/s1600/Screen+Shot+2011-09-24+at+15.20.30.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/-xFHYRlkptIQ/Tn3ZOWL0hjI/AAAAAAAAA-M/adGQ883k6hM/s1600/Screen+Shot+2011-09-24+at+15.20.30.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-fdCghT2vlWI/Tn3QE6yprdI/AAAAAAAAA9s/6PrilkFPogQ/s1600/IMAG1393.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="119" src="http://3.bp.blogspot.com/-fdCghT2vlWI/Tn3QE6yprdI/AAAAAAAAA9s/6PrilkFPogQ/s200/IMAG1393.jpg" width="200" /&gt;&lt;/a&gt;&lt;a href="http://1.bp.blogspot.com/-rnygbDCnf0Q/Tn3QJQU2eNI/AAAAAAAAA9w/UcXtfhZEbvs/s1600/IMAG1394.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="119" src="http://1.bp.blogspot.com/-rnygbDCnf0Q/Tn3QJQU2eNI/AAAAAAAAA9w/UcXtfhZEbvs/s200/IMAG1394.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://4.bp.blogspot.com/-A1ex3jTfC60/Tn3QMD5UDLI/AAAAAAAAA90/-suSbnIL8R8/s1600/IMAG1395.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="119" src="http://4.bp.blogspot.com/-A1ex3jTfC60/Tn3QMD5UDLI/AAAAAAAAA90/-suSbnIL8R8/s200/IMAG1395.jpg" width="200" /&gt;&lt;/a&gt;&lt;a href="http://2.bp.blogspot.com/-uVEzqhlLyLk/Tn3QOwTHj1I/AAAAAAAAA94/Hq2eEazygcU/s1600/IMAG1396.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="119" src="http://2.bp.blogspot.com/-uVEzqhlLyLk/Tn3QOwTHj1I/AAAAAAAAA94/Hq2eEazygcU/s200/IMAG1396.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-0pbhTM3_nME/Tn3QRW-cClI/AAAAAAAAA98/3G50vpGlvPo/s1600/IMAG1397.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://2.bp.blogspot.com/-0pbhTM3_nME/Tn3QRW-cClI/AAAAAAAAA98/3G50vpGlvPo/s400/IMAG1397.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;The general idea behind having the lunch in the hotel of the venue was to save time. Julian finished around 13h30 and we still managed to squeeze in time for starters, primi, dessert and coffee in slightly under an hour, leaving more time for presentations.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;14h30 Tom and Paul &lt;/b&gt;(&lt;a href="https://docs.google.com/viewer?a=v&amp;amp;pid=explorer&amp;amp;chrome=true&amp;amp;srcid=0B7pEch_luF0xZDhhOTYyMTAtMzhlZS00OWVjLTk2NjgtZGM1NDQ2YjIwNWY2&amp;amp;hl=en_GB"&gt;presentation&lt;/a&gt;)&lt;br /&gt;Paul shows off the Pentaho hosting environment that Meteorite is setting up. The original idea was to automate the set-up of a new Pentaho environment, but it grew into a full hosting offer. 
Through a simple web interface you can now generate your own Pentaho environment, choose what database you want underneath it, the hosting provider you want to run it with, etc. &amp;nbsp; &lt;a href="https://docs.google.com/viewer?a=v&amp;amp;pid=explorer&amp;amp;chrome=true&amp;amp;srcid=0B7pEch_luF0xZDhhOTYyMTAtMzhlZS00OWVjLTk2NjgtZGM1NDQ2YjIwNWY2&amp;amp;hl=en_GB"&gt;The slides tell the story&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-Cyb2aGsx4C8/Tn3Y9-lGPTI/AAAAAAAAA-I/Xm3OrpTz_Qs/s1600/Screen+Shot+2011-09-24+at+15.18.49.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="76" src="http://4.bp.blogspot.com/-Cyb2aGsx4C8/Tn3Y9-lGPTI/AAAAAAAAA-I/Xm3OrpTz_Qs/s320/Screen+Shot+2011-09-24+at+15.18.49.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;14:45 Pedro Pinhero&lt;/b&gt;&lt;br /&gt;Pedro demo's CDF for mobile platforms. 
No need for words, as you can access the demo directly here:&amp;nbsp;&lt;a href="http://goo.gl/pcbWv"&gt;http://goo.gl/pcbWv&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-aG9ln7nJGr4/Tn3YO2rAp6I/AAAAAAAAA-A/oogUoUFCMfk/s1600/IMAG1402.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://2.bp.blogspot.com/-aG9ln7nJGr4/Tn3YO2rAp6I/AAAAAAAAA-A/oogUoUFCMfk/s400/IMAG1402.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;15h10 Pedro Alves&lt;/b&gt;&lt;br /&gt;Pedro gives an overview of the Firefox Telemetry project.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-o1Eaj07DE7o/Tn3YThgmGMI/AAAAAAAAA-E/ZG2eIbg7yCA/s1600/IMAG1404.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://3.bp.blogspot.com/-o1Eaj07DE7o/Tn3YThgmGMI/AAAAAAAAA-E/ZG2eIbg7yCA/s400/IMAG1404.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;15h30 Matt Casters&lt;/b&gt;&lt;br /&gt;Pentaho's Chief Data Integration is back onstage with an explanation on the Single Threader step.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-ZfgVgI1psEA/Tn3fvUrmnkI/AAAAAAAAA-w/0hx99Nbi5-o/s1600/Screen+Shot+2011-09-24+at+15.48.01.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-ZfgVgI1psEA/Tn3fvUrmnkI/AAAAAAAAA-w/0hx99Nbi5-o/s1600/Screen+Shot+2011-09-24+at+15.48.01.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-NArCG01qykQ/Tn3f4HDt5eI/AAAAAAAAA-0/PfV5P3wUEC4/s1600/Screen+Shot+2011-09-24+at+15.48.10.png" imageanchor="1" style="margin-left: 1em; 
margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-NArCG01qykQ/Tn3f4HDt5eI/AAAAAAAAA-0/PfV5P3wUEC4/s1600/Screen+Shot+2011-09-24+at+15.48.10.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;15h40 Cees Vkemenade&lt;/b&gt;&lt;br /&gt;Cees has worked with Jos on the St. Antonius Ziekenhuis project(s). When using CDF, they realized they needed a functionality to deploy changes to different environments.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-E_kJmvYJgkk/Tn3qHhnJfLI/AAAAAAAAA_A/UM2PanJ95I8/s1600/IMAG1410.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://1.bp.blogspot.com/-E_kJmvYJgkk/Tn3qHhnJfLI/AAAAAAAAA_A/UM2PanJ95I8/s400/IMAG1410.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-TnNnEqJ052c/Tn3hagy174I/AAAAAAAAA-4/NOOu705f_vw/s1600/Screen+Shot+2011-09-24+at+15.55.30.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-TnNnEqJ052c/Tn3hagy174I/AAAAAAAAA-4/NOOu705f_vw/s1600/Screen+Shot+2011-09-24+at+15.55.30.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;15h50 Thomas Morgner &lt;/b&gt;(&lt;a href="http://www.sherito.org/2011/09/cdf-based-parameter-viewer.html"&gt;blog post&lt;/a&gt;)&lt;br /&gt;Thomas shows how you can set up a light-weight report viewer (no GWT) as a follow up of a &lt;a href="http://www.sherito.org/2011/06/creating-your-own-parameter-ui-for.html"&gt;blog post&lt;/a&gt; he did a short time back. 
Given the enthusiastic feedback he got on this post, he also opens a discussion on how these ideas could evolve.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-NRqKJ1P7DxI/Tn3p7rTEcLI/AAAAAAAAA-8/alcoOG9Cy2o/s1600/IMAG1412.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://3.bp.blogspot.com/-NRqKJ1P7DxI/Tn3p7rTEcLI/AAAAAAAAA-8/alcoOG9Cy2o/s400/IMAG1412.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;16h20 Bart Maertens&lt;/b&gt;&lt;br /&gt;Bart gives us an update of his &lt;b&gt;BIRT plugin for the Pentaho BI server&lt;/b&gt;. Even though BIRT is obviously similar to PRD, there are a few features (such as cross-tab reports) that aren't available in PRD, which might make it interesting to use BIRT for reporting.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-IcaNjU-d1HI/Tn3qgBdtmYI/AAAAAAAAA_E/xbidot2dio8/s1600/IMAG1413.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://2.bp.blogspot.com/-IcaNjU-d1HI/Tn3qgBdtmYI/AAAAAAAAA_E/xbidot2dio8/s400/IMAG1413.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;Bart quickly shows how the plugin should be installed/configured and then shows how it works and what it looks like in the BI server. 
For those interested, here are some blog posts Bart did on the subject.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://b-e-o.blogspot.com/2010/09/running-fully-functional-eclipse-birt.html"&gt;http://b-e-o.blogspot.com/2010/09/running-fully-functional-eclipse-birt.html&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://b-e-o.blogspot.com/2011/08/running-fully-functional-eclipse-birt.html"&gt;http://b-e-o.blogspot.com/2011/08/running-fully-functional-eclipse-birt.html&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;A BIRT reporting output step has been developed too. Bart based it on the PRD Output Step in PDI. So it is now possible to schedule BIRT reports through Pentaho Data Integration. The code for the step is already part of the kettle trunk.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;16h30 Coffee break and group picture&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-K_-osfXZQBY/Tn4LjQYGLyI/AAAAAAAAA_4/ozhMBqNryiI/s1600/DSC_0039.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="265" src="http://1.bp.blogspot.com/-K_-osfXZQBY/Tn4LjQYGLyI/AAAAAAAAA_4/ozhMBqNryiI/s400/DSC_0039.JPG" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;object width="320" height="266" class="BLOG_video_class" id="BLOG_video-b29db2cc3b5c3430" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"&gt;&lt;param name="movie" value="http://www.youtube.com/get_player"&gt;&lt;param name="bgcolor" value="#FFFFFF"&gt;&lt;param name="allowfullscreen" value="true"&gt;&lt;param name="flashvars" 
value="flvurl=http://v10.nonxt3.googlevideo.com/videoplayback?id%3Db29db2cc3b5c3430%26itag%3D5%26app%3Dblogger%26ip%3D0.0.0.0%26ipbits%3D0%26expire%3D1331735206%26sparams%3Did,itag,ip,ipbits,expire%26signature%3D2BCB407D7A3973C7D33B5DD687EA831547C40A9B.2F38E37DB82CED95E4C478685299DF125D1CAA91%26key%3Dck1&amp;amp;iurl=http://video.google.com/ThumbnailServer2?app%3Dblogger%26contentid%3Db29db2cc3b5c3430%26offsetms%3D5000%26itag%3Dw160%26sigh%3Dc3rbjnmLUwNaocBAtHgM3EmrvTY&amp;amp;autoplay=0&amp;amp;ps=blogger"&gt;&lt;embed src="http://www.youtube.com/get_player" type="application/x-shockwave-flash"width="320" height="266" bgcolor="#FFFFFF"flashvars="flvurl=http://v10.nonxt3.googlevideo.com/videoplayback?id%3Db29db2cc3b5c3430%26itag%3D5%26app%3Dblogger%26ip%3D0.0.0.0%26ipbits%3D0%26expire%3D1331735206%26sparams%3Did,itag,ip,ipbits,expire%26signature%3D2BCB407D7A3973C7D33B5DD687EA831547C40A9B.2F38E37DB82CED95E4C478685299DF125D1CAA91%26key%3Dck1&amp;iurl=http://video.google.com/ThumbnailServer2?app%3Dblogger%26contentid%3Db29db2cc3b5c3430%26offsetms%3D5000%26itag%3Dw160%26sigh%3Dc3rbjnmLUwNaocBAtHgM3EmrvTY&amp;autoplay=0&amp;ps=blogger"allowFullScreen="true" /&gt;&lt;/object&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;Only coffee though:&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-wYyBsjEOwJ8/Tn34qkLY16I/AAAAAAAAA_I/pcF9SoqLuTQ/s1600/Screen+Shot+2011-09-24+at+17.34.12.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/-wYyBsjEOwJ8/Tn34qkLY16I/AAAAAAAAA_I/pcF9SoqLuTQ/s1600/Screen+Shot+2011-09-24+at+17.34.12.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: 
left;"&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;b&gt;17h00 Jens Bleuel &lt;/b&gt;(presentation)&lt;/div&gt;This year Jens moved from Pentaho's professional services division to product management.&amp;nbsp;Over the last year, Jens has been working on connecting PDI to his son's electric train. Kettle controls the stops of the trains through a relay.&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;iframe allowfullscreen="" frameborder="0" height="349" src="http://www.youtube.com/embed/gBxDmFBf8ME?hl=en&amp;amp;fs=1" width="425"&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;br /&gt;(also in &lt;a href="http://www.youtube.com/watch?v=KqyZyAxWheQ&amp;amp;feature=youtube_gdata_player"&gt;high quality&lt;/a&gt;)&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;He has made some of the information from this project available on the &lt;a href="http://wiki.pentaho.com/display/EAI/Kettle+Exchange"&gt;kettle exchange site&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-hbTUUcjuvac/Tn3486gnzDI/AAAAAAAAA_M/hQnnXRtR3CY/s1600/Screen+Shot+2011-09-24+at+17.35.51.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-hbTUUcjuvac/Tn3486gnzDI/AAAAAAAAA_M/hQnnXRtR3CY/s1600/Screen+Shot+2011-09-24+at+17.35.51.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;He also shared some ideas he is working on in PDI product management, such as:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;lifecycle management&lt;/li&gt;&lt;li&gt;the heat maps&lt;/li&gt;&lt;li&gt;implementing IDs for PDI steps&amp;nbsp;&lt;/li&gt;&lt;li&gt;best practices, concepts and solutions on the Wiki&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;b&gt;17h25 Paul &lt;/b&gt;(&lt;a 
href="https://docs.google.com/viewer?a=v&amp;amp;pid=explorer&amp;amp;chrome=true&amp;amp;srcid=0B7pEch_luF0xODE2ZWQ2NzEtYWFkMC00NGRhLWE0OTctZGQwYjE4NWZmMDli&amp;amp;hl=en_GB"&gt;presentation&lt;/a&gt;)&lt;br /&gt;Paul is back with an overview of Saiku. Since last year's PAT presentation the team has grown: Paul and Tom are now assisted by two other guys. Paul demos all the cool features of Saiku and gives an overview of the roadmap. It is all very well documented in the slides.&lt;br /&gt;&lt;br /&gt;Some really exciting stuff is almost ready to go into Saiku 2.2:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Graphs (through CCC)&lt;/li&gt;&lt;li&gt;Write-back (with different allocation policies)&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-tJHdTQ_yv7I/Tn36szbfk0I/AAAAAAAAA_U/X7SVcLX2RwM/s1600/Screen+Shot+2011-09-24+at+17.43.30.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/-tJHdTQ_yv7I/Tn36szbfk0I/AAAAAAAAA_U/X7SVcLX2RwM/s1600/Screen+Shot+2011-09-24+at+17.43.30.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-vA-V3PFyCGc/Tn37ftTKEDI/AAAAAAAAA_Y/8kimrIaLqc8/s1600/Screen+Shot+2011-09-24+at+17.46.48.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-vA-V3PFyCGc/Tn37ftTKEDI/AAAAAAAAA_Y/8kimrIaLqc8/s1600/Screen+Shot+2011-09-24+at+17.46.48.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;No kidding! This one is absolutely mega!!!&lt;/div&gt;&lt;br /&gt;... 
and what else will follow for 2.2 is security (working with roles) and support for using Saiku on mobile platforms.&lt;br /&gt;&lt;br /&gt;Some stuff that's in the backlog:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;C*tools integration&lt;/li&gt;&lt;li&gt;PRD export&lt;/li&gt;&lt;li&gt;Drill-down&lt;/li&gt;&lt;li&gt;Parameters&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;b&gt;17h50 Marioz&lt;/b&gt;&lt;br /&gt;Marioz is showing a lightweight ad hoc reporting tool that uses the Saiku interface to report on top of a metadata layer, from which it generates and executes queries. He also wants to merge PRPT functionality into it in order to leverage PRD's layout capabilities.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-IHEVHtZlPbk/Tn4BSaCbmJI/AAAAAAAAA_k/ggRelSlsmMQ/s1600/Screen+Shot+2011-09-24+at+18.11.32.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-IHEVHtZlPbk/Tn4BSaCbmJI/AAAAAAAAA_k/ggRelSlsmMQ/s1600/Screen+Shot+2011-09-24+at+18.11.32.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-tGQ_KTp_EU0/Tn4BgkKWL1I/AAAAAAAAA_o/Es4Ss4_hNBs/s1600/IMAG1415.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://4.bp.blogspot.com/-tGQ_KTp_EU0/Tn4BgkKWL1I/AAAAAAAAA_o/Es4Ss4_hNBs/s400/IMAG1415.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;18h00 Ricardo Pires &lt;/b&gt;(presentation)&lt;br /&gt;Ricardo gives an overview of the &lt;a href="http://code.google.com/p/pentaho-fc-plugin/"&gt;Pentaho Fusion Charts Plugin&lt;/a&gt; that XpandIT has developed (see also the &lt;a href="http://wiki.pentaho.com/display/COM/April+20,+2011+-+Ricardo+Pires+-+Xpand+IT+FusionCharts+Plug-in"&gt;WebEx&lt;/a&gt; on this topic earlier this year). 
There have been a lot of downloads since it was released so it would seem there is a lot of interest for this type of charting in Pentaho.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-FQaj7EBJIms/Tn3_tk7gbbI/AAAAAAAAA_c/iV3fjByc7fY/s1600/IMAG1417.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://1.bp.blogspot.com/-FQaj7EBJIms/Tn3_tk7gbbI/AAAAAAAAA_c/iV3fjByc7fY/s400/IMAG1417.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-XrDuz4TxdYM/Tn3_0k4u4VI/AAAAAAAAA_g/KyCjlIyHcNU/s1600/IMAG1418.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://4.bp.blogspot.com/-XrDuz4TxdYM/Tn3_0k4u4VI/AAAAAAAAA_g/KyCjlIyHcNU/s400/IMAG1418.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;And there is a small tutorial on YouTube.&lt;br /&gt;&lt;br /&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-align: center;"&gt;&lt;iframe allowfullscreen="" frameborder="0" height="315" src="http://www.youtube.com/embed/VPSaPm5-rgs" width="560"&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;b&gt;18h10 Slawo Chodnicki&lt;/b&gt;&lt;br /&gt;Slawo has been working on a &lt;b&gt;Ruby scripting step &lt;/b&gt;for PDI that used JRuby&lt;b&gt;. 
&lt;/b&gt;More information and the code are, as always, clearly laid out on his &lt;a href="http://type-exit.org/adventures-with-open-source-bi/2011/03/releasing-the-ruby-scripting-step-for-kettle/"&gt;blog&lt;/a&gt;.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/--FnO2_yQV2c/Tn4D2mOWRNI/AAAAAAAAA_w/YJwDNnkDNnM/s1600/Screen+Shot+2011-09-24+at+18.21.43.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/--FnO2_yQV2c/Tn4D2mOWRNI/AAAAAAAAA_w/YJwDNnkDNnM/s1600/Screen+Shot+2011-09-24+at+18.21.43.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-bL1m3so-gvE/Tn4D20nKm3I/AAAAAAAAA_0/vamLD17KFh4/s1600/Screen+Shot+2011-09-24+at+18.22.41.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/-bL1m3so-gvE/Tn4D20nKm3I/AAAAAAAAA_0/vamLD17KFh4/s1600/Screen+Shot+2011-09-24+at+18.22.41.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;18h40 Roland Bouman &lt;/b&gt;(presentation)&lt;br /&gt;And to wrap up, Roland is giving a talk about &lt;b&gt;data vault and anchor modeling&lt;/b&gt;.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-ZAR_ocWQJeo/Tn4L40FX_KI/AAAAAAAAA_8/KLwgGar1EH8/s1600/Screen+Shot+2011-09-24+at+18.56.36.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-ZAR_ocWQJeo/Tn4L40FX_KI/AAAAAAAAA_8/KLwgGar1EH8/s1600/Screen+Shot+2011-09-24+at+18.56.36.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://4.bp.blogspot.com/-yuRl7jIpP7M/Tn4M2JK84FI/AAAAAAAABAE/z4nh-TrGojE/s1600/Screen+Shot+2011-09-24+at+19.00.40.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-yuRl7jIpP7M/Tn4M2JK84FI/AAAAAAAABAE/z4nh-TrGojE/s1600/Screen+Shot+2011-09-24+at+19.00.40.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;b&gt;And #PCM11 / #PCG11 is over&amp;nbsp;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;... thanks for joining us!&lt;/b&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="color: red;"&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-BwFdcnsla5U/Tn35IiEuAiI/AAAAAAAAA_Q/TiMAy6Ft4ho/s1600/Screen+Shot+2011-09-24+at+17.36.32.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/-BwFdcnsla5U/Tn35IiEuAiI/AAAAAAAAA_Q/TiMAy6Ft4ho/s1600/Screen+Shot+2011-09-24+at+17.36.32.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="font-weight: normal; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-4633809013293558229?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/4633809013293558229/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/09/pentaho-community-gathering-live.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/4633809013293558229'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/4633809013293558229'/><link rel='alternate' 
type='text/html' href='http://kjube.blogspot.com/2011/09/pentaho-community-gathering-live.html' title='Pentaho Community Gathering (Live)'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-9YY6V6IQMi8/Tn2ZQXyrrfI/AAAAAAAAA9E/7QtvaJtZtlE/s72-c/IMAG1359.jpg' height='72' width='72'/><thr:total>5</thr:total><georss:featurename>Via Vittorio Veneto, 6-14, 00044 Frascati Rome, Italy</georss:featurename><georss:point>41.80519779579733 12.67648458480835</georss:point><georss:box>41.80371829579733 12.67401708480835 41.806677295797336 12.67895208480835</georss:box></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-5499378032202696708</id><published>2011-09-20T15:34:00.002+02:00</published><updated>2011-09-24T10:52:50.364+02:00</updated><title type='text'>Once twitten, twice shy</title><content type='html'>In the preparation for the &lt;a href="http://wiki.pentaho.com/display/COM/Pentaho+Community+Gathering+-+Rome+%28Frascati%29+2011"&gt;Pentaho Community Meeting 2011 in Frascati&lt;/a&gt;, I suddenly recalled some events that occurred during the &lt;a href="http://forums.pentaho.com/showthread.php?81263-Pentaho-Community-Meetup-2011"&gt;voting for the location&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-shUaPd0XbN4/TniL74FGBTI/AAAAAAAAA7c/xZh-oWfWPmA/s1600/Screen+Shot+2011-09-20+at+14.48.46.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="83" 
src="http://1.bp.blogspot.com/-shUaPd0XbN4/TniL74FGBTI/AAAAAAAAA7c/xZh-oWfWPmA/s400/Screen+Shot+2011-09-20+at+14.48.46.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;One week before the closing of the votes, @magicaltrout fueled the discussion.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-nUS320fNQgo/TniUr-707eI/AAAAAAAAA7o/630_UmZlhx0/s1600/Screen+Shot+2011-09-20+at+15.25.29.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="68" src="http://2.bp.blogspot.com/-nUS320fNQgo/TniUr-707eI/AAAAAAAAA7o/630_UmZlhx0/s400/Screen+Shot+2011-09-20+at+15.25.29.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Luckily Twitter allowed me to go back in time to dig up some tweets from prominent members of the Pentaho Community.&amp;nbsp;There seems to be a clear indication that members have been trying to "buy votes" in order to influence the location of the event. 
The tweets below clearly illustrate this.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-M1ZI8kuts3I/TniMLDnSYpI/AAAAAAAAA7g/5IUec-O0RDE/s1600/ScreenShot004.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="66" src="http://3.bp.blogspot.com/-M1ZI8kuts3I/TniMLDnSYpI/AAAAAAAAA7g/5IUec-O0RDE/s400/ScreenShot004.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-b8agCPQ4SP0/TniMbNReJcI/AAAAAAAAA7k/Qn_IkzBxqXI/s1600/ScreenShot001.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="66" src="http://1.bp.blogspot.com/-b8agCPQ4SP0/TniMbNReJcI/AAAAAAAAA7k/Qn_IkzBxqXI/s400/ScreenShot001.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-St_VBSdzAOI/TniVBUE1PSI/AAAAAAAAA7s/NAZh7cs6Mq4/s1600/Screen+Shot+2011-09-20+at+15.28.07.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="66" src="http://1.bp.blogspot.com/-St_VBSdzAOI/TniVBUE1PSI/AAAAAAAAA7s/NAZh7cs6Mq4/s400/Screen+Shot+2011-09-20+at+15.28.07.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-m4bmeg_E-VU/TniV8P6r57I/AAAAAAAAA7w/BI73ai1ALBw/s1600/Screen+Shot+2011-09-20+at+15.32.00.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="81" src="http://2.bp.blogspot.com/-m4bmeg_E-VU/TniV8P6r57I/AAAAAAAAA7w/BI73ai1ALBw/s400/Screen+Shot+2011-09-20+at+15.32.00.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Now, I don't believe there is any rule against such bribery as long as the bribes are being paid.&lt;br /&gt;&lt;br /&gt;So, if I've 
been reading right, we should figure out when the community can cash in on the following:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;beers from @josvandongen&lt;/li&gt;&lt;li&gt;wine from @mattcasters&lt;/li&gt;&lt;li&gt;grappa from @rolandbouwman&lt;/li&gt;&lt;li&gt;home brew liquor from @jan_aertsen&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Being on the list myself, I'll bring my bottles on Friday evening. I guess the Community will figure out the right place and time to consume them!&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As for the others: your words have not been forgotten. Watch what you tweet next time.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-5499378032202696708?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/5499378032202696708/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/09/once-twitten-twice-shy.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/5499378032202696708'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/5499378032202696708'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/09/once-twitten-twice-shy.html' title='Once twitten, twice shy'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://1.bp.blogspot.com/-shUaPd0XbN4/TniL74FGBTI/AAAAAAAAA7c/xZh-oWfWPmA/s72-c/Screen+Shot+2011-09-20+at+14.48.46.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-6747578353067213715</id><published>2011-08-30T23:11:00.001+02:00</published><updated>2011-08-30T23:12:55.424+02:00</updated><title type='text'>PCG11 Prologue</title><content type='html'>Pentaho usage is growing rapidly, and thus those who have Pentaho skills are very busy. As a consequence, the organization of the annual Pentaho Community Gathering has been 'a bit slow' this year. And while Pentaho employees normally aren't allowed to stick their noses in community business, somehow the organization of this year's venue ended up on my task list. (Thanks Doug!)&lt;br /&gt;&lt;br /&gt;So there I was, mid-August, 40 days to go, nothing set. I asked my wife to call some hotels, and by August 22nd the answers had arrived. After comparing the offers, I settled on an order of preference. To finalize the selection and to make sure the community would get the very best location on this side of the planet, I decided to go and have a look on the spot.&lt;br /&gt;&lt;br /&gt;I jumped in my car and started driving, whilst punching the address of the first hotel into the GPS. Only 1576 km, and I could be there by 7 AM the next morning. I've done BI projects that lasted a lot longer, so piece of cake. 
Here's the dashboard (sorry Pedro, not done with Ctools):&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-2197f5_JbBM/TleEoCi3BZI/AAAAAAAAA00/f36VmT-Qfks/s1600/IMAG1300.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="640" src="http://3.bp.blogspot.com/-2197f5_JbBM/TleEoCi3BZI/AAAAAAAAA00/f36VmT-Qfks/s640/IMAG1300.jpg" width="380" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Unfortunately, weather conditions weren't that good. While I was driving through France, heavy rain showers hit me. They slowed me down seriously and even obliged me to stop over in Switzerland, waiting for things to clear up. And if that wasn't enough, when I started the journey again the next day, a sudden temperature drop made things even worse.&lt;br /&gt;&lt;br /&gt;Here's the dashboard (because we love them so much) showing the situation (again Pedro, sorry no Ctools):&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-GBFSjwuMJzM/Tlq4FZxQCrI/AAAAAAAAA3g/9tcuALxFgrY/s1600/IMAG1314.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://4.bp.blogspot.com/-GBFSjwuMJzM/Tlq4FZxQCrI/AAAAAAAAA3g/9tcuALxFgrY/s400/IMAG1314.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Yes, if you look carefully the dashboard says &lt;u&gt;zero degrees&lt;/u&gt;. 
And for those who think you can prove anything with numbers and dashboards, here are the pictures of the road situation at that time.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-YfCORcSjuGc/Tlq4fV2V6AI/AAAAAAAAA3k/ta26pR112nk/s1600/IMAG1315.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="119" src="http://1.bp.blogspot.com/-YfCORcSjuGc/Tlq4fV2V6AI/AAAAAAAAA3k/ta26pR112nk/s200/IMAG1315.jpg" width="200" /&gt;&lt;/a&gt;&lt;a href="http://3.bp.blogspot.com/-P2MV_PtXDgI/Tlq4l9sqL1I/AAAAAAAAA3o/H5vTQDLd4cA/s1600/IMAG1319.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="119" src="http://3.bp.blogspot.com/-P2MV_PtXDgI/Tlq4l9sqL1I/AAAAAAAAA3o/H5vTQDLd4cA/s200/IMAG1319.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;But a Community Gathering had to be booked, and I needed to move on. Once I passed the Alps the weather cleared up considerably, and I started trying to make up the time I had lost. Just after lunch temperatures became more acceptable (33 degrees) and I had found a nice cruising speed to make it to Frascati at a reasonable hour.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-ykKhgF8Fzdo/Tlq5mT2SdgI/AAAAAAAAA3s/6pgEv15CKK8/s1600/IMAG1329.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://2.bp.blogspot.com/-ykKhgF8Fzdo/Tlq5mT2SdgI/AAAAAAAAA3s/6pgEv15CKK8/s400/IMAG1329.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;And around 15h30, I finally reached Frascati. 
I visited some of the locations and finally opted for &lt;a href="http://www.hotel-flora.it/"&gt;Hotel Flora&lt;/a&gt;,&amp;nbsp;as you can read by now on the &lt;a href="http://wiki.pentaho.com/display/COM/Pentaho+Community+Gathering+-+Rome+%28Frascati%29+2011"&gt;PCM11 wiki pages&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Why do I want to share this silly story? Basically,&amp;nbsp;I'd like to express my great appreciation for all the people who volunteered to set up the Pentaho Community Gatherings in the past. Organizing such an event does take time, and it is amazing that Pentaho Community members have always picked this up, notwithstanding busy agendas and thousands of more interesting things to do :-). So, to those who picked this up in the past, and to those who'll pick this up in the future: many thanks.&lt;/b&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-6747578353067213715?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/6747578353067213715/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/08/pcg11-prologue.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6747578353067213715'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6747578353067213715'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/08/pcg11-prologue.html' title='PCG11 Prologue'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' 
src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-2197f5_JbBM/TleEoCi3BZI/AAAAAAAAA00/f36VmT-Qfks/s72-c/IMAG1300.jpg' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-6531665014341708037</id><published>2011-04-21T10:46:00.001+02:00</published><updated>2011-04-21T10:46:49.589+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence - Pentaho'/><title type='text'>Proprietary BI vendors feel the heat</title><content type='html'>Commercial open source software for BI has been coming up strong over the last few years and keeps marching forward at a rapid pace. For several years now, we have seen in several projects that commercial open source BI is a viable alternative to the traditional vendors. Gartner has been slow to confirm that (as they usually are when it comes to open source), but their recent market studies now include open source consistently. And today I noticed that the proprietary vendors, too, have to admit publicly that they feel the competition from the open source vendors. Look at &lt;a href="http://www.microstrategy.com/software/comparison/"&gt;this page on the MicroStrategy website&lt;/a&gt;. 
They clearly feel they need to compare themselves to Pentaho, and since they are comparing to version 3.6, which has been out for about a year, I'd guess they have been doing a lot of explaining to their customers for some time already.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-dvhXkcCciiQ/Ta_uJluXAnI/AAAAAAAAAkA/zR3K8kGeZjU/s1600/Knipsel.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0"  i8="true" src="http://1.bp.blogspot.com/-dvhXkcCciiQ/Ta_uJluXAnI/AAAAAAAAAkA/zR3K8kGeZjU/s400/Knipsel.PNG" width="560" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-6531665014341708037?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/6531665014341708037/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/04/proprietary-bi-vendors-feel-heat.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6531665014341708037'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6531665014341708037'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/04/proprietary-bi-vendors-feel-heat.html' title='Proprietary BI vendors feel the heat'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://1.bp.blogspot.com/-dvhXkcCciiQ/Ta_uJluXAnI/AAAAAAAAAkA/zR3K8kGeZjU/s72-c/Knipsel.PNG' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-7411927904124144620</id><published>2011-03-09T16:54:00.000+01:00</published><updated>2011-03-09T16:54:01.857+01:00</updated><title type='text'>I feel rejected!</title><content type='html'>&lt;div style="text-align: justify;"&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;On September 21st, 2010, I wrote the following lines in &lt;a href="http://kjube.blogspot.com/2010/09/kff-slowly-coming-out-of-kitchen-closet.html"&gt;this blog post&lt;/a&gt;.&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-size: small;"&gt;&lt;/span&gt;&lt;/div&gt;&lt;ul style="font-family: Arial,Helvetica,sans-serif; text-align: justify;"&gt;&lt;li&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;Rejects&lt;/b&gt;: &amp;nbsp;A generic component for error handling. It will  converge all error records into one common format so all your rejects  fit in one and the same output file or table. We'll elaborate on this  one soon in an extra blogpost. Documentation will be added to the &lt;b&gt;&lt;a href="http://code.kjube.be/"&gt;KFF pages&lt;/a&gt;&lt;/b&gt;.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-size: small;"&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;Today, while on the phone with "&lt;a href="http://ramathoughts.blogspot.com/"&gt;a fan of the art of data integration through the means of Pentaho Data Integration&lt;/a&gt;", I realised that I never wrote that extra blogpost, and that nobody had updated the KFF pages with regard to the "Rejects Step". 
So it is time I made amends.&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif;"&gt;Error handling in kettle&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt; &lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;The standard error handling in kettle is probably well known to all of you. For those who aren't up to speed, a quick intro follows here.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt; A common scenario is shown below.&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;a href="https://lh4.googleusercontent.com/-H2pkutR1S5o/TXeRB7LRIpI/AAAAAAAAAjQ/yN_jIS-FKn8/s1600/screenshot013.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="https://lh4.googleusercontent.com/-H2pkutR1S5o/TXeRB7LRIpI/AAAAAAAAAjQ/yN_jIS-FKn8/s400/screenshot013.png" width="580" /&gt;&lt;/a&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;You have stored information in the staging area of your data warehouse, which you need to load into an ODS. The ODS area of your data warehouse, however, needs to contain validated or clean data. Therefore your ODS table(s) may have foreign key or other constraints to ensure that your data is correct. 
Consequently, any record&amp;nbsp; you try to write to the ODSTable that the data model doesn't allow for whatever reason (value larger than the field, precision not correct, foreign key not found, ...) will throw an error, and your transformation will end with errors.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;This will make you, as a professional data integration designer/developer, very unhappy. So you will adjust your transformation and put a lot of logic between the data in and data out steps, to ensure that everything can be loaded correctly to the target model.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://lh6.googleusercontent.com/-Qsdzsu56UsY/TXeRS4YrbXI/AAAAAAAAAjU/_FcHHEnpsgs/s1600/screenshot014.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="https://lh6.googleusercontent.com/-Qsdzsu56UsY/TXeRS4YrbXI/AAAAAAAAAjU/_FcHHEnpsgs/s400/screenshot014.png" width="580" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;But no matter how precise and clairvoyant you are, at some point a nasty record will pass by for which you didn't write the right code, and bang, your transformation fails with an error. Then you discover kettle has error handling functionality for these nasty records. Yes, kettle can divert records that the data output steps cannot process to another step. 
That looks as follows.&lt;/span&gt;&lt;br /&gt;&lt;a href="https://lh4.googleusercontent.com/-y5PIcEaspsw/TXeRx0mFB1I/AAAAAAAAAjY/Ox-CoNwJq0I/s1600/screenshot015.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="https://lh4.googleusercontent.com/-y5PIcEaspsw/TXeRx0mFB1I/AAAAAAAAAjY/Ox-CoNwJq0I/s320/screenshot015.png" width="580" /&gt;&lt;/a&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;What is done very often is simply to divert the records that cannot be inserted into the database to a flat file, and leave them there so someone can have a look at why they weren't inserted into the ODS or data warehouse. At the same time, this lets your data flow finish without throwing errors and aborting in the middle of the night, which would force you to get up and restart the data warehouse load or else find 10 angry managers at your desk in the morning.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;In the data output step where you enable the error handling (in this case the ODSTable step), you can configure some extra settings for the error handling.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://lh4.googleusercontent.com/-tUxel22OmxQ/TXeYxKnIIUI/AAAAAAAAAjc/CTGbiFlqNbU/s1600/screenshot016.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="https://lh4.googleusercontent.com/-tUxel22OmxQ/TXeYxKnIIUI/AAAAAAAAAjc/CTGbiFlqNbU/s320/screenshot016.png" width="580" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&amp;nbsp; &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;For those who want to read up a bit on this feature, go and surf the net. 
Try these pages:&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;a href="http://www.ibridge.be/?p=32"&gt;original blog post&lt;/a&gt; from 2007 when &lt;a href="http://www.ibridge.be/?page_id=2"&gt;Matt&lt;/a&gt; first released error handling&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;Pentaho wiki: the &lt;a href="http://wiki.pentaho.com/display/EAI/Data+Validator"&gt;data validator and error handling&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;Pentaho wiki: &lt;a href="http://wiki.pentaho.com/display/COM/Step+error+handling+codes"&gt;error handling codes&lt;/a&gt;&amp;nbsp; &lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;a href="http://type-exit.org/adventures-with-open-source-bi/about/"&gt;Slawo&lt;/a&gt; in 2010 on &lt;a href="http://type-exit.org/adventures-with-open-source-bi/2010/06/error-handling-in-the-javascript-step/"&gt;Javascript step and error handling&lt;/a&gt; &lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;The documentation isn't abundant, but the information is out there. 
My little scenario ends here, because I really want to talk about the '&lt;b&gt;Rejects step&lt;/b&gt;' and not do a write-up on error handling in PDI.&amp;nbsp; So here I go.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&amp;nbsp;&lt;/span&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;span style="font-size: small;"&gt;Why a rejects step, if kettle already does it all?&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;As you might guess from the above scenario, if you push all the rejects to flat files, you often end up with a large number of files containing a lot of records to analyze for data quality. If you are lucky, you are working with top-shelf data, and you'll have few rejected records and few files to look at, but most people will not be that lucky.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;So as your data warehouse grows (not only in size, but also in data integration logic and code), the number of rejects files will grow and you'll lose the overview of what is happening in terms of data quality.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;To avoid this problem, we came up with the idea of merging all rejects into one single database table. 
This has some advantages:&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;Having all your rejects in one single place makes it easy to compute statistics on your rejected records.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;If they are in a database table, they really are just part of your data integration logging, and therefore you can easily include them in the logging reports you already make every day to see/show how the ETL runs have been going.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;So what does it look like?&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://lh6.googleusercontent.com/-eMufXnkk2jk/TXeY8n482DI/AAAAAAAAAjg/umu1k3SWCBU/s1600/screenshot018.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="https://lh6.googleusercontent.com/-eMufXnkk2jk/TXeY8n482DI/AAAAAAAAAjg/umu1k3SWCBU/s320/screenshot018.png" width="580" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;Basically, nothing has been changed in the standard error handling mechanism of PDI. The only thing we added was the output step for the rejects. 
In this step you can define the following things:&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;i&gt;&lt;b&gt;Rejected records connection&lt;/b&gt;&lt;/i&gt;: database connection to which you want to write the rejects (this can be a parameter)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;i&gt;&lt;b&gt;Rejected records schema&lt;/b&gt;&lt;/i&gt;: database schema to which you want to write the rejects (this can be a parameter)&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;i&gt;&lt;b&gt;Rejected records table&lt;/b&gt;&lt;/i&gt;: database table to which you want to write the rejects (this can be a parameter) &lt;/span&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;i&gt;&lt;b&gt;Error count field&lt;/b&gt;&lt;/i&gt;: name that you want to give the column containing the error count (this can be a parameter) &lt;/span&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;i&gt;&lt;b&gt;Error descriptions field&lt;/b&gt;&lt;/i&gt;: name that you want to give the column containing the error description (this can be a parameter) &lt;/span&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;i&gt;&lt;b&gt;Error fields field&lt;/b&gt;&lt;/i&gt;: name that you want to give the column containing the error fields (this can be a parameter) &lt;/span&gt;&lt;span style="font-family: 
Arial,Helvetica,sans-serif; font-size: small;"&gt;&amp;nbsp; &lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;i&gt;&lt;b&gt;Error codes field&lt;/b&gt;&lt;/i&gt;: name that you want to give the column containing the error code (this can be a parameter) &lt;/span&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;Below those input fields you can specify which incoming fields make up the key of your record.&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&amp;nbsp;&lt;/span&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://lh5.googleusercontent.com/-Pp1gzFYAMFs/TXeZQH0Bp8I/AAAAAAAAAjk/XhHc4JjBa7s/s1600/screenshot019.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="https://lh5.googleusercontent.com/-Pp1gzFYAMFs/TXeZQH0Bp8I/AAAAAAAAAjk/XhHc4JjBa7s/s320/screenshot019.png" width="580" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;div style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;The rejects table will have the following layout.&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;ul&gt;&lt;li style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;id: This column contains all the key fields of your rejected record in comma-separated format (provided you specified them in the rejects step)&lt;/span&gt;&lt;span style="font-size: small;"&gt;&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;li style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;value: &lt;/span&gt;&lt;span 
style="font-size: small;"&gt;This column contains all the remaining (non-key) fields of your rejected record in comma-separated format&lt;/span&gt;&lt;span style="font-size: small;"&gt;&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;li style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;batch_id: Your batch ID (see KFF)&lt;/span&gt;&lt;/li&gt;&lt;li style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;transname: name of the transformation that generated the reject&lt;/span&gt;&lt;/li&gt;&lt;li style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;jobname: name of the job that generated the reject&lt;/span&gt;&lt;/li&gt;&lt;li style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;stepname: name of the step that generated the reject&lt;/span&gt;&lt;/li&gt;&lt;li style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;${error_count_field}: number of errors&lt;/span&gt;&lt;/li&gt;&lt;li style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;${error_descriptions_field}: error description&lt;/span&gt;&lt;/li&gt;&lt;li style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;${error_fields}: field that generated the error&lt;/span&gt;&lt;/li&gt;&lt;li style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;${error_codes}: error code&lt;/span&gt;&lt;span style="font-size: small;"&gt;&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif;"&gt;logdate: date/time of the reject&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;Simple enough to write some statistics on. 
Should I want to know which steps generate most errors, a simple query of the type: &lt;/span&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;span style="font-size: small;"&gt;select stepname, count(*)&lt;/span&gt;&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;span style="font-size: small;"&gt;from ${rejected_records_table}&lt;/span&gt;&lt;/div&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;group by stepname&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;would tell me enough.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;Now if you add the element time to this, storing your rejects for each nightly batch run, you can plot a nice evolution of how you have been handling data quality problems over time. Any data warehouse project will sooner or later require this kind of statistics, because data quality problems are there. 
The question is only when they will surface.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Arial,Helvetica,sans-serif; font-size: small;"&gt;&lt;b&gt;Good luck with it!&lt;/b&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-7411927904124144620?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/7411927904124144620/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/03/i-feel-rejected.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7411927904124144620'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7411927904124144620'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/03/i-feel-rejected.html' title='I feel rejected!'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='https://lh4.googleusercontent.com/-H2pkutR1S5o/TXeRB7LRIpI/AAAAAAAAAjQ/yN_jIS-FKn8/s72-c/screenshot013.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-3787743112724010749</id><published>2011-03-08T12:21:00.000+01:00</published><updated>2011-03-08T12:21:58.151+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data 
Integration - Kettle'/><category scheme='http://www.blogger.com/atom/ns#' term='Book'/><title type='text'>Another kettle book on its way ...</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: justify;"&gt;For the kettle aficionados out there, two excellent PDI books are already available, namely:&amp;nbsp;&lt;/div&gt;&lt;ul style="text-align: left;"&gt;&lt;li style="text-align: justify;"&gt;&lt;a href="https://www.packtpub.com/pentaho-32-data-integration-beginners-guide/book"&gt;Pentaho 3.2 Data Integration - Beginners Guide&lt;/a&gt;, by&amp;nbsp;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif; line-height: 18px;"&gt;&lt;a href="https://www.packtpub.com/authors/profiles/maria-carina-roldan"&gt;María Carina Roldán&lt;/a&gt;,&amp;nbsp;was released in April 2010.&lt;/span&gt;&lt;/li&gt;&lt;li style="text-align: justify;"&gt;Pentaho Kettle Solutions - Building Open Source ETL solutions with Pentaho Data Integration, by Matt Casters, Roland Bouman and Jos van Dongen (&lt;a href="http://kjube.blogspot.com/2010/10/pentaho-kettle-solutions-overview.html"&gt;extensively discussed in a previous post&lt;/a&gt;), has been available since September 2010.&lt;/li&gt;&lt;/ul&gt;&lt;div style="text-align: center;"&gt;&lt;div style="text-align: left;"&gt;&amp;nbsp; &lt;a href="https://www.packtpub.com/pentaho-32-data-integration-beginners-guide/book"&gt;&lt;img border="0" height="200" src="http://static.letsbuyit.com/filer/images/uk/products/original/187/24/pentaho-3-2-data-integration-beginner-s-guide-18724176.jpeg" width="161" /&gt;&lt;/a&gt;&amp;nbsp;&lt;a href="http://www.amazon.com/Pentaho-Kettle-Solutions-Building-Integration/dp/0470635177"&gt;&lt;img border="0" height="200" src="http://www.lybrary.com/images/0470942428.jpg" width="158" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;And if that is not enough, another book is in the making.&amp;nbsp;&lt;span 
class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif; line-height: 18px;"&gt;&lt;a href="https://www.packtpub.com/authors/profiles/maria-carina-roldan"&gt;María Carina Roldán&lt;/a&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&amp;nbsp;&lt;/span&gt;has teamed up with &lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif; line-height: 18px;"&gt;&lt;a href="https://www.packtpub.com/authors/profiles/adri%C3%A1n-sergio-pulvirenti"&gt;Adrián Pulvirenti&lt;/a&gt; to deliver the "&lt;a href="https://www.packtpub.com/pentaho-data-integration-4-cookbook/book"&gt;Pentaho Data Integration 4: Cookbook&lt;/a&gt;" (again with publishing house &lt;a href="https://www.packtpub.com/"&gt;PACKT&lt;/a&gt;).&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif; line-height: 18px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif; line-height: 18px;"&gt;The book will deliver a series of 'recipes' to solve typical (and less typical) kettle riddles, divided over 9 chapters with the following titles.&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif; line-height: 18px;"&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;/div&gt;&lt;ol&gt;&lt;li&gt;Working with Databases&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;&lt;/li&gt;&lt;li&gt;Reading and Writing Files&lt;/li&gt;&lt;li&gt;Manipulating XML Structures&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;&lt;/li&gt;&lt;li&gt;File Management&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;&lt;/li&gt;&lt;li&gt;Looking up for 
Data&lt;/li&gt;&lt;li&gt;Understanding How Data Flows&lt;/li&gt;&lt;li&gt;Executing and Reusing Jobs and Transformations&lt;/li&gt;&lt;li&gt;Integration with Pentaho Suite&lt;/li&gt;&lt;li&gt;Some More Useful Recipes&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif; line-height: 18px;"&gt;I have had the pleasure of reviewing chapters 1 to 7 so far, and will soon read chapters 8 and 9, which are still being written.&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif; line-height: 18px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif; line-height: 18px;"&gt;My impression so far is that the analogy with a cookbook is actually well chosen. The recipes are organised in logical groups, just as a cookbook has starters, rice dishes, desserts etc., but the recipes within one category, though logically related, can still be very different, like vanilla pudding and chocolate mousse. &amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif; line-height: 18px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif; line-height: 18px;"&gt;Not all of them may be to your liking, as each recipe tackles a different problem. &amp;nbsp;And some people just prefer chocolate over strawberry. Having said that, I'm not saying you shouldn't get the book because there might be some recipes you don't like or don't want to use. Actually, while reviewing the book, I found all the recipes interesting. 
And even if there are a few I won't use to literally cook the dish Maria and Adrian suggest, I will use the ideas I picked up in them to flavor my own dishes.&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif; line-height: 18px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;span class="Apple-style-span" style="line-height: 18px;"&gt;You can already get the book in RAW format from PACKT right now, or wait until the end of April, when all chapters are complete and the book becomes available.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-3787743112724010749?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/3787743112724010749/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/03/another-kettle-book-on-its-way.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3787743112724010749'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3787743112724010749'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/03/another-kettle-book-on-its-way.html' title='Another kettle book on its way ...'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' 
src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-3562070005395346215</id><published>2011-03-08T10:30:00.003+01:00</published><updated>2011-03-08T10:31:48.522+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><title type='text'>Blog for mobile devices</title><content type='html'>We noticed that about 5% of our blog visits are from mobile devices. This percentage has been increasing steadily over the last year.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="https://lh3.googleusercontent.com/-2yZQXFX5GgE/TXX2rGex1gI/AAAAAAAAAio/a6Cbfh5RK-A/s1600/screenshot005.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="https://lh3.googleusercontent.com/-2yZQXFX5GgE/TXX2rGex1gI/AAAAAAAAAio/a6Cbfh5RK-A/s640/screenshot005.png" width="580" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;Since November or December 2010, Google Blogger has (finally) supported adapted layouts for displaying blog pages on mobile devices. Given that some of you are using mobile devices to look at our site, we have enabled this feature on our blog. 
We hope it will improve the readability of our blog while you are on the move.&lt;/div&gt;&lt;br /&gt;What you should see when you go to our blog on your mobile device is the following:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="https://lh6.googleusercontent.com/-LzsyJI0aWH0/TXX03o_yAfI/AAAAAAAAAik/-Ds6RS8VZNk/s1600/screenshot004.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="https://lh6.googleusercontent.com/-LzsyJI0aWH0/TXX03o_yAfI/AAAAAAAAAik/-Ds6RS8VZNk/s320/screenshot004.png" width="220" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;Enjoy the reading. And don't hesitate to leave your comments.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-3562070005395346215?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/3562070005395346215/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/03/blog-for-mobile-devices.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3562070005395346215'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3562070005395346215'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/03/blog-for-mobile-devices.html' title='Blog for mobile devices'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' 
src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='https://lh3.googleusercontent.com/-2yZQXFX5GgE/TXX2rGex1gI/AAAAAAAAAio/a6Cbfh5RK-A/s72-c/screenshot005.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-6782037713165082405</id><published>2011-03-08T10:01:00.001+01:00</published><updated>2011-03-08T10:01:22.474+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Database development'/><title type='text'>Connect as sysdba</title><content type='html'>Sometimes, when working with Oracle database, you need to connect as sysdba or sysoper.&lt;br /&gt;&lt;blockquote&gt;A user must connect&amp;nbsp;AS SYSDBA&amp;nbsp;or&amp;nbsp;AS SYSOPER&amp;nbsp;if he wants to perform one of the tasks that require&amp;nbsp;&lt;a href="http://www.adp-gmbh.ch/ora/admin/system_privileges.html#sysdba"&gt;&lt;span class="Apple-style-span" style="color: black;"&gt;sysdba&lt;/span&gt;&lt;/a&gt;&amp;nbsp;or&amp;nbsp;&lt;a href="http://www.adp-gmbh.ch/ora/admin/system_privileges.html#sysoper"&gt;&lt;span class="Apple-style-span" style="color: black;"&gt;sysoper&lt;/span&gt;&lt;/a&gt;&amp;nbsp;&lt;a href="http://www.adp-gmbh.ch/ora/misc/users_roles_privs.html"&gt;&lt;span class="Apple-style-span" style="color: black;"&gt;privileges&lt;/span&gt;&lt;/a&gt;(such as to&amp;nbsp;&lt;a href="http://www.adp-gmbh.ch/ora/sqlplus/shutdown.html"&gt;&lt;span class="Apple-style-span" style="color: black;"&gt;shutdown&lt;/span&gt;&lt;/a&gt;&amp;nbsp;or to&amp;nbsp;&lt;a href="http://www.adp-gmbh.ch/ora/sqlplus/startup.html"&gt;&lt;span class="Apple-style-span" style="color: black;"&gt;startup&lt;/span&gt;&lt;/a&gt;&amp;nbsp;an&amp;nbsp;&lt;a href="http://www.adp-gmbh.ch/ora/admin/instance.html"&gt;&lt;span class="Apple-style-span" style="color: black;"&gt;instance&lt;/span&gt;&lt;/a&gt;. 
If he connects as SYSDBA, he becomes SYS, if he connects as SYSOPER, he becomes PUBLIC.&lt;/blockquote&gt;&lt;div&gt;On the command line, using SQL*Plus, this is pretty simple. Just add "as sysdba" to your connect statement.&lt;/div&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="color: black;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;c&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;onnect&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt; sys/my_secret_password &lt;b&gt;as sysdba&lt;/b&gt;&lt;/span&gt;&lt;/blockquote&gt;However, when you are working with a graphical user interface, logging in as sysdba isn't that obvious, basically because the login box you are presented with only lets you enter a username and a password. So where does the "as sysdba" part go?&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;i&gt;Using SQL Developer&lt;/i&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;Just add the "as sysdba" statement after your username. I needed this not so long ago and luckily found it after some Googling. 
My SQL*Plus skills aren't up to par, so being able to use a GUI is a real time saver.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-PGTWh8gQy5Y/TWYsVPV6BJI/AAAAAAAAAhQ/YOV1xG-muKI/s1600/screenshot001.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="90" src="http://2.bp.blogspot.com/-PGTWh8gQy5Y/TWYsVPV6BJI/AAAAAAAAAhQ/YOV1xG-muKI/s320/screenshot001.png" style="cursor: move;" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;b&gt;&lt;i&gt;&lt;br /&gt;&lt;/i&gt;&lt;/b&gt;&lt;/div&gt;&lt;b&gt;&lt;i&gt;Using Quest Toad&lt;/i&gt;&lt;/b&gt;&lt;br /&gt;If you are using Toad, the answer is fairly simple. 
They have provided a little drop down list called "connect as" which allows you to select between "normal", "sysdba" and "sysoper".&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://filedb.experts-exchange.com/incoming/2010/03_w12/271533/login.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://filedb.experts-exchange.com/incoming/2010/03_w12/271533/login.jpg" width="580" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-6782037713165082405?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/6782037713165082405/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/03/connect-as-sysdba.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6782037713165082405'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6782037713165082405'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/03/connect-as-sysdba.html' title='Connect as sysdba'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-PGTWh8gQy5Y/TWYsVPV6BJI/AAAAAAAAAhQ/YOV1xG-muKI/s72-c/screenshot001.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-216734875208994519</id><published>2011-03-08T07:47:00.003+01:00</published><updated>2011-03-08T07:57:08.930+01:00</updated><title type='text'>AD/AC</title><content type='html'>&lt;div&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;img src="http://lh4.ggpht.com/_C0PnWJwDRZY/TXXRAad925I/AAAAAAAAAig/DcmBMsbixRI/1299566672692.png" width="580" /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-216734875208994519?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/216734875208994519/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/03/adac.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/216734875208994519'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/216734875208994519'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/03/adac.html' title='AD/AC'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://lh4.ggpht.com/_C0PnWJwDRZY/TXXRAad925I/AAAAAAAAAig/DcmBMsbixRI/s72-c/1299566672692.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-2024000672472921337</id><published>2011-03-08T07:42:00.064+01:00</published><updated>2011-03-08T08:24:46.901+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><title type='text'>Mobile blogging</title><content type='html'>&lt;i&gt;&lt;span class="Apple-style-span" style="font-size: x-small;"&gt;This blog post wasn't written entirely on the Mobile Blogger app by Google. Some editing has been done afterwards using the standard Blogger web interface. All editing done afterwards is listed below.&lt;/span&gt;&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;Google finally released a &lt;a href="https://market.android.com/details?id=com.google.android.apps.blogger"&gt;mobile app for Blogger&lt;/a&gt;. Since I happen to be travelling a lot over the coming days, I thought I'd give the app a try, just to kill time on the road and get ready for the next PCG ;-)&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://ssl.gstatic.com/android/market/com.google.android.apps.blogger/ss-1-320-480-160-0-c3af2b4b7331671b6d0856a60727b131561cd253" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="https://ssl.gstatic.com/android/market/com.google.android.apps.blogger/ss-1-320-480-160-0-c3af2b4b7331671b6d0856a60727b131561cd253" width="192" /&gt;&lt;/a&gt;&lt;a href="https://ssl.gstatic.com/android/market/com.google.android.apps.blogger/ss-0-320-480-160-0-a02999a8034dd533a3c776dbe3074d860e1a6227" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="https://ssl.gstatic.com/android/market/com.google.android.apps.blogger/ss-0-320-480-160-0-a02999a8034dd533a3c776dbe3074d860e1a6227" width="192" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: 
justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;First of all, let me say I'm a fan of the available &lt;a href="https://market.android.com/developer?pub=Google+Inc."&gt;Google mobile apps&lt;/a&gt;. &lt;a href="https://market.android.com/details?id=com.google.android.gm"&gt;Gmail&lt;/a&gt; on Android is excellent. &lt;a href="https://market.android.com/details?id=com.google.android.apps.maps"&gt;Mobile Maps&lt;/a&gt; is beyond excellent. &lt;a href="https://market.android.com/details?id=com.google.android.stardroid"&gt;Skymaps&lt;/a&gt; is ultra cool, and though I hardly use &lt;a href="https://market.android.com/details?id=com.google.android.youtube"&gt;YouTube&lt;/a&gt; on my computer, I started using it on Android. Needless to say, I have expectations for this Mobile Blogger app.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;From the moment you fire up the app, you are connected to your Google account. Mobile single sign-on, nice. The app then presents you with your list of blogs, so you simply pick the right blog from the list and ... action.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Creating and editing a new mobile post with Mobile Blogger is pretty intuitive. Just hit the pencil button on the main screen and you are ready to go. There is a first text box for the title and a second for your post. 
Easy enough to start a new post, for instance while you are standing in a queue to check in your luggage at &lt;a href="http://www.ciampino-airport.info/"&gt;Ciampino airport&lt;/a&gt;.&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;img border="0" src="http://lh4.ggpht.com/_C0PnWJwDRZY/TXXPuP52NeI/AAAAAAAAAic/cQ3uYwAjojw/IMAG0934.png" width="580" /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;At the bottom of the screen there is a publish/save button, which allows you to publish your post or save a draft, same as in the web version of Blogger. I believe saving isn't done automatically (as it is in the web interface). A bit annoying, as you will have to remember to save your work regularly. Seems like MS Office 3 dot something all over again.&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;Next to the pencil button, there is a button to see your list of mobile posts. It is important to know that this list will only show mobile posts (published and draft versions). Any drafts or posts you may have made on your PC will not appear in this list. So forget about quickly adding a few lines to a post you were working on at home. You just don't have access to those. And vice versa, you don't see your mobile posts in the web interface. So there is zero interoperability. A missed opportunity if you ask me. &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Under the main text box, you find two buttons to add images. By now I can already tell you that you will probably use these more than actually writing text. An Android phone just isn't the right device for massive text input. Also, since you cannot format your text (as in using bold, italics, underline, different fonts, ...) in the Mobile Blogger app, creating a lot of text doesn't make sense. You have too few means to control the readability. 
So adding images will be what you'll do most using this app.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The first button allows you to take a snap and insert it right away in the text. Here's the boring view from my train seat.&amp;nbsp;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;img heigth="580" src="http://lh4.ggpht.com/_C0PnWJwDRZY/TXXPiRD12cI/AAAAAAAAAiQ/h849vxo6RXQ/1299518874008.png" /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;After you've taken the picture, the screen goes black for a second or two. What the app does in that moment isn't clear. I would guess it is uploading the image.&amp;nbsp;The second image button allows you to pick a picture from your Android's media gallery. Again, after selecting the image the app gets sluggish for a second or so.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The pictures you add will appear next to the buttons, and not in the text. This way you don't know where your pictures will appear in your text. And since there is no preview feature, I have no clue while writing this where the picture of my metro ride will appear. Not really very user friendly.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Just below the picture bar there is a line to type your tags. Unfortunately you will have to remember your usual tags, as the app doesn't seem to know my tag list. Again, a bit annoying, as I'd rather keep my tag list nice and clean.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;And finally, just above the publish/save buttons, there is a line which allows you to geotag your post. 
Since you can only add a single location, this seems to indicate you aren't supposed to write long posts: if you write in this Blogger app for over two hours on a high-speed train, your location changes regularly, and that isn't supported. &lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;All in all I must say I'm not overly impressed with this app. It is hardly a step up from the &lt;a href="http://www.blogger.com/tour_pst.g"&gt;SMS/MMS and email posting&lt;/a&gt;&amp;nbsp;that has been available for years. Sure, I can tag and geo-tag my posts now, but what extra functionality is there? In general it seems Blogger keeps lagging behind other blog software and services, also in the mobile field. Hopefully getting a first version on the market will be the beginning of some rapid improvements. The feedback form included in the app would seem to indicate Google wants to listen to its users. I really hope they make something of this app.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;PS: While walking past the tram stop close to my place, I noticed the light boards indicating the position and ETA of the trams finally became operational. Those boards have been hanging there for over a year without lighting up (except for a short period when they showed the time). 
Let us pray there is no analogy with Mobile Blogger.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;img src="http://lh5.ggpht.com/_C0PnWJwDRZY/TXXPjfJe4UI/AAAAAAAAAiU/Xg20uJHbiOs/IMAG1052.png" width="580" /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;&lt;u&gt;Editing done afterwards:&lt;/u&gt;&lt;/b&gt;&lt;/div&gt;The following editing of the text has been done using the standard Blogger interface.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;All pictures added using Mobile Blogger appear at the bottom of your post. I moved all images to the right place in the text, resized them and centered them.&lt;/li&gt;&lt;li&gt;I changed the alignment of all the text to 'justify' and added a blank line between all paragraphs, as the text wasn't really readable otherwise.&lt;/li&gt;&lt;li&gt;I put in the wrong label, so I updated that.&lt;/li&gt;&lt;li&gt;I've added some links to the text. I didn't figure out whether that could be done using the app.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;b&gt;&lt;u&gt;Conclusion:&lt;/u&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;As the app is today, I don't recommend using it for any serious mobile blogging. It is good for quickly uploading a picture with some text and geo-tagging it. 
Beyond that, the app will just let you down.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-2024000672472921337?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/2024000672472921337/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/03/mobile-blogging.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/2024000672472921337'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/2024000672472921337'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/03/mobile-blogging.html' title='Mobile blogging'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://lh4.ggpht.com/_C0PnWJwDRZY/TXXPuP52NeI/AAAAAAAAAic/cQ3uYwAjojw/s72-c/IMAG0934.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-5772533470740194836</id><published>2011-02-24T11:00:00.001+01:00</published><updated>2011-03-08T09:59:47.074+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Database development'/><title type='text'>Kill or protect the Toad?</title><content type='html'>&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;b&gt;A short story about closed source killing real open source and commercial open source killing closed source.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If you 
have ever come close to an &lt;a href="http://www.oracle.com/us/products/database/index.html"&gt;Oracle datapond&lt;/a&gt;&amp;nbsp;in - let's say - the last 15 years or so, then you must have seen the &lt;a href="http://en.wikipedia.org/wiki/Toad"&gt;Toad&lt;/a&gt; sitting somewhere close to the pond. If you don't recall seeing any of these little animals, the picture provided here shows what they look like. They are extremely attracted to Oracle ponds and the fishermen surrounding them. So really, you must have seen them. And if you haven't seen them you most definitely &lt;a href="http://www.naturenorth.com/spring/sound/shfrsnd.html"&gt;must have heard them&lt;/a&gt;, because they make a specific noise when waking up.&lt;br /&gt;&lt;a href="http://www.orafaq.com/wiki/images/4/49/Toad_frog.gif" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img align="right" border="0" src="http://www.orafaq.com/wiki/images/4/49/Toad_frog.gif" width="100" /&gt;&lt;/a&gt;&lt;br /&gt;The Toad stems from the family of the &amp;nbsp;&lt;a href="http://www.quest.com/products/"&gt;Quest&lt;/a&gt;us Questionalibus. Now a very interesting fact about this Questus family is that they are known to be a very dominant species who are very protective of their &lt;a href="http://en.wikipedia.org/wiki/Proprietary_software"&gt;extremely closed ecosphere&lt;/a&gt;&amp;nbsp;(sometimes called the orasphere). Many assume that their dominance just stems from the natural 'survival of the fittest' paradigm, but there is a large base of scientists and biologists that believe the Toad dominance in the orasphere is unnatural.&lt;br /&gt;&lt;br /&gt;The Toads have a very clever way of protecting their environment. One way they cope with genetic competition is that, if a related species shows up, they will first invite it into their family and mate with it. After this first phase of hospitality, however, they will slowly take over some of the other animal's genes. 
Also, once the animal is invited into the Quest ecosphere and the initial gene exchange has been accomplished, the poor newcomer will be excluded from any sexual activity, thus condemning it to extinction.&amp;nbsp;This happened about 6 or 7 years ago with a small species called the &lt;a href="http://sourceforge.net/projects/tora/"&gt;Tora&lt;/a&gt;. Shortly after the &lt;a href="http://www.cuddletech.com/articles/oracle/node75.html"&gt;Tora was invited into the orasphere&lt;/a&gt;&amp;nbsp;somewhere during 2004, its sexual &lt;a href="http://sourceforge.net/mailarchive/forum.php?forum_name=tora-announce"&gt;activity dropped to zero&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://blogspawn.files.wordpress.com/2009/04/toad.jpg?w=450&amp;amp;h=303" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://blogspawn.files.wordpress.com/2009/04/toad.jpg?w=450&amp;amp;h=303" width="580" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;Interventions from animal protection groups didn't help much. To this day, the poor Tora species is barely alive, and its genetic development has come to a near stop while the Toad species has flourished.&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;Over the last few years, however, a new evolution has taken place. The &lt;a href="http://en.wikipedia.org/wiki/Larry_Ellison"&gt;owners of the Oracle dataponds&lt;/a&gt;&amp;nbsp;have been watching the rapid growth rate of the Quest frog with great attention. 
Scared by the effects that a too dominant species could have on the pond's ecosphere, they decided to introduce their own species, with the scientific name of &lt;a href="http://www.oracle.com/technetwork/developer-tools/sql-developer/overview/index.html"&gt;Oracle SQLD&lt;/a&gt;.&amp;nbsp;&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;script src="http://www.gmodules.com/ig/ifr?url=http%3A%2F%2Fwww.google.com%2Fig%2Fmodules%2Fgoogle_insightsforsearch_interestovertime_searchterms.xml&amp;amp;up__property=empty&amp;amp;up__search_terms=quest+toad%7Coracle+sql+developer&amp;amp;up__location=empty&amp;amp;up__category=0&amp;amp;up__time_range=empty&amp;amp;up__compare_to_category=false&amp;amp;synd=open&amp;amp;w=580&amp;amp;h=400&amp;amp;lang=en-US&amp;amp;title=Google+Insights+for+Search&amp;amp;border=%23ffffff%7C3px%2C1px+solid+%23999999&amp;amp;output=js" type="text/javascript"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;This genetically engineered animal has genes which are specifically adapted to the Oracle pond environment, thus making it a competitor that has sufficient gene strength to survive against the Toad. Additionally, they made the gene structure of their frog so open that it could adopt whatever gene code from the outside, allowing it to evolve rapidly. An amazing strength for this little frogger.&lt;br /&gt;&lt;br /&gt;Additionally, the pond owners carefully supervise the population, avoiding any mating between the species, as this would not be in their interest. The effects were and are amazing. 
The Toad population gradually dropped over the last few years, while the SQLDs took over the house, as shown by the graph above.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;Clearly this case teaches us a few things:&lt;/div&gt;&lt;ul&gt;&lt;li style="text-align: justify;"&gt;First of all, sometimes a species needs some support from us humans to break the natural dominance of another species that tries to close off and dominate the ecosphere. I believe everyone agrees that opening up the orasphere is a good thing for natural evolution.&lt;/li&gt;&lt;li style="text-align: justify;"&gt;Secondly, how far do we want to go with this human intervention? Clearly some diversity in the orasphere was a good thing, but where we seem to be heading now is wiping out the Toad species and letting the pond owner decide what the ecosphere should look like. Is that a good way to go?&lt;/li&gt;&lt;/ul&gt;&lt;div style="text-align: justify;"&gt;To end this article, I want to underline that I'm not a biologist. I just have an interest in the Oracle ecosphere as I sometimes go fishing there. I thought it would be interesting to write down my observations and see what other people think of this evolution of species. 
So I'm looking forward to the comments of all the frog lovers out there!&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-5772533470740194836?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/5772533470740194836/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/03/killing-toad.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/5772533470740194836'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/5772533470740194836'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/03/killing-toad.html' title='Kill or protect the Toad?'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-7485312860216266836</id><published>2011-02-14T14:32:00.000+01:00</published><updated>2011-02-14T14:32:38.330+01:00</updated><title type='text'>Slowly changing dimensions slowly appearing</title><content type='html'>&lt;div style="text-align: justify;"&gt;A short time ago, I came across this &lt;a href="http://toddmcdermid.blogspot.com/2010/01/kimball-method-slowly-changing.html"&gt;blogpost&lt;/a&gt; written by &lt;a href="http://toddmcdermid.blogspot.com/"&gt;Todd McDermid&lt;/a&gt; regarding his &lt;b&gt;&lt;a href="http://kimballscd.codeplex.com/"&gt;Kimball 
Method Slowly Changing Dimension Component&lt;/a&gt;&lt;/b&gt; for &lt;a href="http://msdn.microsoft.com/en-us/library/ms141026.aspx"&gt;Microsoft SQL Server Integration Services&lt;/a&gt; (MSSSIS). The first version of this component was released sometime in 2008 as an open source contribution to MSSSIS and has grown tremendously popular since then.&amp;nbsp;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Now, anyone who has ever built a data warehouse using the &lt;a href="http://www.ralphkimball.com/"&gt;Ralph Kimball &lt;/a&gt;approach will not be amazed by the popularity of such a component, as &lt;a href="http://en.wikipedia.org/wiki/Slowly_changing_dimension"&gt;slowly changing dimensions&lt;/a&gt; are a &lt;b&gt;quintessential element of &lt;a href="http://en.wikipedia.org/wiki/Dimensional_modeling"&gt;dimensional modeling&lt;/a&gt;&lt;/b&gt;, explained by Kimball in his very first writings on dimensional modeling. (See: "&lt;a href="http://www.rkimball.com/html/articles_search/articles1997/9708d15.html"&gt;A dimensional modeling manifesto&lt;/a&gt;", DBMS magazine, 1997) &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Those among you who know MSSSIS a bit might know that Todd's component &lt;a href="http://microsoft-ssis.blogspot.com/2011/01/slowly-changing-dimension-alternatives.html"&gt;isn't the only way&lt;/a&gt; to achieve slowly changing dimensions using this Microsoft tool. Some &lt;b&gt;alternatives&lt;/b&gt; are available.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;First of all there is a slowly changing dimension &lt;b&gt;wizard&lt;/b&gt; available in MSSSIS, which will generate a slowly changing dimension for you. 
Unfortunately this wizard has some downsides:&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Obviously, a code generation wizard has the downside that if you tweak the generated code, all your edits will be overwritten when you rerun the wizard. Since the coding required to do slowly changing dimensions can be relatively complex - otherwise why would you need the wizard - that is a true downside. &lt;/li&gt;&lt;li&gt; Another downside of the wizard seems to be the performance. The generated code just doesn't perform very well. And that is again a real shame, because data warehousing usually deals with large volumes of data. Actually, if you didn't have large volumes, you probably wouldn't even need a data warehouse, let alone a slowly changing dimension. &lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Secondly, if you don't want to use the wizard, you can always resort to &lt;b&gt;writing the whole logic for implementing a SCD by yourself,&lt;/b&gt; using the existing MSSSIS components or some smart scripting and the &lt;a href="http://technet.microsoft.com/en-us/library/bb510625.aspx"&gt;T-SQL merge&lt;/a&gt; (oh, only available since SQL Server 2008). I don't even need to argue this case. If the logic to implement a SCD is so complex that you would want a wizard to generate it, why would you even want to write it manually?&lt;/li&gt;&lt;/ul&gt;So, in the end, &lt;b&gt;Todd's way seems to be the only right way&lt;/b&gt; to handle the implementation of slowly changing dimensions. One single component that handles it all, without any need for you to worry. And that conclusion brings me to the real question of this blog post.&amp;nbsp;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;Why would Microsoft want to release a data integration tool without any decent support for SCD? And why is it up to an open source initiative to actually fill that gap? 
&lt;/b&gt;MSSSIS was introduced with the release of SQL Server 2005, yet users had to live with a (buggy) wizard until Todd released a SCD component in September 2008. That is &lt;b&gt;three years&lt;/b&gt; before a decent and working alternative became available. With the popularity that data warehousing has enjoyed in the last decade, it is amazing that MSSSIS developers haven't marched down to &lt;a href="http://www.microsoft.com/presspass/insidefacts_ms.mspx"&gt;Redmond&lt;/a&gt; to slice up the SQL Server development team using the &lt;a href="http://en.wikipedia.org/wiki/Windows_Genuine_Advantage"&gt;genuine&lt;/a&gt; installation disks.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Having experience with ETL and data integration tools from the likes of Informatica, IBM, Oracle, ..., I cannot help but notice that &lt;b&gt;Microsoft isn't an isolated case&lt;/b&gt;. Most of the data integration vendors have been ignoring proper support for slowly changing dimensions, a concept that has been around for about 15 years now. Informatica Powercenter &lt;a href="http://www.informatica.com/products_services/powercenter/editions/standard_edition/Pages/standard_edition_new.aspx"&gt;to this day&lt;/a&gt; offers only a wizard to implement SCD. IBM Datastage has only included support for SCD since the &lt;a href="http://www.ibm.com/developerworks/data/tutorials/dm-0903datastageslowlychanging/index.html"&gt;release of Infosphere Datastage in 2009&lt;/a&gt;. How different from an open source product like &lt;a href="http://kettle.pentaho.com/"&gt;Kettle&lt;/a&gt; (aka Pentaho Data Integration), which already included a &lt;a href="http://wiki.pentaho.com/display/EAI/Dimension+Lookup-Update"&gt;SCD step&lt;/a&gt; in its very first release. 
&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;ul&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;Large data integration vendors, please hear me. Why oh why is it that we can expect support for SCD from open source initiatives but not from you?&amp;nbsp; Slowly changing dimensions are as elementary to data warehousing as the 'CREATE' statement is to a relational database. Wake up! And start delivering!&lt;/b&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-7485312860216266836?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/7485312860216266836/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/02/slowly-changing-dimensions-slowly.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7485312860216266836'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7485312860216266836'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/02/slowly-changing-dimensions-slowly.html' title='Slowly changing dimensions slowly appearing'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-119170093605393915</id><published>2011-02-10T23:12:00.000+01:00</published><updated>2011-02-10T23:12:57.586+01:00</updated><title type='text'>The box maybe virtual but I'm still stuck with 
Windows</title><content type='html'>&lt;div style="text-align: justify;"&gt;Running a Windows machine in &lt;a href="http://www.virtualbox.org/"&gt;VirtualBox&lt;/a&gt; seamless mode on top of &lt;a href="http://www.kubuntu.org/"&gt;Kubuntu&lt;/a&gt; can look pretty confusing. I mean, Windows Security Center right next to KPackageKit and Dolphin?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TUhA97l5CfI/AAAAAAAAAgw/nLdGeXZks6c/s1600/desktop004.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TUhA97l5CfI/AAAAAAAAAgw/nLdGeXZks6c/s400/desktop004.png" width="580" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: justify;"&gt;Anyhow, one of the few reasons I sometimes need to revert to Windows is to run either &lt;a href="http://www.ca.com/us/products/detail/CA-ERwin-Data-Modeler.aspx"&gt;CA ERwin&lt;/a&gt; or &lt;a href="http://www.sybase.be/products/modelingdevelopment/powerdesigner"&gt;Sybase PowerDesigner&lt;/a&gt;. 
It just seems that there aren't any fully featured data modelling tools available for Linux.&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TUhBUirjxuI/AAAAAAAAAg0/LvHQSrSd0ng/s1600/desktop001.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TUhBUirjxuI/AAAAAAAAAg0/LvHQSrSd0ng/s400/desktop001.png" width="580" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And of course, &lt;a href="http://kff.kjube.be/"&gt;KFF&lt;/a&gt; needs to be tested on Windows. It seems there are people using Kettle on Windows.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-119170093605393915?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/119170093605393915/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/02/box-maybe-virtual-but-im-still-stuck.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/119170093605393915'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/119170093605393915'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/02/box-maybe-virtual-but-im-still-stuck.html' title='The box maybe virtual but I&apos;m still stuck with Windows'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_C0PnWJwDRZY/TUhA97l5CfI/AAAAAAAAAgw/nLdGeXZks6c/s72-c/desktop004.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-1605777213480817245</id><published>2011-02-04T00:06:00.000+01:00</published><updated>2011-02-04T00:06:25.116+01:00</updated><title type='text'>Customer service from my mobile provider</title><content type='html'>&lt;table style="text-align: justify;"&gt;&lt;tbody&gt;&lt;tr&gt; &lt;td&gt;&lt;img align="texttop" border="0" height="320" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TUhE5vRGnHI/AAAAAAAAAg4/qo_KG6wNHG4/s320/P1050487.JPG" width="196" /&gt;&lt;/td&gt; &lt;td valign="top"&gt;&lt;div style="text-align: justify;"&gt;This blog post contains absolutely no information. So please, move on if you have better things to do. I warned you.&lt;br /&gt;&lt;br /&gt;On the other hand, if you are bothered by the service level offered by the help desk of your mobile provider, have a look at this picture from my Android phone and know that you are not alone. &lt;br /&gt;&lt;br /&gt;&lt;/div&gt;Seeing is believing! Yes indeed!&amp;nbsp;After I had - for the fourth time in four months - a problem with my invoice, I called my mobile provider, known as &lt;a href="http://www.learntarot.com/bigjpgs/maj15.jpg"&gt;Proximus&lt;/a&gt;, in order to have the error corrected. Alas, I never got through to them. 
I gave up after 1 hour and 5 minutes of listening to their promo music.&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px;"&gt;If you think you can do better, below you'll find the music that Proximus (ab)uses - for the record, I like(d) Wim Mertens - for the people stupid enough to remain on hold. Please try to listen to this tune for over an hour. Good luck.&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;iframe allowfullscreen="" frameborder="0" height="345" src="http://www.youtube.com/embed/LvZQOYzycVA?rel=0" title="YouTube video player" width="560"&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-1605777213480817245?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/1605777213480817245/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/02/customer-service-from-my-mobile.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/1605777213480817245'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/1605777213480817245'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/02/customer-service-from-my-mobile.html' title='Customer service from my mobile provider'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' 
src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_C0PnWJwDRZY/TUhE5vRGnHI/AAAAAAAAAg4/qo_KG6wNHG4/s72-c/P1050487.JPG' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-2676427369078294451</id><published>2011-01-31T17:02:00.000+01:00</published><updated>2011-01-31T17:02:43.986+01:00</updated><title type='text'>Magical indeed</title><content type='html'>&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Gartner have just released the magic BI quadrant for January 2011. &lt;b&gt;Magic indeed&lt;/b&gt;, as some things that are going on must be based on magic rather than on objective criteria.&lt;/span&gt;&lt;/div&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;January 2009&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;In January 2009, Gartner released a magic BI quadrant. Pentaho doesn't make it into this magical quadrant due to revenue requirements. Gartner added the following comment to their magical quadrant text:&lt;/span&gt;&lt;/div&gt;&lt;blockquote style="text-align: justify;"&gt;“However, while &lt;b&gt;they don’t meet the revenue requirement&lt;/b&gt;, Jaspersoft and Pentaho have emerged as viable players in the BI platform market and as such we invited these firms to take part in the Magic Quadrant user survey. Both open source vendors provide comprehensive BI platform capabilities that are comparable to traditional BI platform vendors. 
A key part of both vendors’ strategy is to forge OEM relationships with commercial independent software vendors (ISVs) looking to easily embed BI functionality at a low price point. Jaspersoft and Pentaho enable ISVs to OEM open-source BI components without being bound by the GNU General Public License (GPL) terms and conditions. Given their subscription-based model, both vendors need to provide exemplary support. This was in evidence in the MQ reference survey, as both Jaspersoft and particularly Pentaho scored strongly on the customer support question — higher than any of the megavendors.” Source: &lt;a href="http://sherlockinformatics.com/wordpress/business-intelligence-technology/pentaho-receives-attention-from-gartner"&gt;Sherlockinformatics&lt;/a&gt;&lt;/blockquote&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;b&gt;January 2010&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;In January 2010, Pentaho doesn't make it into the magic quadrant. Gartner explains why:&lt;/span&gt;&lt;br /&gt;&lt;blockquote style="text-align: justify;"&gt;"Beyond the emerging vendors, Gartner gave serious consideration, as it did last year, to including open-source BI suppliers in the Magic Quadrant. While this year, &lt;b&gt;both major open-source BI platform suppliers generated enough revenue to be included in the Magic Quadrant, they did not garner enough customer survey responses&lt;/b&gt;. Although they did not meet the references requirement, Jaspersoft and Pentaho have emerged as viable players in the BI platform market. Both open-source vendors provide comprehensive BI platform capabilities that are comparable in many functional areas with those of traditional BI platform vendors. 
A key part of both vendors' strategy is to forge OEM relationships with commercial independent software vendors (ISVs) looking to easily embed BI functionality at a low price point. Jaspersoft and Pentaho enable ISVs to embed their open-source BI components without being bound by the GNU General Public License terms and conditions. Given their subscription-based model, both vendors need to provide exceptional support. This was reflected in the Magic Quadrant customer survey, as both Jaspersoft and Pentaho scored strongly on the customer support question — higher than any of the megavendors for the second year in a row. Source: &lt;a href="http://www.gartner.com/technology/media-products/reprints/microsoft/vol10/article7/article7.html"&gt;Gartner&lt;/a&gt;&lt;/blockquote&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;b&gt;January 2011&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;In January 2011, Pentaho again doesn't make the magic quadrant. Gartner comments:&lt;br /&gt;&lt;blockquote style="text-align: justify;"&gt;Pentaho garnered enough survey customer responses for inclusion on the Magic Quadrant, and it indicated to Gartner early in October 2010 that its 2010 BI platform revenue would meet or exceed $15 million. However, Pentaho recently informed Gartner that its growth in BI platform revenue (separate from its stand-alone extraction, transformation and loading [ETL] revenue) was slower than expected and thus represented a much smaller percentage of its overall revenue. &lt;b&gt;This resulted in Pentaho falling well below the survey revenue inclusion requirement of $15 million&lt;/b&gt; (this will be reflected in the upcoming 2Q11 report, "Market Share: Business Intelligence, Analytics and Performance Management Software, Worldwide, 2010"). Subsequently, Pentaho was excluded from the Magic Quadrant this year. 
However,&lt;b&gt; it did provide enough customer references to be included in the Magic Quadrant Customer Survey research notes that will publish in 1Q11&lt;/b&gt;. Source: &lt;a href="http://www.gartner.com/technology/media-products/reprints/microsoft/vol2/article15/article15.html"&gt;Gartner&lt;/a&gt;&lt;/blockquote&gt;&lt;blockquote style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-size: x-small;"&gt;Remark: Gartner defines total software revenue as revenue that is generated from appliances, new licenses, updates, subscriptions and hosting, technical support, and maintenance. Professional services revenue and hardware revenue are not included in total software revenue.&lt;/span&gt;&lt;/blockquote&gt;&lt;b&gt;Enough sales in 2009 but not in 2010?&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;So if I read this right, according to Gartner, Pentaho&amp;nbsp;realized&amp;nbsp;enough revenue in 2009 (inclusion criterion: $15 million) but didn't realize enough revenue in 2010 (inclusion criterion: $15 million).&amp;nbsp;&lt;b&gt;&lt;i&gt;That would seem strange since Pentaho announced a 120% growth of bookings during the year 2010.&lt;/i&gt;&lt;/b&gt;&amp;nbsp;How is it possible they suddenly end up "well below the survey inclusion requirement of $15 million"?&amp;nbsp;&lt;b&gt;The whole thing makes me wonder about the decision criteria Gartner uses to compose their magical quadrant. How much do these analysts really play with the criteria to get a &lt;a href="http://en.wikipedia.org/wiki/Pizzo_(extortion)"&gt;wanted output&lt;/a&gt;?&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;&lt;/b&gt;Can someone enlighten me? 
Please?&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-2676427369078294451?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/2676427369078294451/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/01/magical-indeed.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/2676427369078294451'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/2676427369078294451'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/01/magical-indeed.html' title='Magical indeed'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-2935707733411948154</id><published>2011-01-31T14:06:00.000+01:00</published><updated>2011-01-31T14:06:09.195+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence - Pentaho'/><title type='text'>50 ways to make your report</title><content type='html'>&lt;div style="text-align: justify;"&gt;Pentaho Business Intelligence offers a variety of ways to make reports. A large variety. Or should I say a &lt;i&gt;really very large variety&lt;/i&gt;. In this post I'm trying to list out some of the options as well as how to manage that diversity. 
Given the title of this post, I couldn't help but include the appropriate background music. So please, hit play, and read away.&lt;/div&gt;&lt;br /&gt;&lt;center&gt;&lt;br /&gt;&lt;iframe allowfullscreen="" class="youtube-player" frameborder="0" height="390" src="http://www.youtube.com/embed/MG-0BWLybIQ?rel=0" title="YouTube video player" type="text/html" width="580"&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;8 tools to get the job done&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;For starters, I've tried to make a little drawing of the different tools included (or includable) in the Pentaho BI server. As you can see, &lt;b&gt;to make a simple report&lt;/b&gt;, you can already choose between 6 different reporting tools, namely:&lt;/div&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.pentaho.com/products/reporting/"&gt;Pentaho Reports&lt;/a&gt; (made with Pentaho Report Designer)&lt;/li&gt;&lt;li&gt;&lt;a href="http://community.pentaho.com/faq/waqr_faq.php"&gt;WAQR&lt;/a&gt;, the Web Ad-hoc Query Reporting tool, an online wizard to generate PRD reports.&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.eclipse.org/birt/phoenix/"&gt;BIRT reports&lt;/a&gt; (made with the Eclipse-based reporting system BIRT)&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.pentaho.com/products/analysis/"&gt;Pentaho Analyzer&lt;/a&gt; (LucidEra's ClearView product, acquired by Pentaho)&lt;/li&gt;&lt;li&gt;&lt;a href="http://wiki.pentaho.com/display/Reporting/JFreeReport+0.9"&gt;JFree Report&lt;/a&gt; (the Ad-hoc reporting engine Pentaho offered before Analyzer)&lt;/li&gt;&lt;li&gt;&lt;a href="http://code.google.com/p/pentahoanalysistool/"&gt;Saiku&lt;/a&gt; aka PAT.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;div style="text-align: justify;"&gt;And if that wasn't enough, I'm leaving out of the picture (literally) 2 tools for dashboarding, which could also easily be (ab)used to create a simple report.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;&lt;a 
href="http://www.pentaho.com/products/dashboards/"&gt;Pentaho Dashboards&lt;/a&gt;&amp;nbsp;&amp;nbsp;and&lt;/li&gt;&lt;li&gt;the &lt;a href="http://wiki.pentaho.com/display/COM/Community+Dashboard+Framework"&gt;Community Dashboard Framework&lt;/a&gt;&amp;nbsp;and the Community Dashboard Editor by &lt;a href="http://www.webdetails.pt/"&gt;WebDetails&lt;/a&gt;&amp;nbsp;&lt;/li&gt;&lt;/ul&gt;&lt;div style="text-align: justify;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TUaJUIubhbI/AAAAAAAAAgk/UIK8LNlQMxs/s1600/pic016.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em; text-align: center;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TUaJUIubhbI/AAAAAAAAAgk/UIK8LNlQMxs/s640/pic016.png" width="580" /&gt;&lt;/a&gt;OK. So if we also count the dashboarding tools, we have 8 different tools to make a "report". That is, by any Business Intelligence standard, a large choice. But there are more choices to make.&lt;/div&gt;&lt;br /&gt;&lt;b&gt;Endless ways to get to the data&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;If you decide to go for &lt;a href="http://www.ibridge.be/?p=171"&gt;&lt;b&gt;Pentaho Reporting&lt;/b&gt;&lt;/a&gt;, you will have to decide how to fetch your data. Again, there seems to be hardly any reason to lament the number of choices at your disposal. Here is what you get.&lt;/div&gt;&lt;ul&gt;&lt;li style="text-align: justify;"&gt;&lt;b&gt;JDBC&lt;/b&gt;: This allows you to define your own JDBC connection (or use an existing one) and manually write a &lt;a href="http://nl.wikipedia.org/wiki/SQL"&gt;SQL query &lt;/a&gt;that will be executed&amp;nbsp;against that connection.&amp;nbsp;&lt;/li&gt;&lt;li style="text-align: justify;"&gt;&lt;b&gt;Metadata&lt;/b&gt;: This method will use the &lt;a href="http://wiki.pentaho.com/display/ServerDoc1x/Pentaho+Metadata+Editor"&gt;Pentaho Metadata Layer&lt;/a&gt; to access the data. 
&lt;a href="http://wiki.pentaho.com/download/attachments/9798174/metadata_domain_diagram.png?version=6&amp;amp;modificationDate=1181612018000"&gt;You don't get to see the database&lt;/a&gt;, but you'll have to use the Metadata Query Builder to generate an MQL (Metadata Query Language) query. (Quick start guide &lt;a href="http://diethardsteiner.blogspot.com/2009/07/pentaho-metadata-editor.html"&gt;here&lt;/a&gt;)&lt;/li&gt;&lt;li style="text-align: justify;"&gt;&lt;b&gt;PDI&lt;/b&gt;: You can use a Kettle transformation as a "data source" for your report. This of course again opens up an endless series of options, as PDI can use even your grandmother as a data source provided there is a JDBC driver available. That opens up data sources such as: MS Access, MS Excel, flat files (fixed width and "something" separated), directory structures with file names, LDAP, Mondrian &amp;amp; OLAP, Salesforce data, SAP R/3 data, any SQL database with a JDBC driver, ... I guess we've made the point.&lt;/li&gt;&lt;li style="text-align: justify;"&gt;&lt;b&gt;OLAP: &lt;/b&gt;This option allows you to use an MDX query as a basis for your report.&lt;/li&gt;&lt;li style="text-align: justify;"&gt;&lt;b&gt;XML&lt;/b&gt;: How about using an XML file as a basis for your report&lt;b&gt;&amp;nbsp;&lt;/b&gt;and defining your query against it here?&lt;/li&gt;&lt;li style="text-align: justify;"&gt;&lt;b&gt;Advanced&lt;/b&gt;: Seemingly the people at Pentaho don't consider any of the above options 'advanced' enough, because under the advanced menu you'll find some more options to toy around with.&lt;/li&gt;&lt;ul&gt;&lt;li style="text-align: justify;"&gt;&lt;b&gt;custom JDBC&lt;/b&gt; connection&lt;/li&gt;&lt;li style="text-align: justify;"&gt;&lt;b&gt;scriptable data access&lt;/b&gt;: use beanshell, groovy, netrexx, javascript, xslt, jacl, jython&amp;nbsp;&lt;/li&gt;&lt;li style="text-align: justify;"&gt;&lt;b&gt;(named) java method invocation&lt;/b&gt;&lt;/li&gt;&lt;li style="text-align: 
justify;"&gt;&lt;b&gt;external&lt;/b&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TUaUfuREdcI/AAAAAAAAAgo/Qq7skEJy_Ts/s1600/pic017.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TUaUfuREdcI/AAAAAAAAAgo/Qq7skEJy_Ts/s320/pic017.png" width="580" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: justify;"&gt;Personally, I haven't gotten round to using all of these methods, and though they intrigue me, I also hope I'll never have to use all of them. That is just too much to get my head around.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: justify;"&gt;If you decide to go for &lt;b&gt;Analyzer, JFreeReport or Saiku&lt;/b&gt;, your options are much more limited. Basically they all live on top of &lt;a href="http://mondrian.pentaho.com/"&gt;Pentaho Analysis Services&lt;/a&gt; aka&amp;nbsp;&lt;a href="http://en.wikipedia.org/wiki/Piet_Mondrian"&gt;Mondrian&lt;/a&gt;.&amp;nbsp;So your only choice here is to create an MDX query. The difference between creating an MDX query with the 3&amp;nbsp;aforementioned&amp;nbsp;tools and with Pentaho Report Designer is that these tools have a nice GUI to create the MDX for you (drag and drop or point and click).&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: justify;"&gt;When using BIRT reporting, you get a series of options that are closer to PRD again. I haven't listed all the features out, but they are described &lt;a href="http://www.eclipse.org/birt/phoenix/intro/"&gt;here&lt;/a&gt;. 
The &lt;a href="http://download.eclipse.org/birt/downloads/examples/misc/BIRT2.1Demo/EclipseDemo.html"&gt;BIRT online demo&lt;/a&gt; also shows clearly how BIRT works.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;b&gt;Intermezzo&lt;/b&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: justify;"&gt;Depending on your reading speed, I believe your song must be finished by now, so maybe give this version a try.&lt;/div&gt;&lt;br /&gt;&lt;center&gt;&lt;br /&gt;&lt;iframe allowfullscreen="" class="youtube-player" frameborder="0" height="390" src="http://www.youtube.com/embed/aVXX6NFpcT8?rel=0" title="YouTube video player" type="text/html" width="580"&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;b&gt;The very best of&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;So why am I writing all of this out? Well, first of all, many customers don't understand the flexibility they have at their disposal when working with Pentaho. Often they have seen a demo or read some documentation, and they believe that what they have seen is THE way "it works" with Pentaho. Consequently they ignore the other 49 ways to make a report. So when the consultant comes in and shows the options, they usually say "ah, I didn't know that was possible" or "why didn't anyone tell me this could be done".&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;Once customers understand that 'THE way' doesn't exist, but that there are "50 ways to make your report", they automatically get to the next question: "what is the best way to make my report?". (Let's face it, people want to simplify things). And here the consulting work gets tricky, as it is impossible to make the answer fully customer independent. 
One thing, however, is sure: using all the "50 ways" in the same environment is not recommended. Using all the different possibilities will require a large set of skills from your IT personnel and will hamper the maintenance work on those reports.&amp;nbsp;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;So, imho, a key element of implementing Pentaho Reporting at a customer includes a clear study of which reporting tools and data access methods fit best in the customer's IT architecture, and a clear selection of which methods they should adopt as standards and which ones they should only use if the standard options don't work. Obviously this "customer strategy" should be&amp;nbsp;aligned with the official Pentaho road map.&lt;/div&gt;&lt;br /&gt;&lt;b&gt;Good, bad or ugly?&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;Now the &lt;a href="http://en.wikipedia.org/wiki/The_$64,000_Question"&gt;$64,000 question&lt;/a&gt;&amp;nbsp;is whether all this richness actually makes a good "Reporting strategy" from a customer point of view. You could say that the picture I made looks pretty ugly or at the least very confusing. And in my experience, that is often how customers perceive it. Once they understand how many possibilities there are, they usually are profoundly confused.&lt;br /&gt;&lt;br /&gt;What they overlook when making this assessment is that Pentaho is an open source initiative, which means that anyone can extend the capabilities of the BI server with new reporting possibilities. This happens and will continue to happen, because it is an immediate consequence of an open source environment. So customers must first of all understand that Pentaho solutions will allow them to do the same thing in more than one way.&lt;br /&gt;&lt;br /&gt;Now again, is that good or bad? I believe I have given the answer to that question already. 
If a customer doesn't overcomplicate his usage of Pentaho technology, and adopts clear standards, then Pentaho offers reporting strategies that are simple to learn and implement, as well as easy to maintain. It is up to the customer to make the right choices.&lt;br /&gt;&lt;br /&gt;And where does Pentaho stand in all this? As far as I can see, Pentaho should make sure that the full richness of possibilities is explained to its customers and that they are guided towards the right set of options. Over my career as a BI consultant I have seen many BI implementations. Other BI suites that allow for a high "diversity" of possible solutions, such as SAS BI or Microsoft BI, often resulted in Business Intelligence environments that became technologically hard to understand and impossible to maintain, solely because 50 different programmers with different opinions had come by. Are the vendors to blame for that? Not really. But still, some&amp;nbsp;guidelines&amp;nbsp;from the vendor would have helped those poor customers. My experience is that Pentaho delivers this kind of service to its customers. Pentaho's Support, which is extremely well appreciated by customers, typically includes advice that is crucial in a start-up phase, and that is a service that few BI vendors offer.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;b&gt;What I left out&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;While writing this post, I realized I left out some reporting options. 
I'll quickly throw in what I remember now, but there might be some more stuff.&amp;nbsp;If you are reading this post and want to add something, please add it to the comments section. I would love to see this grow into a completely complete overview :-)&lt;/div&gt;&lt;ul&gt;&lt;li&gt;You can create Excel based reports, using only Pentaho Data Integration and Excel Writer, see also my &lt;a href="http://kjube.blogspot.com/2011/01/excel-writer-fun.html"&gt;previous blog post&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;Similarly you can create PRD reports using Pentaho Data Integration and the PRD step, &lt;a href="http://www.ibridge.be/?p=190"&gt;as demonstrated here&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;I didn't mention anything on embedding Pentaho Reports into other applications, such as the &lt;a href="https://plugins.atlassian.com/plugin/details/31610"&gt;Confluence Pentaho reports&lt;/a&gt;&amp;nbsp;&lt;/li&gt;&lt;/ul&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;Outro&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;To end this post, I wanted to include a little tribute to Mr.&amp;nbsp;&lt;a href="http://nl.wikipedia.org/wiki/Steve_Gadd"&gt;Steve Gadd&lt;/a&gt;, the man who wrote the incredible drum riff that kicks off Paul Simon's "50 ways to leave your lover". An extremely unusual drum riff, but sometimes unusual methods deliver the best result. I guess Paul Simon was just lucky to have the right musician available who could deliver him the best groove to fit his song, even if that was a very unconventional one. 
Which shows in the end that diversity is good.&lt;/div&gt;&lt;br /&gt;&lt;center&gt;&lt;div style="text-align: -webkit-auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;iframe allowfullscreen="" class="youtube-player" frameborder="0" height="390" src="http://www.youtube.com/embed/sZZLLYEzKE8?rel=0" title="YouTube video player" type="text/html" width="580"&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;/center&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-2935707733411948154?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/2935707733411948154/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/01/50-ways-to-make-your-report.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/2935707733411948154'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/2935707733411948154'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/01/50-ways-to-make-your-report.html' title='50 ways to make your report'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://img.youtube.com/vi/MG-0BWLybIQ/default.jpg' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-7568152600393061696</id><published>2011-01-26T15:41:00.000+01:00</published><updated>2011-01-26T15:41:27.782+01:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='Data Integration - Kettle'/><title type='text'>Excel writer fun</title><content type='html'>&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;This post is about making some complex Excel reports using Pentaho Data Integration and the Excel Writer step.&lt;/span&gt;&lt;br /&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;The Excel Writer step&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Some time ago &lt;a href="http://type-exit.org/adventures-with-open-source-bi/about/"&gt;Slawomir Chodnicki&lt;/a&gt; (aka Slawo) released the &lt;a href="http://type-exit.org/adventures-with-open-source-bi/kettle-plugins/excel-writer-plugin/"&gt;Excel Writer Step version 1.2&lt;/a&gt; for Pentaho Data Integration. Not only did he deliver a wonderful piece of extra PDI functionality, but he also accompanied its release with a great &lt;a href="http://type-exit.org/adventures-with-open-source-bi/2010/12/using-the-excel-writer-step/"&gt;blog post&lt;/a&gt; as well as a seemingly endless list of samples.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TTxRDKeH1CI/AAAAAAAAAgY/kQKY3KnQkiw/s1600/pic001.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;img border="0" height="180" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TTxRDKeH1CI/AAAAAAAAAgY/kQKY3KnQkiw/s400/pic001.png" width="400" /&gt;&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span 
class="Apple-style-span" style="font-family: inherit;"&gt;Since the Excel Writer is donation ware, and Slawo is a big kettle contributor, it should come as no surprise that the Excel Writer will be included as standard in kettle 4.2. A great contribution, which I have been eager to use in a real-life situation.&lt;/span&gt;&lt;br /&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;The use case&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Last week, I used the Excel Writer for an assignment.&amp;nbsp;The reason the customer wanted the results in Excel was simple. They needed to manipulate both the data and the look and feel of the graphs before using them in Word and/or&amp;nbsp;PowerPoint documents. Some questions weren't even needed in the final report. So a lot of editing was needed after the creation of the initial report.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;The basic requirements were the following:&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Generate Excel reports from the responses to (multiple choice) questionnaires.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Create one Excel sheet with the questionnaire, and one sheet per question showing the answers.&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Each Excel report with answers should contain the figures both in a table and in a graph.&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: 
inherit;"&gt;Each&amp;nbsp;questionnaire&amp;nbsp;can consist of between 20 and 100 questions.&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Each question can have between 2 (yes/no) and 10 answers.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;The code&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;The basic code was pretty simple.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;One transformation to fetch all the questions of the questionnaire and write them to an Excel sheet.&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;One transformation to write all the answers to a sheet, to be executed once per question.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Plus some logic to clean up the previous version and zip the new version.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TUAl6_V_21I/AAAAAAAAAgc/sMQNvs34NP8/s1600/pic007.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;img border="0" height="296" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TUAl6_V_21I/AAAAAAAAAgc/sMQNvs34NP8/s400/pic007.png" width="400" /&gt;&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;The output - one Excel file?&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;My first 
idea was to generate one Excel workbook, containing 1 sheet with an overview of all the questions in the questionnaire, followed by 1 sheet per question with the graphs/tables for the answers. It was based on two sheet templates and the result was supposed to look like this.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TUArlVnvJQI/AAAAAAAAAgg/dUBdLeOpI0U/s1600/pic011.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;img border="0" height="206" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TUArlVnvJQI/AAAAAAAAAgg/dUBdLeOpI0U/s400/pic011.png" width="400" /&gt;&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;However, when I tried this I ran into the strange issue that the graphs on my Excel template disappeared. It took me some time to figure out that this was a limitation of the Excel Writer. 
A limitation that was even documented, but unfortunately I don't always read tooltips.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TTxPIU558GI/AAAAAAAAAgU/6YDAzNnCbI8/s1600/desktop005.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;img border="0" height="93" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TTxPIU558GI/AAAAAAAAAgU/6YDAzNnCbI8/s400/desktop005.png" width="400" /&gt;&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Anyhow, Slawo confirmed to me that the POI library used by the step has some limitations:&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"&gt;&amp;gt;&amp;nbsp;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;You may have hit a limitation here. In order to use a sheet as template&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse; font-family: 'Courier New', Courier, monospace; font-size: x-small;"&gt;&amp;gt; it needs to be copied (only in memory, but a copy non the less). 
If POI&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse; font-family: 'Courier New', Courier, monospace; font-size: x-small;"&gt;&amp;gt; does not understand things (like charts) it ignores them as best it can.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse; font-family: 'Courier New', Courier, monospace; font-size: x-small;"&gt;&amp;gt; But if you ask POI to copy stuff, it will ignore things it does not understand too.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;Basically it seems that when your sheet template gets copied, our friend the POI library will just leave out the stuff he (or she?) doesn't like.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;The output - many Excel files&lt;/b&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse; font-family: inherit;"&gt;So I started looking for a workaround that was still elegant enough. The answer was pretty simple: don't work with separate sheets, but with separate files. As soon as I disabled the use of the SheetTemplate and wrote the answers to each question to a new Excel file, all graphs and layout stayed intact.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;To make it a bit easier for the users to navigate the large number of output files, I used the hyperlink features of the Excel Writer step. Putting a hyperlink from the questionnaire to the right file with answers made the whole set of reports easily browsable. 
And yes, on every sheet with answers, a link back to the overview was inserted.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;I also added a zip step at the end to bundle all the reports into 1 zip file, making it easier for the user to receive them. (More on that in my next post, by the way.)&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Some Excel tricks&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Using Excel allowed me (or obliged me) to use some Excel report tricks, such as dynamically changing the range of values shown in your graph (based on the actual values that are filled in in your table). 
Or allowing the user to quickly switch the numbers in the graph from absolute figures to percentages.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;Even though I'm not such an Excel reporting fan, it was amazing to see that all of your "programming" in Excel actually continues to work nicely (as long as you don't use SheetTemplates that is :-) ).&lt;/span&gt;&lt;/div&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Conclusion&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Notwithstanding the issue with the 'disappearing' charts, I must say that I'm pretty pleased with the end result and the time it took to put the whole thing together. Development time was really minimal and, given the nice set of samples that Slawo has put out there, using the Excel Writer step in the correct way was really a piece of cake. And to round it off, I must agree with &lt;a href="http://pentahomusings.blogspot.com/2011/01/apology.html"&gt;Trout&lt;/a&gt; that Slawo is incredibly quick with problem analysis and answers. 
And I'm not expressing any opinion about Pentaho support with that :-)&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse; font-family: arial, sans-serif; font-size: 13px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-7568152600393061696?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/7568152600393061696/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/01/excel-writer-fun.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7568152600393061696'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7568152600393061696'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/01/excel-writer-fun.html' title='Excel writer fun'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_C0PnWJwDRZY/TTxRDKeH1CI/AAAAAAAAAgY/kQKY3KnQkiw/s72-c/pic001.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-7340447034701833069</id><published>2011-01-03T19:02:00.001+01:00</published><updated>2011-01-15T18:37:59.305+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - Kettle'/><title type='text'>Mailing New Years cards</title><content 
type='html'>It is no secret that at &lt;a href="http://www.kjube.be/"&gt;kJube&lt;/a&gt; we like &lt;a href="http://kettle.pentaho.com/"&gt;Pentaho Data Integration&lt;/a&gt;. This year, in order to send all our best wishes for the new year to customers and partners, we used our kettle mailer engine again (aka Norman Mailer). Since it is something we have been wanting to put out there, I thought it was a good idea to &lt;a href="http://www.kjube.be/code/mailer.zip"&gt;share a bit of code&lt;/a&gt;. Even though this isn't the whole solution, I thought there might be one or two people who still need to send cards. So help yourself.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The basics&lt;/b&gt;&lt;br /&gt;How does it work? If you unzip the code, you'll see that there is just one transformation, named "mailer.ktr".&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TSHtRuWVY3I/AAAAAAAAAgQ/jixLNl9RIrw/s1600/pic319.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="205" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TSHtRuWVY3I/AAAAAAAAAgQ/jixLNl9RIrw/s640/pic319.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;This&amp;nbsp;transformation uses two input files: "mailing_list.xls" and "test.html".&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;/div&gt;&lt;ul&gt;&lt;li&gt;"mailing_list.xls" contains the list of people to whom you want to send mails.&lt;/li&gt;&lt;li&gt;"test.html" is the standard mail that will be sent when running the transformation. Obviously you can create whatever mail template you want. You can provide it as an input parameter when running the transformation, as the name of the mail template is a named parameter. 
In the example you'll find our new year's card "2011.html".&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;The transformation also has an output file, which is the same as the input file. It will add a tab to your "mailing_list.xls" with the name of your mail template and the result of the mailing. By the way, we've used the &lt;a href="http://type-exit.org/adventures-with-open-source-bi/kettle-plugins/excel-writer-plugin/"&gt;Excel Writer step&lt;/a&gt; here. So you might need to add that to your kettle deployment, or you could just disable the tracking of results if you don't care about that part.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In order for the transformation to work you also need to add some variables to your kettle.properties, which are more or less self-explanatory.&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;KFF_MAIL_SMTP_SERVER&lt;/li&gt;&lt;li&gt;KFF_MAIL_SMTP_PORT&lt;/li&gt;&lt;li&gt;NORMAN_MAILER_SENDER_NAME&lt;/li&gt;&lt;li&gt;NORMAN_MAILER_SENDER_ADDRESS&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Ideas for extending&lt;/b&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Obviously this idea can be extended. Some things that have crossed my mind are the following.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;It would be nice to be able to insert variables into the mail template, which can then be replaced at run time. E.g. add somewhere in your mail template "Dear ${SALUTATION} ${LASTNAME}" and replace this with actual values from the mailing list. I guess &lt;a href="http://rpbouman.blogspot.com/2010/12/substituting-variables-in-kettle.html"&gt;Roland's latest blog&lt;/a&gt; post should offer some possibilities there.&lt;/li&gt;&lt;li&gt;Obviously, when you send these mails, the sender address that you use will probably get some "undeliverable" replies. Most of the time your mailing list isn't fully correct. Reading out the mailbox and automatically figuring out which email addresses are invalid would be a great feature. 
With the POP3 step and some regex magic that shouldn't be too hard either.&lt;/li&gt;&lt;li&gt;Another cute feature would be to let people "unsubscribe" from your mailing list. Again, providing a mail address and a standard subject like "Unsubscribe", in combination with the POP3 step and some regex, should take care of that.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;If anyone feels like extending this a bit, feel free to &lt;a href="http://www.opensourcebusinessintelligence.be/sit/index.php?section=11"&gt;contact us&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Oh, and a happy new year to all of you!&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/images/christmastree2011.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="298" src="http://www.kjube.be/images/christmastree2011.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-7340447034701833069?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/7340447034701833069/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2011/01/mailing-new-years-cards.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7340447034701833069'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7340447034701833069'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2011/01/mailing-new-years-cards.html' title='Mailing New Years cards'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_C0PnWJwDRZY/TSHtRuWVY3I/AAAAAAAAAgQ/jixLNl9RIrw/s72-c/pic319.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-8413524038750715436</id><published>2010-10-19T09:32:00.000+02:00</published><updated>2010-10-19T09:32:57.242+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - KFF'/><title type='text'>Table compare - test automation</title><content type='html'>&lt;div style="text-align: justify;"&gt;&lt;b&gt;Background&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Some time ago we released a &lt;a href="http://kjube.blogspot.com/2010/09/kff-slowly-coming-out-of-kitchen-closet.html"&gt;&lt;b&gt;&lt;i&gt;series of kettle plugins&lt;/i&gt;&lt;/b&gt;&lt;/a&gt; as part of the &lt;a href="http://kff.kjube.be/"&gt;&lt;b&gt;&lt;i&gt;Kettle Franchising Factory&lt;/i&gt;&lt;/b&gt;&lt;/a&gt;. 
The plugins (&lt;a href="http://code.google.com/p/kettle-franchise/downloads/detail?name=kff-plugins4.jar&amp;amp;can=2&amp;amp;q="&gt;&lt;b&gt;&lt;i&gt;kff-plugins3.jar&lt;/i&gt;&lt;/b&gt;&lt;/a&gt;&amp;nbsp;for PDI version 3.2.3 and above and &lt;a href="http://code.google.com/p/kettle-franchise/downloads/detail?name=kff-plugins4.jar&amp;amp;can=2&amp;amp;q="&gt;&lt;b&gt;&lt;i&gt;kff-plugins4.jar&lt;/i&gt;&lt;/b&gt;&lt;/a&gt;&amp;nbsp;for PDI 4.0.0 and above) are available on &lt;b&gt;&lt;i&gt;&lt;a href="http://code.kjube.be/"&gt;code.kjube.be&lt;/a&gt;&lt;span class="Apple-style-span" style="font-weight: normal;"&gt;&lt;span class="Apple-style-span" style="font-style: normal;"&gt;&amp;nbsp;but so far not much documentation has been provided.&lt;/span&gt;&lt;/span&gt;&lt;/i&gt;&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;One of the kettle steps that is available in those jar files is the &lt;b&gt;Table Compare&lt;/b&gt;. This step&amp;nbsp;does what its name says. It compares the data from two tables (provided they have the same layout). It finds differences between the data in the two tables and logs them.&amp;nbsp;We developed this plugin for acceptance test scenarios in large projects. Consider (as a hypothetical example :-) ) the following use case of a data integration tool migration.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Suppose you have a data warehouse which is being loaded using, well, let's say, &lt;a href="http://www.oracle.com/technetwork/developer-tools/warehouse/overview/index.html"&gt;&lt;b&gt;&lt;i&gt;Oracle Warehouse Builder&lt;/i&gt;&lt;/b&gt;&lt;/a&gt;. Oracle bought &lt;a href="http://www.oracle.com/corporate/press/2006_oct/sunopsis.html"&gt;&lt;b&gt;&lt;i&gt;Sunopsis&lt;/i&gt;&lt;/b&gt;&lt;/a&gt;&amp;nbsp;in 2006. Since then the development of OWB has somewhat stalled :-). 
Some time later Oracle announced the launch of a new product called ODI - Oracle Data Integrator (very appropriate name, btw). This product combines the best of both worlds (or so Oracle sales reps state), but most Oracle Warehouse Builder customers have known for a long time that migrating their code from OWB to ODI is not easy. And here, out of necessity, an opportunity arises. &lt;a href="http://kjube.blogspot.com/2010/08/release-4666-aka-bitter-pain-edition.html"&gt;&lt;b&gt;&lt;i&gt;If you are faced with a painful and costly migration&lt;/i&gt;&lt;/b&gt;&lt;/a&gt;, which you can only postpone as long as your support contract allows, &lt;b&gt;&lt;i&gt;why not move to a cheaper data integration tool&lt;/i&gt;&lt;/b&gt;, such as, let's say, &lt;b&gt;&lt;i&gt;&lt;a href="http://kettle.pentaho.com/"&gt;kettle&lt;/a&gt;&lt;/i&gt;&lt;/b&gt;?&amp;nbsp;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The above migration scenario is exactly the type of project we do at &lt;b&gt;&lt;a href="http://www.kjube.be/"&gt;&lt;i&gt;kJube&lt;/i&gt;&lt;/a&gt;&lt;/b&gt;. I'm not going to go into detail on this type of project, but one element I do want to underline here is the following:&lt;b&gt;&lt;i&gt; If you cannot automate user acceptance testing you can forget about doing this type of project&lt;/i&gt;&lt;/b&gt;.&amp;nbsp;The Table Compare step helps you automate exactly that.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;So what does the thing do?&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Conceptually the Table Compare does the following for each pair of tables you hand it. 
&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLy2S3sfsQI/AAAAAAAAAfU/zG1gqrIoJpA/s1600/pic249.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="240" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLy2S3sfsQI/AAAAAAAAAfU/zG1gqrIoJpA/s400/pic249.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;It will count the records in each table and make the result of that count available.&lt;/li&gt;&lt;li&gt;It will do a left, right and inner join between the two tables (the record counts of those joins aren't available).&lt;/li&gt;&lt;li&gt;All the records that appear only in the right or left join are logged as 'Errors'.&lt;/li&gt;&lt;li&gt;All the records that are common between the Reference and Compare table are put through a detailed compare on attribute level. All attributes that don't match are logged as 'Errors'.&amp;nbsp;&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;How to use it?&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;Now that you have had the conceptual explanation of the Table Compare, I guess it is time for the technical stuff. 
As you can see, the Table Compare step contains quite a few fields that require input.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TLy328d8ZZI/AAAAAAAAAfY/dsxPncwnl20/s1600/pic250.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="362" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TLy328d8ZZI/AAAAAAAAAfY/dsxPncwnl20/s400/pic250.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The "&lt;i&gt;Reference connection&lt;/i&gt;" and "&lt;i&gt;Compare connection&lt;/i&gt;" are the database connections from which the reference/compare table data will come.&lt;/li&gt;&lt;li&gt;The "&lt;i&gt;Reference schema field&lt;/i&gt;" and "&lt;i&gt;Compare schema field&lt;/i&gt;" contain the schema names for the reference/compare table.&lt;/li&gt;&lt;li&gt;The "&lt;i&gt;Reference table field&lt;/i&gt;" and "&lt;i&gt;Compare table field&lt;/i&gt;" contain the actual table names. This means that you could compare two tables with a different name, as long as they have the same column names.&lt;/li&gt;&lt;li&gt;The "&lt;i&gt;Key fields field&lt;/i&gt;" should contain a comma separated list of the fields that make up the 'primary' key of the table(s) you are comparing. The primary key is needed because without this information the two tables cannot be correctly joined.&lt;/li&gt;&lt;li&gt;The "&lt;i&gt;Exclude fields field&lt;/i&gt;" contains a comma separated list of columns that you want to exclude from the comparison. E.g. 
because they exist in the first table, but not in the second.&lt;/li&gt;&lt;li&gt;The "&lt;i&gt;Number of errors field&lt;/i&gt;" allows you to specify the name of the output column that will contain the total number of errors found for the comparison of your tables.&lt;/li&gt;&lt;li&gt;The "&lt;i&gt;Number of reference/compare table records field&lt;/i&gt;" allows you to specify the name of the field that will contain the actual number of records found in each table.&lt;/li&gt;&lt;li&gt;The "&lt;i&gt;Number of left/inner/right join errors field&lt;/i&gt;" allows you to specify the name of the field(s) that will contain the number of errors found for each join type.&lt;/li&gt;&lt;li&gt;The "Error handling key description input field" allows you to specify the name of the output field for the 'where clause' of the record that gave an error.&lt;/li&gt;&lt;li&gt;The "Error handling reference/compare value input field" allows you to specify the output field names for the actual values that differ.&lt;/li&gt;&lt;/ul&gt;&lt;b&gt;Example &lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If you find all of the above &lt;b&gt;pretty confusing&lt;/b&gt;, that is understandable. There are a lot of fields, but most of them matter little: they just let you choose how your output fields will be named and have little functional importance. Still, in order to improve your understanding of the subject, we thought an &lt;b&gt;example was in order&lt;/b&gt;.&lt;br /&gt;&lt;br /&gt;In order to show you the example we needed some tables in an &lt;b&gt;online database&lt;/b&gt; that we can compare. We found the information contained in the &lt;i&gt;&lt;b&gt;&lt;a href="http://www.ensembl.org/"&gt;Ensembl project&lt;/a&gt;&lt;/b&gt;&lt;/i&gt; very suitable for this purpose. 
What is the project about?&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;blockquote&gt;The Ensembl project produces genome databases   &lt;br /&gt;for vertebrates and other eukaryotic species, &lt;br /&gt;and makes this information   freely available online. &lt;/blockquote&gt;&lt;/div&gt;Basically this project has a large number of databases (one per species) that all have a similar structure. Perfect for our purpose. There are plenty of species available for comparison, but we picked:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLy-uEUIQKI/AAAAAAAAAfk/lNvW33U8HrQ/s1600/pic252.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="60" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLy-uEUIQKI/AAAAAAAAAfk/lNvW33U8HrQ/s200/pic252.png" width="200" /&gt;&lt;/a&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TLy-o_xBFkI/AAAAAAAAAfg/TlEwAhmxZMQ/s1600/pic253.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="66" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TLy-o_xBFkI/AAAAAAAAAfg/TlEwAhmxZMQ/s200/pic253.png" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;We just picked two tables from each database and put them through the Table Compare step for demonstration purposes. 
The transformation is shown below (and is also &lt;i&gt;&lt;b&gt;&lt;a href="http://www.kjube.be/code/dbcompare.ktr"&gt;available for download&lt;/a&gt;&lt;/b&gt;&lt;/i&gt;).&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TLy2QHAWrkI/AAAAAAAAAfQ/A0U2PGBVgwc/s1600/pic251.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="182" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TLy2QHAWrkI/AAAAAAAAAfQ/A0U2PGBVgwc/s400/pic251.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;As the first step we used the Data Grid step to decide which tables to run through the Compare step.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLzAN0MwvsI/AAAAAAAAAfo/Exofw_pzjn0/s1600/pic254.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="80" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLzAN0MwvsI/AAAAAAAAAfo/Exofw_pzjn0/s400/pic254.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Previewing the first output ('Comparison Statistics') delivers the following:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TLzBQSrFveI/AAAAAAAAAfs/K45iJNBKdiw/s1600/pic255.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="61" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TLzBQSrFveI/AAAAAAAAAfs/K45iJNBKdiw/s400/pic255.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;It shows that both the 'analysis' and 'attrib_type' tables have a different number of records in the human vs chimp database. 
(Luckily?)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Previewing the second output ('Comparison Error Details') shows some details on the actual differences (in this case the inner join details).&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TLzB_wXUsSI/AAAAAAAAAfw/ZpBtCKL3cd4/s1600/pic256.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="122" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TLzB_wXUsSI/AAAAAAAAAfw/ZpBtCKL3cd4/s400/pic256.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;Clearly the record with analysis_id=2 has different values for ALL columns in the table.&lt;br /&gt;&lt;br /&gt;Hopefully this sample helps you understand what the Table Compare step can do. The best way to see it is to download the .ktr and give it a spin. We'll also add the .ktr to the KFF project as a project template, so you'll also find the code in the next KFF release.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Improvements&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;We already know that the following improvements would be handy:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Allow the connections to be field names that are accepted from the previous step. That would allow testing across more than two connections.&lt;/li&gt;&lt;li&gt;Save the following statistics:&lt;/li&gt;&lt;ul&gt;&lt;li&gt;nrRecordsInnerJoin&lt;/li&gt;&lt;li&gt;nrRecordsLeftJoin&lt;/li&gt;&lt;li&gt;nrRecordsRightJoin&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;The 3 error fields are currently expected as input fields in the step; this should be corrected. Also, they might more appropriately be named output fields :-)&lt;/li&gt;&lt;/ul&gt;... 
but if you have further suggestions to improve this step, please let us know.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-8413524038750715436?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/8413524038750715436/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/09/table-compare-test-automation.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8413524038750715436'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8413524038750715436'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/09/table-compare-test-automation.html' title='Table compare - test automation'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_C0PnWJwDRZY/TLy2S3sfsQI/AAAAAAAAAfU/zG1gqrIoJpA/s72-c/pic249.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-8456773214497594259</id><published>2010-10-18T02:15:00.047+02:00</published><updated>2010-10-19T00:10:50.181+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - KFF'/><title type='text'>KFF &amp; Cookbook</title><content type='html'>&lt;div style="text-align: justify;"&gt;When we (&lt;i&gt;&lt;b&gt;&lt;a 
href="http://www.blogger.com/goog_2011445078"&gt;Matt&lt;/a&gt;&lt;a href="http://www.opensourcebusinessintelligence.be/sit/index.php?section=11"&gt; and myself&lt;/a&gt;&lt;/b&gt;&lt;/i&gt;) kicked off the &lt;i&gt;&lt;b&gt;&lt;a href="http://kff.kjube.be/"&gt;KFF project&lt;/a&gt;&lt;/b&gt;&lt;/i&gt;, one of the things we wanted to include was auto-documentation. Matt made a small proof of concept to generate PDF documentation, but that was never finalized. At the same time, while &lt;i&gt;&lt;b&gt;&lt;a href="http://www.ibridge.be/?page_id=2"&gt;Matt&lt;/a&gt;&lt;/b&gt;&lt;/i&gt;, &lt;i&gt;&lt;b&gt;&lt;a href="http://rpbouman.blogspot.com/2010/03/writing-another-book-pentaho-kettle.html"&gt;Roland&lt;/a&gt;&lt;/b&gt;&lt;/i&gt; and &lt;i&gt;&lt;b&gt;&lt;a href="http://www.tholis.com/"&gt;Jos&lt;/a&gt;&lt;/b&gt;&lt;/i&gt; were busy writing &lt;i&gt;&lt;b&gt;&lt;a href="http://kjube.blogspot.com/2010/10/pentaho-kettle-solutions-overview.html"&gt;Pentaho Kettle Solutions&lt;/a&gt;&lt;/b&gt;&lt;/i&gt; (Sorry, what were you thinking? &lt;i&gt;&lt;b&gt;&lt;a href="http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0470635177.html"&gt;You do not own a copy of this book?&lt;/a&gt;&lt;/b&gt;&lt;/i&gt;), the subject of auto-documentation came up too. Roland picked it up, and turned it into the &lt;i&gt;&lt;b&gt;&lt;a href="http://code.google.com/p/kettle-cookbook/"&gt;kettle cookbook&lt;/a&gt;&lt;/b&gt;&lt;/i&gt;, a great auto-documentation tool for &lt;i&gt;&lt;b&gt;&lt;a href="http://kettle.pentaho.com/"&gt;kettle&lt;/a&gt;&lt;/b&gt;&lt;/i&gt;.&lt;/div&gt;&lt;br /&gt;&lt;b&gt;Why did we want auto-documentation in the first place?&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;The reason we wanted auto-documentation is that &lt;b&gt;we are lazy&lt;/b&gt;. &lt;i&gt;&lt;b&gt;&lt;a href="http://plato.stanford.edu/entries/self-knowledge/"&gt;We know this&lt;/a&gt;&lt;/b&gt;&lt;/i&gt;. Actually, we have known this for a long time. 
So we needed a solution that would &lt;b&gt;minimize effort&lt;/b&gt; on our side.&lt;br /&gt;&lt;br /&gt;Also, it often turns out that customers do not really want to pay for documentation. They just see it as part of the development process and certainly don't want to pay for any time you put into it. So &lt;b&gt;minimizing effort also keeps costs low&lt;/b&gt;, which seems to please customers.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Another reason for wanting auto-documentation is that over the years we witnessed projects where the documentation of data integration code was something along the lines of a gigantic Word document filled with screenshots - those very bulky screenshots made using &lt;i&gt;&lt;b&gt;&lt;a href="http://en.wikipedia.org/wiki/Paint_%28software%29"&gt;MS (com)P(l)AINT&lt;/a&gt;&lt;/b&gt;&lt;/i&gt;. Obviously that kind of documentation stays up to date until 5 minutes after the next error in your data integration run. And &lt;b&gt;stale documentation&lt;/b&gt; is even worse than no documentation.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;So what was the way to go?&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;We quickly concluded that something or someone had to document the code for us - continuously. Since you cannot outsource everything to India, and since mind-reading solutions aren't quite there yet, we thought along the lines of generating documentation from the code itself. What we didn't know is that it could get even better: Roland would write the auto-documentation tool and in doing so really minimize the effort for everyone.&lt;/div&gt;&lt;br /&gt;&lt;b&gt;About kettle documentation possibilities&amp;nbsp;&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;Now before zooming in on the cookbook, I would like to highlight some nice documentation features that have been in kettle for quite some time. 
The examples below are taken from &lt;i&gt;&lt;a href="http://kff.kjube.be/"&gt;KFF&lt;/a&gt;&lt;/i&gt;, namely from the batch_launcher.kjb.&lt;/div&gt;&lt;br /&gt;&lt;i&gt;&lt;b&gt;1) Job/Transformation properties&lt;/b&gt;&lt;/i&gt;&lt;br /&gt;Every kettle job or transformation has a series of meta-data tags on the properties tab, accessible by right-clicking on the canvas of spoon (or through the menu).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLmsTh2bB_I/AAAAAAAAAeo/_nzo3gz90pw/s1600/pic235.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="185" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLmsTh2bB_I/AAAAAAAAAeo/_nzo3gz90pw/s320/pic235.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The available tags are the following:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Name: The name of the transformation/job. This name doesn't need to be equal to the physical name of the XML file in which you want to save the code, although not aligning the physical &lt;/li&gt;&lt;li&gt;Description: A short description. 
Short as in: fits on one line.&lt;/li&gt;&lt;li&gt;Extended description: A full text description&lt;/li&gt;&lt;li&gt;Status: Draft or Production&lt;/li&gt;&lt;li&gt;Version: A free text version number&lt;/li&gt;&lt;li&gt;Created by/at: ID of creator and timestamp of creation&lt;/li&gt;&lt;li&gt;Modified by/at: ID of modifier and timestamp of modification&lt;/li&gt;&lt;/ul&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TLmpDDyJ_nI/AAAAAAAAAeY/drrG_xcNLYc/s1600/pic233.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="280" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TLmpDDyJ_nI/AAAAAAAAAeY/drrG_xcNLYc/s400/pic233.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;div style="text-align: left;"&gt;In my experience this gives you quite a few fields in which to put some elementary descriptions of functionality.&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;i&gt;&lt;b&gt;2) Canvas notes&lt;/b&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;div style="text-align: justify;"&gt;First of all, the fact that there are really &lt;b&gt;no lay-out restrictions&lt;/b&gt; in how you organize a data integration job or transformation is a strong documentation feature by itself. Many ETL tools will oblige you to always work left to right, or oblige you to always see every step at attribute level. Often that makes the view a developer has of the canvas, well, not much of an overview. 
In kettle you do not run into that issue.&amp;nbsp;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Because you can design jobs in a modular way (using sub-jobs), you can also ensure that you never need to design a job/transformation that looks like the one below. (For the record: I didn't design the transformation below myself.) Obviously I'm now stating that a good data integration design makes documentation readable, which is a bit beyond pure documentation functionality, but still, it is an important thing to consider when thinking about auto-documenting your solution.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TLsKG8pv8-I/AAAAAAAAAes/AhYfkg0ubsY/s1600/pic236.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="191" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TLsKG8pv8-I/AAAAAAAAAes/AhYfkg0ubsY/s400/pic236.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;On top of the great lay-out possibilities, you can insert notes on the canvas of any job/transformation. They allow for free text comments (without lay-out possibilities). This is good for documenting things that still need finalizing, for highlighting certain elements of your job/transformation, for important remarks like 'don't ever change this setting', etc. 
&lt;/div&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TLuNiLffyuI/AAAAAAAAAew/Inl5QBQCiko/s1600/pic237.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="102" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TLuNiLffyuI/AAAAAAAAAew/Inl5QBQCiko/s400/pic237.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;div style="text-align: justify;"&gt;Although the notes aren't actually linked to any steps, the vicinity of a note to a step is good enough to show what step the comment actually belongs to. 
And in case you really want to link your comments to specific steps, there are also 'Step descriptions'.&lt;/div&gt;&lt;br /&gt;&lt;/div&gt;&lt;i&gt;&lt;b&gt;3) Step descriptions&lt;/b&gt;&lt;/i&gt;&lt;br /&gt;Step descriptions are available through a simple right click on the step you want to document.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLmrh1ouv6I/AAAAAAAAAek/-3KOCVOGScQ/s1600/pic234.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="147" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLmrh1ouv6I/AAAAAAAAAek/-3KOCVOGScQ/s400/pic234.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;A step description dialog opens up and you can write down any comments related to the step you clicked in free text format (no formatting).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLmpMkUSXQI/AAAAAAAAAeg/ANlKx6lHWZA/s1600/pic231.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="288" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLmpMkUSXQI/AAAAAAAAAeg/ANlKx6lHWZA/s400/pic231.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;All in all, kettle as a tool has great lay-out possibilities and sufficient documentation 'place holders' to stuff your comments in. The next thing is to get that information back out.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Cookbook&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;As I wrote in the intro of this post, &lt;i&gt;&lt;b&gt;&lt;a href="http://rpbouman.blogspot.com/"&gt;Roland Bouman&lt;/a&gt;&lt;/b&gt;&lt;/i&gt; put together an auto-documentation tool for kettle during the writing of &lt;i&gt;&lt;b&gt;&lt;a href="http://kjube.blogspot.com/2010/10/pentaho-kettle-solutions-overview.html"&gt;Pentaho Kettle Solutions&lt;/a&gt;&lt;/b&gt;&lt;/i&gt;. 
He presented this to the Pentaho Community even before the release of the book, both in a &lt;i&gt;&lt;b&gt;&lt;a href="http://wiki.pentaho.com/display/COM/August+18%2C+2010+-+Roland+Bouman+-+Kettle+Cookbook"&gt;webcast&lt;/a&gt;&lt;/b&gt;&lt;/i&gt; and at the &lt;i&gt;&lt;b&gt;&lt;a href="http://wiki.pentaho.com/display/COM/Pentaho+Community+Gathering+-+Portugal+2010"&gt;Pentaho Community Gathering 2010&lt;/a&gt;&lt;/b&gt;&lt;/i&gt; in Cascais (&lt;i&gt;&lt;b&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_RolandBouman.pdf"&gt;presentation here&lt;/a&gt;&lt;/b&gt;&lt;/i&gt;).&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;What does the cookbook do? Well, basically it will read all kettle jobs and transformations in a specific directory (INPUT_DIR) and generate html documentation for this code in another directory (OUTPUT_DIR) using the software you already have installed, namely kettle.&amp;nbsp;In other words, if you are a kettle user, you just need to tell the cookbook where your code is and where you want the documentation to go. I'm not sure it could get any simpler than that. Yet, &lt;b&gt;as far as I know this is the only data integration tool that actually is capable of auto-documenting itself&lt;/b&gt;!&amp;nbsp;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;Cookbook features&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The first feature I'd like to show is that all your code is transformed into &lt;b&gt;html pages&lt;/b&gt; which maintain the &lt;b&gt;folder structure&lt;/b&gt; that you might have given to your project. 
In my example I've auto-documented the /kff/reusable folder, which looks like this:&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TLuPB7DTMUI/AAAAAAAAAe0/ZDhIin82YKo/s1600/pic238.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TLuPB7DTMUI/AAAAAAAAAe0/ZDhIin82YKo/s400/pic238.png" width="243" /&gt;&lt;/a&gt;&lt;/div&gt;So basically per job/transformation you have 1 html page, which is located in a directory structure that matches perfectly your original directory structure. Plain and simple.&lt;br /&gt;&lt;br /&gt;Obviously the tree view shown here is clickable and allows you to navigate directly to any job/transformation you might want to explore.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;&lt;div style="margin: 0px;"&gt;On each page (for each job/transformation) quite an extensive amount of information is listed out.&amp;nbsp;First you find the&amp;nbsp;&amp;nbsp;&lt;b&gt;meta-data tags&lt;/b&gt; from the properties tab. The below screenshot matches the batch_launcher.kjb properties as shown above. 
Note that the fields "version" and "status" aren't exported for some reason but apart from that all the fields are there.&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TLuP2l8aCdI/AAAAAAAAAe4/_AMyJnF2Its/s1600/pic241.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="77" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TLuP2l8aCdI/AAAAAAAAAe4/_AMyJnF2Its/s200/pic241.png" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;After the meta-data elements, the &lt;b&gt;parameters&lt;/b&gt; a job might expect are listed. In the case of our batch_launcher.kjb these are the following. Since the named parameters are quite important for the understanding of a job or transformation, it is appropriate that they are listed at the top of the page.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLuQYaY_qTI/AAAAAAAAAe8/LLkknknKcbw/s1600/pic242.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="71" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLuQYaY_qTI/AAAAAAAAAe8/LLkknknKcbw/s400/pic242.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Next you'll find an export of the &lt;b&gt;actual canvas&lt;/b&gt; you see in spoon in your documentation, &lt;b&gt;including all the notes&lt;/b&gt;. Now this is true magic. The screenshots in the documentation are exactly like what you see on the canvas in spoon. And the steps are&amp;nbsp;&lt;b&gt;clickable&lt;/b&gt;. They'll bring you right to the job or transformation that the step refers to, or to the description of the step. In other words, &lt;b&gt;you can drill down&lt;/b&gt; from jobs to sub-jobs to transformations to steps as you would in spoon. 
That is no less than amazing!&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLuQyj-s6ZI/AAAAAAAAAfA/uUHPrO6w12o/s1600/pic244.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="247" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLuQyj-s6ZI/AAAAAAAAAfA/uUHPrO6w12o/s400/pic244.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;The &lt;b&gt;step descriptions&lt;/b&gt; themselves are listed lower on the page. In the below screenshot you'll see the step descriptions we entered for the step 'kff_logging_init' before. (Note that page breaks are lost.)&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLuRjd4yWsI/AAAAAAAAAfE/sZQ2Wsb3kFc/s1600/pic245.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="196" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLuRjd4yWsI/AAAAAAAAAfE/sZQ2Wsb3kFc/s400/pic245.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;However if you look at the step descriptions that do not just launch another job or transformation you even get some of the actual code. Look at this table input step where you actually get the &lt;b&gt;SQL code&lt;/b&gt; that is executed.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TLuUCjEpDGI/AAAAAAAAAfI/muKYh77TorU/s1600/pic246.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="182" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TLuUCjEpDGI/AAAAAAAAAfI/muKYh77TorU/s400/pic246.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;All in all, the cookbook generates &lt;b&gt;amazingly detailed documentation&lt;/b&gt;. 
In case you aren't convinced by the screenshots and explanation above, please check for yourself below (or in a &lt;a href="http://www.kjube.be/kffdoc/index.html"&gt;&lt;b&gt;&lt;i&gt;full browser window&lt;/i&gt;&lt;/b&gt;&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;&lt;iframe height="400" src="http://www.kjube.be/kffdoc/index.html" width="100%"&gt;Your browser does not support iframes.&lt;/iframe&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;KFF &amp;amp; Kettle-Cookbook&lt;/b&gt;&lt;br /&gt;After the above explanation it hardly needs clarifying that integrating KFF and the cookbook was peanuts. The KFF directory structure is clear.&lt;br /&gt;&lt;blockquote&gt;/kff/projects/my_customer/my_project/code &amp;nbsp;--&amp;gt; contains your code&lt;/blockquote&gt;&lt;blockquote&gt;/kff/projects/my_customer/my_project/doc &amp;nbsp;--&amp;gt; contains the documentation&lt;/blockquote&gt;So the INPUT_DIR and OUTPUT_DIR for connecting the cookbook to KFF are clear. 
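&lt;br /&gt;&lt;br /&gt;As a minimal sketch of that mapping (written in Python purely for illustration; the function name cookbook_dirs is made up and is not part of KFF or the cookbook), the two values follow directly from the project location:&lt;br /&gt;&lt;pre&gt;
```python
from pathlib import PurePosixPath


def cookbook_dirs(kff_root, customer, project):
    """Derive INPUT_DIR and OUTPUT_DIR for one KFF project.

    Hypothetical helper: it only mirrors the directory layout quoted
    above and does not call kettle or the cookbook.
    """
    project_dir = PurePosixPath(kff_root) / "projects" / customer / project
    return {
        "INPUT_DIR": str(project_dir / "code"),   # where the jobs/transformations live
        "OUTPUT_DIR": str(project_dir / "doc"),   # where the html documentation goes
    }


dirs = cookbook_dirs("/kff", "my_customer", "my_project")
for name, value in dirs.items():
    print("{}={}".format(name, value))
```
&lt;/pre&gt;For the example project this gives /kff/projects/my_customer/my_project/code as INPUT_DIR and /kff/projects/my_customer/my_project/doc as OUTPUT_DIR.&lt;br /&gt;&lt;br /&gt;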
The only thing needed was to add a step to the batch_launcher.kjb that calls the top-level job of the Cookbook and passes it two variables.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLuW8Mv6AlI/AAAAAAAAAfM/xk--R4dGNmA/s1600/pic247.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="72" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLuW8Mv6AlI/AAAAAAAAAfM/xk--R4dGNmA/s400/pic247.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;As I said, it was extremely simple to connect KFF to the Cookbook.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;So from our next release on, if you download and install KFF, you'll automatically have a download of the Kettle-Cookbook in there, and whether you want it or not, all your projects will be auto-documented. You just need to figure out how to share the /kff/projects/my_customer/my_project/doc directory with people who would actually like to read the manual.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;A big thanks to Roland!&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-8456773214497594259?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/8456773214497594259/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/10/kff-cookbook.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8456773214497594259'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8456773214497594259'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/10/kff-cookbook.html' title='KFF &amp; Cookbook'/><author><name>Jan 
Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_C0PnWJwDRZY/TLmsTh2bB_I/AAAAAAAAAeo/_nzo3gz90pw/s72-c/pic235.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-6499206000114508116</id><published>2010-10-14T21:01:00.002+02:00</published><updated>2010-10-19T00:11:28.461+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><title type='text'>IE6</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TKm784Ki00I/AAAAAAAAAdg/D8QUrUi0Wm4/s1600/GCalOnIE6.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" px="true" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TKm784Ki00I/AAAAAAAAAdg/D8QUrUi0Wm4/s400/GCalOnIE6.JPG" width="500" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;Thank you Google people for throwing this message in our faces several times a day!! &amp;nbsp;Unfortunately some of our clients are large organisations (banks, government, etc) who really believe that Internet Explorer 6 was the end of web browsing evolution. So please Google people, stop developing for the web, there really is no future in that. 
&amp;nbsp;)-:&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-6499206000114508116?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/6499206000114508116/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/10/ie6.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6499206000114508116'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6499206000114508116'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/10/ie6.html' title='IE6'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_C0PnWJwDRZY/TKm784Ki00I/AAAAAAAAAdg/D8QUrUi0Wm4/s72-c/GCalOnIE6.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-8090666131202649752</id><published>2010-10-12T07:15:00.000+02:00</published><updated>2010-10-10T16:11:15.632+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence - Pentaho'/><title type='text'>PCG10 in figures</title><content type='html'>PCG10, the Pentaho Community Meeting 2010 in Lisbon is over (safe to say after over 2 weeks), and attention on the live blog is slowly dying. 
So I thought, full attention on &lt;a href="http://kff.kjube.be/"&gt;&lt;b&gt;&lt;i&gt;KFF&lt;/i&gt;&lt;/b&gt;&lt;/a&gt; now.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But Paul Stoellberger tweeted the following statistics yesterday:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLC5uswHBdI/AAAAAAAAAeE/KzDvcrw9bV0/s1600/pic225.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="65" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TLC5uswHBdI/AAAAAAAAAeE/KzDvcrw9bV0/s400/pic225.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;which made me think of looking up some of the Google Analytics statistics to share. After all, we are into business intelligence, aren't we?&lt;br /&gt;&lt;br /&gt;So what has been the interest in the PCG10 presentations I added to the live blog post?&lt;br /&gt;&lt;br /&gt;Here are the general statistics for the first 2 weeks:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TLDAO4T4_PI/AAAAAAAAAeM/x09D9b1TfiI/s1600/pic227.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="250" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TLDAO4T4_PI/AAAAAAAAAeM/x09D9b1TfiI/s400/pic227.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Obviously interest was highest in the first days, dropping after the 5th day. However, even though we had only 15 page views or so per day after that initial period, people still spent some time reading the pages. 
I'm curious how long that attention will last.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TLC_wLkOsdI/AAAAAAAAAeI/cY40AYfwepo/s1600/pic226.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TLC_wLkOsdI/AAAAAAAAAeI/cY40AYfwepo/s400/pic226.png" width="520" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;So where are the visitors from? Well, mainly it would seem that the Pentaho Community Event is an American-European party. Attention from other continents is extremely limited.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TLDAdzEx1VI/AAAAAAAAAeQ/traE6pk9-mY/s1600/pic228.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TLDAdzEx1VI/AAAAAAAAAeQ/traE6pk9-mY/s400/pic228.png" width="520" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Top 20 countries (out of 37) are listed below. I was mostly amazed by Brazil. I would have expected more attention from that country, given the push to organize PCG next year :-) &amp;nbsp;And I was pleasantly surprised to see that adding the Dutch and Belgian contributions together (no contributions from Luxembourg) would put the Benelux in third position. 
Small countries, ahoi!&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TLDA0wgouGI/AAAAAAAAAeU/5_B64i_biUw/s1600/pic229.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TLDA0wgouGI/AAAAAAAAAeU/5_B64i_biUw/s400/pic229.png" width="520" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;And further, well, no more comments, let the figures speak for themselves I would say !&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-8090666131202649752?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/8090666131202649752/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/09/pcg10-in-figures.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8090666131202649752'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8090666131202649752'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/09/pcg10-in-figures.html' title='PCG10 in figures'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_C0PnWJwDRZY/TLC5uswHBdI/AAAAAAAAAeE/KzDvcrw9bV0/s72-c/pic225.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-2891323398847433837</id><published>2010-10-08T18:06:00.004+02:00</published><updated>2010-10-08T18:14:43.260+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence - Pentaho'/><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - Kettle'/><category scheme='http://www.blogger.com/atom/ns#' term='Book'/><title type='text'>Pentaho Kettle Solutions Overview</title><content type='html'>&lt;span class="Apple-style-span" style="font-family: Tahoma, 'Lucida Sans', Verdana, Arial, serif; font-size: 19px; color: rgb(51, 51, 51); line-height: 30px; "&gt;&lt;p style="margin-top: 0em; margin-right: 0em; margin-bottom: 0,5em; margin-left: 0em; padding-top: 0em; padding-right: 0em; padding-bottom: 0em; padding-left: 0em; line-height: 1,8em; "&gt;&lt;/p&gt;&lt;div style="color: rgb(0, 0, 0); font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 10px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; background-image: initial; background-attachment: initial; background-origin: initial; background-clip: initial; background-color: rgb(255, 255, 255); font: normal normal normal 13px/19px 'Lucida Grande', 'Lucida Sans Unicode', Tahoma, Verdana, sans-serif; line-height: normal; "&gt;&lt;p&gt;Dear Kettle friends,&lt;/p&gt;&lt;p&gt;Great news! Copies of our new book &lt;span mce_name="em" mce_style="font-style: italic;" class="Apple-style-span" style="font-style: italic; "&gt;&lt;a href="http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0470635177.html" mce_href="http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0470635177.html" target="_blank"&gt;Pentaho Kettle Solutions&lt;/a&gt; &lt;/span&gt;are finally shipping.  
&lt;a href="http://rpbouman.blogspot.com/" mce_href="http://rpbouman.blogspot.com/"&gt;Roland&lt;/a&gt;, Jos and myself worked really hard on it and, as you can probably imagine, we were &lt;a href="http://twitpic.com/2vot9f" mce_href="http://twitpic.com/2vot9f"&gt;really happy&lt;/a&gt; when we finally got the physical version of our book in our hands.&lt;/p&gt;&lt;p&gt;&lt;img class="alignleft" mce_style="float: left; border: 1px solid black; margin: 10px;" src="http://media.wiley.com/product_data/coverImage/77/04706351/0470635177.jpg" mce_src="http://media.wiley.com/product_data/coverImage/77/04706351/0470635177.jpg" alt="Book front" width="100" height="126" style="border-top-width: 1px; border-right-width: 1px; border-bottom-width: 1px; border-left-width: 1px; border-style: initial; border-color: initial; border-style: initial; border-color: initial; float: left; border-top-style: solid; border-right-style: solid; border-bottom-style: solid; border-left-style: solid; border-top-color: black; border-right-color: black; border-bottom-color: black; border-left-color: black; margin-top: 10px; margin-right: 10px; margin-bottom: 10px; margin-left: 10px; " /&gt;&lt;/p&gt;&lt;p&gt;So let's take a look at what's in this book, what the concept behind it was and give you an overview of the content...&lt;/p&gt;&lt;p&gt;&lt;span mce_name="strong" mce_style="font-weight: bold;" class="Apple-style-span" style="font-weight: bold; "&gt;&lt;u&gt;The concept&lt;/u&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;Given the fact that Maria's book called &lt;a href="https://www.packtpub.com/pentaho-3-2-data-integration-beginners-guide/book" mce_href="https://www.packtpub.com/pentaho-3-2-data-integration-beginners-guide/book"&gt;&lt;span mce_name="em" mce_style="font-style: italic;" class="Apple-style-span" style="font-style: italic; "&gt;Pentaho Data Integration 3.2&lt;/span&gt;&lt;/a&gt; was due when we started, we knew that a beginners guide would be ready by the time that this book was going 
to be ready.  As such we opted to look at what the data warehouse professional might need when he or she would start to work with Kettle.  Fortunately there is already a good and well-known checklist out there to see if you covered everything ETL related and it's called &lt;span mce_name="em" mce_style="font-style: italic;" class="Apple-style-span" style="font-style: italic; "&gt;&lt;a href="http://intelligent-enterprise.informationweek.com/showArticle.jhtml;jsessionid=OTOHMTX3KY2OXQE1GHRSKHWATMY32JVN?articleID=202405400" mce_href="http://intelligent-enterprise.informationweek.com/showArticle.jhtml;jsessionid=OTOHMTX3KY2OXQE1GHRSKHWATMY32JVN?articleID=202405400"&gt;The 34 subsystems of ETL&lt;/a&gt;, &lt;/span&gt;a concept by &lt;a href="http://rkimball.com/" mce_href="http://rkimball.com/"&gt;Ralph Kimball&lt;/a&gt; that was first featured in his book &lt;a href="http://www.amazon.com/Data-Warehouse-Lifecycle-Toolkit-Developing/dp/0471255475" mce_href="http://www.amazon.com/Data-Warehouse-Lifecycle-Toolkit-Developing/dp/0471255475" target="_blank"&gt;The Data Warehouse Lifecycle Toolkit&lt;/a&gt;.  And so we asked Mr Kimball's permission to use his list, which he kindly provided.  He was also gracious enough to review the related chapter of our book.&lt;/p&gt;&lt;p&gt;By using this approach we allow the users to flip to a certain chapter in our book and directly get the information they want on the problem they are facing at that time. For example, Change Data Capturing (subsystem 2, a.k.a. CDC) is handled in Chapter 6: Data Extraction.&lt;/p&gt;&lt;p&gt;In other words: we did not start with the capabilities of Kettle. We did not take every step or feature of Kettle as a starting point.  In fact, there are plenty of steps we did not cover in this book.  However, wherever a step or feature needed to be explained while covering all the sub-systems, we did so as clearly as we could.  
Rest assured though; since this book handles just about every topic related to data integration, all of the basic and 99% of the advanced features of Kettle are indeed covered in this book ;-)&lt;/p&gt;&lt;p&gt;&lt;span mce_name="strong" mce_style="font-weight: bold;" class="Apple-style-span" style="font-weight: bold; "&gt;&lt;u&gt;The content&lt;/u&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;After a gentle introduction into how ETL tools came about and more importantly how and why Kettle came into existence, the book covers 5 main parts:&lt;/p&gt;&lt;p&gt;&lt;u&gt;1. Getting started&lt;/u&gt;&lt;/p&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;This part starts with a primer that explains the need for data integration and takes you by the hand into the wonderful world of ETL.&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;Then all the various building blocks of Kettle are explained.  This is especially interesting for folks with prior data integration experience, perhaps with other tools, as they can read all about the design principles and concepts behind Kettle.&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;After that the installation and configuration of Kettle is covered. Since the installation is a simple unzip, that includes a detailed description of all the available tools and configuration files.&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;Finally, you'll get hands-on experience in the last chapter of the first part titled "An example ETL Solution - Sakila".  This chapter explains in great detail how a small but complex data warehouse can be created using Kettle.&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;u&gt;2. 
ETL&lt;/u&gt;&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;In this part you'll first encounter a detailed overview of the 34 sub-systems of ETL after which the art of Data Extraction is covered in detail.  That includes extracting information from all sorts of file types, databases, working with ERP and CRM systems, Data profiling and CDC.&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;This is followed by chapter 7 "Cleansing and Conforming" in which the various data cleansing and validation steps are covered as well as error handling, auditing, deduplication and last but not least scripting and regular expressions.&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;Finally this second part of the book will cover everything related to star schemas including the handling of dimension tables (chapter 8), loading of fact tables (chapter 9) and working with OLAP data (chapter 10).&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;u&gt;3. Management and deployment&lt;/u&gt;&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;The third main part of the book deals with everything related to the management and deployment of your data integration solution.  First you'll read all about the ETL development lifecycle (chapter 11), scheduling and monitoring (chapter 12), versioning and migration (chapter 13) and lineage and auditing (chapter 14).  As you can guess from the titles of the chapters, a lot of best practices, do's-and-don'ts are covered in this part.&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;u&gt;4. 
Performance and scalability&lt;/u&gt;&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;The 4th part of our book really dives into the often highly technical topics surrounding performance tuning (chapter 15), parallelization, clustering and partitioning (chapter 16), dynamic clustering in the cloud (chapter 17) and real-time data integration (chapter 18).&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;I personally hope that the book will lead to more performance-related &lt;a href="http://jira.pentaho.com/browse/PDI" mce_href="http://jira.pentaho.com/browse/PDI"&gt;JIRA&lt;/a&gt; cases since chapter 15 explains how you can detect bottlenecks :-)&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;u&gt;5. Advanced topics&lt;/u&gt;&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;The last part conveniently titled "Advanced topics" deals with things we thought were interesting to a data warehouse engineer or ETL developer who is faced with concepts like Data Vault management (chapter 19), handling complex data formats (chapter 20) or web services (chapter 21).  Indispensable in case you want to embed Kettle into your own software is chapter 22: Kettle integration.  It contains many Java code samples that explain to you how you can execute jobs and transformations or even assemble them dynamically.&lt;/div&gt;&lt;div mce_style="padding-left: 30px;" style="padding-left: 30px; "&gt;Last but certainly not least since it's probably one of the most interesting chapters for a Java developer is chapter 23: Extending Kettle.  
This chapter explains to you how you can develop step, job-entry, partitioning or database type plugins for Kettle in great detail so that you can get started with your own components in no time.&lt;/div&gt;&lt;p&gt;I hope that this overview of our new brain-child gives you an idea of what you might be buying into. Since all books are essentially a compromise between page count, time and money I'm sure there will be the occasional typo or lack of precision but rest assured that we did our utmost best on this one.  After all, we did each spend over 6 months on it...&lt;/p&gt;&lt;p&gt;Feel free to ask about specific topics you might be interested in to see if they are covered ;-)&lt;/p&gt;&lt;p&gt;Until next time,&lt;/p&gt;&lt;p&gt;Matt&lt;/p&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-2891323398847433837?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='related' href='http://www.amazon.com/Pentaho-Kettle-Solutions-Building-Integration/dp/0470635177' title='Pentaho Kettle Solutions Overview'/><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/2891323398847433837/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/10/pentaho-kettle-solutions-overview.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/2891323398847433837'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/2891323398847433837'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/10/pentaho-kettle-solutions-overview.html' title='Pentaho Kettle Solutions Overview'/><author><name>Matt 
Casters</name><uri>http://www.blogger.com/profile/12263548900215476529</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp2.blogger.com/_cHg9rXowGtw/R6DPxIh4iEI/AAAAAAAAAAQ/a2PKY2431ys/S220/MattGardenSmall.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-3680485619344452402</id><published>2010-10-04T21:54:00.000+02:00</published><updated>2010-10-04T21:54:06.121+02:00</updated><title type='text'>Twitter real-time BI</title><content type='html'>&lt;div style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none;"&gt;Something which a lot of you people already knew, but which I just discovered, is that my (and your) Tweets are being watched. Not in 'big brother is watching you', but rather as 'big marketeer is studying you'.&lt;br /&gt;&lt;br /&gt;What happened:&lt;br /&gt;&lt;br /&gt;This week I made a quick twitter reply to the following post of @magicaltrout in which he&amp;nbsp;was complaining his employer didn't pay him (yet).&lt;/div&gt;&lt;div style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none;"&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TKoqQNXACgI/AAAAAAAAAdk/QXdSDfROfns/s1600/pic220.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="65" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TKoqQNXACgI/AAAAAAAAAdk/QXdSDfROfns/s400/pic220.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;I made a quick reply to this that went as follows:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TKosGD5CtkI/AAAAAAAAAdo/D6PMEN3UA8A/s1600/pic221.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="62" 
src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TKosGD5CtkI/AAAAAAAAAdo/D6PMEN3UA8A/s400/pic221.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;Next thing that happened - within minutes - &lt;b&gt;I had a new follower on Twitter&lt;/b&gt;. Check this.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TKou7UIyFhI/AAAAAAAAAdw/HrDPuAUNTK0/s1600/pic222.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="268" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TKou7UIyFhI/AAAAAAAAAdw/HrDPuAUNTK0/s400/pic222.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;Clearly some companies scan Twitter to figure out whether you are a potential customer, then add you to their followers list, probably with the idea of following up even closer, or to start spamming you.&lt;br /&gt;&lt;br /&gt;When I looked into the Twitter account, it belonged to Fastmkn.com, a company that actually sets up this kind of service for its customers. 
Their website was clear about their services:&lt;/div&gt;&lt;div class="separator" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TKotWlW9t4I/AAAAAAAAAds/YUDXfN-wSe8/s1600/pic223.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="90" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TKotWlW9t4I/AAAAAAAAAds/YUDXfN-wSe8/s400/pic223.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;So, when you tweet, know your tweets are being 'mined', watch what you write, and make sure you have your hand on the anti-spam button.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://stylosoft.com/wp-content/uploads/2010/05/twitter_block-150x150.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://stylosoft.com/wp-content/uploads/2010/05/twitter_block-150x150.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; clear: both; text-align: left;"&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-3680485619344452402?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/3680485619344452402/comments/default' title='Post Comments'/><link rel='replies' 
type='text/html' href='http://kjube.blogspot.com/2010/10/twitter-real-time-bi.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3680485619344452402'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3680485619344452402'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/10/twitter-real-time-bi.html' title='Twitter real-time BI'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_C0PnWJwDRZY/TKoqQNXACgI/AAAAAAAAAdk/QXdSDfROfns/s72-c/pic220.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-6689855952075185115</id><published>2010-10-03T16:28:00.003+02:00</published><updated>2010-10-19T00:11:54.807+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><title type='text'>Pi</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://lh6.ggpht.com/_91SFbopY8BI/SrB0g_tiGoI/AAAAAAAACWk/tA8uqp9q1-8/math-jokes.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://lh6.ggpht.com/_91SFbopY8BI/SrB0g_tiGoI/AAAAAAAACWk/tA8uqp9q1-8/math-jokes.jpg" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-6689855952075185115?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://kjube.blogspot.com/feeds/6689855952075185115/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/10/pi.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6689855952075185115'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6689855952075185115'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/10/pi.html' title='Pi'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://lh6.ggpht.com/_91SFbopY8BI/SrB0g_tiGoI/AAAAAAAACWk/tA8uqp9q1-8/s72-c/math-jokes.jpg' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-5459283308086222390</id><published>2010-09-30T16:37:00.002+02:00</published><updated>2010-09-30T21:20:44.388+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - KFF'/><title type='text'>PCG10 KFF presentation</title><content type='html'>&lt;div style="text-align: justify;"&gt;Unfortunately I wasn't able to blog my own presentation live during the &lt;b&gt;&lt;a href="http://kjube.blogspot.com/2010/09/pentaho-community-gathering-live.html"&gt;Pentaho Community Gathering in Cascais&lt;/a&gt;&lt;/b&gt;, Portugal (PCG10), last Saturday. 
The Live Blog page drew a lot of attention, during and after the event - statistics will follow - and I was even asked a few times whether I would still write &lt;b&gt;a summary of the KFF presentation&lt;/b&gt; to go with the &lt;b&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen.pdf"&gt;slides&lt;/a&gt;&lt;/b&gt;. &lt;i&gt;&lt;b&gt;Well, I will do no such thing!&lt;/b&gt;&lt;/i&gt; Instead however ...&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Since the whole objective of our presentation was to somehow "launch" KFF - except for a handful of insiders, no one within the Pentaho Community had heard about KFF before - I thought it would be worthwhile to write a full walk-through of the presentation for everyone who might visit the 'PCG10 Live Blog'. So here it goes, &lt;b&gt;no summary, but the full presentation in blog format,&lt;/b&gt; plus some little extras at the end. Enjoy.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-size: large;"&gt;&lt;u&gt;&lt;b&gt;KFF, as presented at the Pentaho Community Gathering 2010&lt;/b&gt;&lt;/u&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF01.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF01.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;As the title slide of my presentation suggests, KFF is all about Pentaho Data Integration, often better known as kettle. 
KFF has ambitions to be an exciting addition to the existing toolset kettle, spoon, kitchen (all clearly visible in the picture) and at the same time be a stimulator for improvement of these tools.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;u&gt;&lt;b&gt;Why oh why?&lt;/b&gt;&lt;/u&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Any, and I mean any, consultant that has worked at least once with a data integration tool, be it Informatica, Datastage, MS Integration Services, Business Objects Data Integration, Talend (somehow forgot to name this one at PCG10), Sunopsis - Oracle Warehouse Builder - Oracle Data Integration, has been confronted with the fact that some elementary things are not available out of the box in any of these tools. I'm thinking of:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Job/transformation logging&amp;nbsp; without set-up or configuration&lt;/li&gt;&lt;li&gt;Integrated alerting mechanisms (for when things go wrong)&lt;/li&gt;&lt;li&gt;Integrated reporting&amp;nbsp;&lt;/li&gt;&lt;ul&gt;&lt;li&gt;as part of the alerting or&amp;nbsp;&lt;/li&gt;&lt;li&gt;just to understand the health of your data integration server&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Guidelines for a multi-environment (DEV, TST, UAI, PRD) set-up&amp;nbsp;&lt;/li&gt;&lt;li&gt;Easy code versioning and migration between environments&lt;/li&gt;&lt;li&gt;Automated archiving of your code&lt;/li&gt;&lt;li&gt;... etc &lt;/li&gt;&lt;/ul&gt;After some years - too many I would say - I came to the conclusion that whatever the data integration technology, I was always rewriting the same concept over and over again. And all customers have seemed more than happy with the "frameworks" I've built. 
So I started wondering how it was possible that data integration vendors were not covering the above requirements with a standard solution, if the requirements are the same across all customers. I took the discussion up with Matt, and he felt the same.&lt;br /&gt;&lt;br /&gt;Once we realized this, re-implementing the same concepts again and again became hard to bear.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF02.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF02.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Luckily Matt and I had the chance to do a lot of projects together, using kettle, and we started building something we could re-use on our projects back in 2005. With every new project we did with &lt;a href="http://www.kjube.be/"&gt;kJube&lt;/a&gt;, our 'solution' grew, and we got more and more convinced that we needed to share this.&lt;br /&gt;&lt;br /&gt;So in June 2010 we listed out all we had and decided to clean the code  and package a first version of what we had to show and share at the  Pentaho Community Gathering. &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF03.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF03.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;We soon noticed that the first version couldn't include nearly all we had ready. What we present at PCG10 is just a basic version to show you all what we're doing. 
The whole release schedule will take until January 2011, if new additions or change requests interfere.&lt;br /&gt;&lt;br /&gt;&lt;u&gt;&lt;b&gt;So what is KFF?&lt;/b&gt;&lt;/u&gt;&lt;br /&gt;&lt;u&gt;&lt;br /&gt;&lt;/u&gt;&lt;br /&gt;We decided to call our solution the Kettle Franchising Factory.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Franchising&lt;/b&gt; seemed a nice term because it stays nicely within the existing kettle, spoon, kitchen, chef, carte, etc metaphor. It indicates that the KFF objective is to scale up your data integration&amp;nbsp; restaurant to multiple 'locations' where you cook the same food. That's basically what we want. Make kettle deployments &lt;b&gt;multi environment, multi customer, whilst keeping the set-up standard&lt;/b&gt;.&lt;br /&gt;&lt;br /&gt;The term Factory refers to the fact that we want every part of the process to go as&amp;nbsp; speedy and automatic as possible. This factory contains all the tools to deploy kettle solutions as swiftly as possible.&lt;u&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/u&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF04.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF04.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The tools through which we reach those goals are several:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Some of the requirements we meet through proposing set-up &lt;b&gt;standards&lt;/b&gt;. We try to make as few things as possible dependent on standards or &lt;b&gt;guidelines&lt;/b&gt;; everything should be configurable, but large data integration deployments stay neat and clean only if some clear set-up standards are respected. 
Also, standards on parametrization need to be imposed if you want to make your code flexible enough to run on multiple environments without further modifications.&lt;/li&gt;&lt;li&gt;A lot of functionality is implemented using &lt;b&gt;reusable kettle jobs and transformations&lt;/b&gt;, often using named variables.&lt;/li&gt;&lt;li&gt;Quite a few &lt;b&gt;kettle plugins&lt;/b&gt; have been written too. We believe that when certain actions can be simplified by providing a kettle plugin, we should provide that plugin.&lt;/li&gt;&lt;li&gt;Up to now we have 4 project templates we want to include with the KFF. Some "projects" always have the same structure if one follows best practices, so why should we rewrite things?&lt;/li&gt;&lt;li&gt;&lt;b&gt;Scripting&lt;/b&gt;. Although limited, there is also some scripting involved in KFF.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;u&gt;&lt;b&gt;So let's go into details &lt;/b&gt;&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;A first element of the KFF we want to show is the 'batch_launcher.kjb'. This kettle job is designed to be a wrapper around your existing ETL code or one of the templates we'll ship with KFF. The objective is to make all calls to re-usable logic, such as logging, archiving, etc., in this wrapper without the need to modify your code.&lt;br /&gt;&lt;br /&gt;What does this job do (as of today)?&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The first step of this job will read the right configuration file(s) for your current project/environment. For this we've developed a step called the 'environment configurator'. 
So based upon some input parameters, the environment configurator will override any variables that (might) have been set in kettle.properties to ensure that the right variables are used.&lt;/li&gt;&lt;li&gt;The job 'kff_logging_init' will&lt;/li&gt;&lt;ol&gt;&lt;li&gt;create the logging tables (in case they don't exist yet), currently on MySQL or Oracle,&lt;/li&gt;&lt;li&gt;clean up the logging tables in case they still contain data,&lt;/li&gt;&lt;li&gt;check whether the previous run for this project finished (successfully),&lt;/li&gt;&lt;li&gt;create a 'batch run'.&lt;/li&gt;&lt;/ol&gt;&lt;li&gt;The next job calls one of our project templates, currently the datawarehouse template, but it can easily be replaced by the top-level job of your data integration project.&lt;/li&gt;&lt;li&gt;After the data integration code has finished, 'kff_logging_reports' generates standard reports on top of the logging tables. The reports are kept with the kitchen logs.&lt;/li&gt;&lt;li&gt;'kff_logging_archive'&amp;nbsp;&lt;/li&gt;&lt;ol&gt;&lt;li&gt;closes the 'batch_run' based on results in the logging tables and&lt;/li&gt;&lt;li&gt;archives the logging tables (more on that later)&lt;/li&gt;&lt;/ol&gt;&lt;li&gt;'kff_backup_code' makes a zip file of the data integration code, which is tagged with the same batch_run_id as the kitchen log file and the generated reports.&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF05.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF05.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;&lt;i&gt;How does the environment configuration work, and why is it necessary?&lt;/i&gt;&lt;/b&gt; Well, the fact that standard kettle only provides one kettle.properties file in which to put all your parameters is rather limiting when setting up a 
multi-environment kettle project. The way you actually switch between environments in a flexible way is by changing the content of variables.&amp;nbsp;So we created the environment configurator. I'm not going to elaborate on this again, since &lt;a href="http://kjube.blogspot.com/2010/08/kff-environment-configurator.html"&gt;I've blogged about this plug-in in August&lt;/a&gt; when we first released it. I believe that blog post elaborates more than enough on the usage of this step.&lt;br /&gt;&lt;br /&gt;Obviously the environment configurator is something that works when you execute code through kitchen, that is, in batch mode. However, whenever you fire up spoon, it will just read the properties files in your $KETTLE_HOME directory. We wanted to overcome that problem in the development interface as well, so we wrote a launcher script for spoon.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF06.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF06.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;If you have correctly set up your configuration files, kff_spoon_launcher.sh [No Windows script is available yet. We do accept contributions from people running Windows.] will automatically set the right configuration files at run time and fire up spoon on the environment you want. As a little addition, nothing more than a little hack, we also change the background of your kettle canvas. 
That way you see whether you are logged on in DEV, TST, UAI or PRD, which is good to know when you want to launch some code from the kettle interface.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;b&gt;So how about that system to create logging tables?&lt;/b&gt;&lt;span class="Apple-style-span" style="font-style: normal;"&gt;&amp;nbsp;Well, the logging tables we use are the standard job, transformation and step logging tables. We tried to stick to the existing PDI logging as much as possible and just add on top of that.&amp;nbsp;&lt;/span&gt;&lt;/i&gt;&lt;br /&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/i&gt;&lt;br /&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal;"&gt;What did we add?&lt;/span&gt;&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal;"&gt;We implemented the concept of a &lt;b&gt;batch logging&lt;/b&gt; table. Every time you launch a batch process, a record that covers your whole batch run is logged in this table. In this case it logs the execution of the top-level job. So yes, this is nothing but job logging, but since the top-level job has a specific meaning within a batch process, isolating its logging opens up possibilities.&lt;/span&gt;&lt;/i&gt;&lt;/li&gt;&lt;li&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal;"&gt;We also implemented the concept of a &lt;b&gt;rejects logging&lt;/b&gt; table. Kettle has great error handling; however, one feature we felt was missing is a way to standardize that error handling. Our reject plug-in merges all records that have been rejected by an output step into a common format and inserts them into our reject logging table. The full record is preserved, so information could theoretically be reprocessed later. [Question from Pedro Alves: "Is the reprocessing part of KFF?" 
Answer: No, since we don't believe automation of that is straightforward enough.]&lt;/span&gt;&lt;/i&gt;&lt;/li&gt;&lt;li&gt;&lt;b&gt;Logging tables are created on the fly&lt;/b&gt;. Why? Well, whenever you run your jobs/transformations on a new environment, you get those nasty errors that your logging tables don't exist. Why should you be bothered with that? If they don't exist, we create them.&amp;nbsp;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF07.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF07.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;/div&gt;&lt;ul&gt;&lt;li&gt;Creating the logging tables on the fly wasn't just done because we like everything to go automatically. Suppose you want to&amp;nbsp;&lt;b&gt;run two batch processes in parallel&lt;/b&gt;. In a set-up with a single set of logging tables your logging information would get mixed up. Not in our set-up. You can simply define a different set of logging tables for the second batch run and your logging stays nicely separated.&amp;nbsp;&lt;/li&gt;&lt;li&gt;Obviously, to implement the above, you need to be able to change the logging settings in your jobs and transformations at run-time. For this Matt has written some nifty&amp;nbsp;&lt;b&gt;logging parameters injection code&lt;/b&gt;&amp;nbsp;that actually injects the log table information and log connection into the jobs and transformations. More about that on the next slide.&amp;nbsp;&lt;/li&gt;&lt;li&gt;At the end of the batch run we also&amp;nbsp;&lt;b&gt;archive logging information&lt;/b&gt;. 
Even if you have been using different sets of logging tables, all information is merged back together, allowing historical reporting on your data integration processes. Also, the archive tables prevent your logging tables from filling up and making the kettle development interface sluggish when visualizing the logging.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF08.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF08.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The rejects step isn't the only plug-in we have written over the last few years. The next slide illustrates some other steps that have been developed.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Trim strings&lt;/li&gt;&lt;li&gt;Date-time calculator&lt;/li&gt;&lt;li&gt;Table compare&amp;nbsp;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;b&gt;&lt;a href="http://kjube.blogspot.com/2010/09/kff-slowly-coming-out-of-kitchen-closet.html"&gt;I have blogged about these steps before, so I will not write this out again.&lt;/a&gt;&lt;span class="Apple-style-span" style="font-weight: normal;"&gt;&amp;nbsp;Also, one step that isn't mentioned here, but which we developed too and which has been contributed back to kettle 4.0, is the &lt;a href="http://kjube.blogspot.com/2010/08/data-grid-step.html"&gt;&lt;b&gt;data grid step&lt;/b&gt;&lt;/a&gt;.&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF09.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF09.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;&lt;i&gt;Another aspect of KFF is the project 
templates&lt;/i&gt;&lt;/b&gt;. For the moment, what we have is rather meager (only the datawarehouse template is available), but we do have quite some stuff in the pipeline that we want to deploy.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The &lt;b&gt;datawarehouse template&lt;/b&gt; should grow into a 'sample' datawarehouse project containing lots of best practices and possibly a lot of reusable dimensions (such as a date dimension, time dimension, currency dimension, country dimension, ...).&amp;nbsp;&lt;/li&gt;&lt;li&gt;The data vault generator is a contribution from Edwin Weber which came to us through Jos van Dongen. We are still looking into how we can add it, but it seems promising.&lt;/li&gt;&lt;li&gt;The &lt;b&gt;campaign manager&lt;/b&gt; is a mailing application, also known as norman-mailer, which we use internally at kJube. It allows you to easily read out a number of email addresses, send mails, and capture responses from POP3.&lt;/li&gt;&lt;li&gt;The &lt;b&gt;db-compare template&lt;/b&gt;&amp;nbsp;automatically compares the data in a list of tables in two databases. It logs all differences it finds between the two copies of each table. It is something we've used for UAI testing when we need to prove to our customer that UAI and PRD are aligned.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF10.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF10.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;After the presentation Roland Bouman came to me with a great idea for another template. I will not reveal anything as he has his hands full with the cookbook for the time being, and we are busy with KFF. 
When the time is ripe, you'll hear about this template too.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;So to sum it all up: KFF aims to be a big box, with all of the contents below.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF11.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF11.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;We don't expect all of this to be in there from day one. Actually, the day we presented KFF at PCG10 was day one, so have some patience and let us add what we have over the next months.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;u&gt;How will KFF move forward?&lt;/u&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Well, we believe the first step was releasing something to the community. We'll keep on doing that. The code for the project is fully open source (GPL) and available on Google Code. Check &lt;a href="http://code.kjube.be/"&gt;code.kjube.be&lt;/a&gt; or go to the kettle-franchising project on Google Code. We'll listen to your feedback and adapt where possible!&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF12.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF12.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Also, we'll follow these basic guidelines:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Make KFF deployment as simple as possible. As simple as a kettle deployment is impossible, since kettle is deployed within KFF, but if you know kettle, you know what we mean.&lt;/li&gt;&lt;li&gt;We also believe that some of the functionality we have built doesn't belong in KFF but rather in kettle itself. We'll push those things back to Pentaho. 
(We'll try to find time to discuss with the kettle architect :-) )&lt;/li&gt;&lt;li&gt;If and when something should be automated/simplified in a plug-in, we'll do so.&amp;nbsp;&lt;/li&gt;&lt;li&gt;We believe we should integrate with other projects around kettle, such as the cookbook.&amp;nbsp;&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;br /&gt;&lt;b&gt;&lt;u&gt;Who's in on this?&lt;/u&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;For the moment, Matt and I are the drivers behind this "project". Older versions have been in production with some of kJube's customers for years. Sometimes they contribute, sometimes they are just happy it does what it should.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF13.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF13.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;We hope to welcome a lot of users and contributors over the next months.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;u&gt;Feedback&lt;/u&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;is always welcome!&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF14.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF14.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Thanks!&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF15.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/presentations/PCG10_JanAertsen_KFF15.png" width="570" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br 
/&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-5459283308086222390?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/5459283308086222390/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/09/pcg10-kff-presentation.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/5459283308086222390'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/5459283308086222390'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/09/pcg10-kff-presentation.html' title='PCG10 KFF presentation'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-2281903092043350570</id><published>2010-09-29T16:58:00.002+02:00</published><updated>2010-09-30T21:32:56.334+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence - Pentaho'/><title type='text'>PCG10 on PCG11</title><content type='html'>&lt;div style="text-align: justify;"&gt;Since I've been blogging so much about PCG10, I thought it was worth quickly gathering all the ideas about PCG11 that I heard at PCG10. 
I was probably able to catch only a fraction of them, but I count on the community to add the rest of the ideas as comments (or on the Pentaho Forums and Wiki, where they belong in order to get a good discussion going).&lt;/div&gt;&lt;br /&gt;&lt;b&gt;Improvement idea nbr 1: shorter twitter tag&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;I'm not sure this idea needs much explanation or even discussion. The hash tag PentahoMeetup10 was annoyingly long for people tweeting from the room. If the name of the Pentaho Community Gathering doesn't change (see improvement idea 2), I vote for #PCG11.&lt;/div&gt;&lt;br /&gt;&lt;b&gt;Improvement idea nbr 2: rename the Pentaho Community Gathering&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;I've heard rumours (@julianhyde) that Pentaho Open World would become the new name. I'm not sure if this is a strategic hint to L. Ellison to make Oracle buy Pentaho. But then again, Julian knows more about the company strategy than I do.&lt;/div&gt;&lt;br /&gt;&lt;b&gt;Improvement idea nbr 3: location&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;THE QUESTION at PCG10 was: "How will we ever top this location?". Indeed, Cascais and surroundings were an amazing location! The hotel where the event was held, with a view over the sunny beach, was fantastic! A tower of cupcakes and sweets in the room! Private espresso machines for PCG10! Nice little restaurants with fresh fish dishes right outside the door for a great lunch break! Webdetails and Xpand-IT even managed to organize a cloudless sky and +25°C temperatures.&amp;nbsp;&lt;/div&gt;&lt;br /&gt;The question to ask might well be: "Will anyone dare to organize PCG11?"&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;So how to go about setting up PCG11? Do a poll in the community on preferred locations? Ask different contenders for a proposal and let the best win? 
I don't have the answer, but the organizer of PCG11 will have a hard time topping PCG10, that is for sure.&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;A name that came up quite a lot for PCG11 was Brazil. Quite a few Brazilians have been following the live blog, and it seems that there's a very active Pentaho Community out there. But summer in Brazil is coming real soon, which brings us to improvement idea nbr 4. &lt;/div&gt;&lt;br /&gt;&lt;b&gt;Improvement idea nbr 4: timing&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;Another interesting idea mentioned was to go from a yearly to a half-yearly event. That would fit perfectly with PCG switching between the Northern and Southern hemispheres. We could have a PCG11-South and a PCG11-North. Or Up and Down? Or ... well, whatever.&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Improvement idea nbr 5: presentation formats&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;At PCG10, all 15 presentations were slideshows and/or demos, with the exception of Dan's (codek1) "presentation". He just went up front with a short prepared speech, actually interviewing the audience on methodology. That resulted in a very interesting group discussion.&amp;nbsp;&lt;/div&gt;&lt;br /&gt;So this might actually lead to an idea for varying the format of presentations. 
Some ideas I heard:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;group discussions on a specific topic (whoever is interested can participate)&lt;/li&gt;&lt;li&gt;a "what sucks" session (proposed by Grumpy)&lt;/li&gt;&lt;li&gt;architecture sessions around a white board&lt;/li&gt;&lt;/ul&gt;There are many possibilities, and with 15 presentations in one day (a heavy schedule) a change of format is most welcome.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Improvement idea nbr 6: extending PCG to a full OSBI event&lt;/b&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;Based on Aaron's presentation at PCG10, which discussed the idea that the Pentaho BI server is more and more becoming a Pentaho BI APPLICATION server that might/will also support JasperReports, BIRT, etc., the discussion arose whether to invite these open source projects to the community table as well. PCG would then become a true open source BI event. Personally I find that a very challenging idea; however, it does raise some practical questions. PCG is growing quickly as it is; wouldn't adding even more momentum make organisation of the event just too tough for any of the partners to take on? How do we balance the agenda between Pentaho and non-Pentaho stuff? After all, it's a Pentaho-sponsored event. &amp;nbsp; &lt;/div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;  &lt;br /&gt;&lt;b&gt;&lt;i&gt;&lt;span style="color: #4c1130;"&gt;So, that's it. I've posted whatever I remembered from the nice talks with great people, in fabulous surroundings, accompanied by great food, wine and beer. Writing them down is the best way to make sure these ideas don't fade. Most of these ideas come from great minds and fine people in the community. I hope posting them helps to stimulate the discussion. 
One thing is sure: PCG will keep on getting better.&lt;/span&gt;&lt;/i&gt;&lt;/b&gt;  &lt;br /&gt;&lt;ul&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-2281903092043350570?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/2281903092043350570/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/09/pcg10-on-pcg11.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/2281903092043350570'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/2281903092043350570'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/09/pcg10-on-pcg11.html' title='PCG10 on PCG11'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-3120626528693787745</id><published>2010-09-29T14:59:00.017+02:00</published><updated>2010-10-13T16:06:07.788+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence - Pentaho'/><title type='text'>PCG10 participants</title><content type='html'>Three days after the event, some mails are going around, trying to reconstruct the "who was who". 
Indeed the Pentaho Community Event is growing, and I believe many discovered only when seeing the group picture that they didn't get round to meeting quite a few of the participants.&lt;br /&gt;&lt;br /&gt;I discovered too that I missed the opportunity to get to know some people. So based on the mails that have been circulating and a 'tagged' group picture (thank you Jens Bleuel) I'm trying to put together the PCG10 participant list.&lt;br /&gt;&lt;br /&gt;It is a work in progress, so please, people, help me stick the right name to the right person.&amp;nbsp;Also, should anyone rather remain anonymous, drop me a mail and I'll rename you to Mr X(action).&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.kjube.be/images/PCG10_GroupPictureTagged.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://www.kjube.be/images/PCG10_GroupPictureTagged.jpg" width="560" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;table border="1" style="width: 600px;"&gt;&lt;tbody&gt;&lt;tr&gt;   &lt;td&gt;Nbr&lt;/td&gt;   &lt;td&gt;Name (First name - Last name)&lt;/td&gt;   &lt;td&gt;Twitter&lt;/td&gt; &lt;td&gt;from&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;1&lt;/td&gt;   &lt;td&gt;Roland Bouman&lt;/td&gt;   &lt;td&gt;@rolandbouman&lt;/td&gt; &lt;td&gt;.nl&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;2&lt;/td&gt;   &lt;td&gt;Matt Casters&lt;/td&gt;   &lt;td&gt;@mattcasters&lt;/td&gt;  &lt;td&gt;.be&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;3&lt;/td&gt;   &lt;td&gt;Håkon Torjus Bommen&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.no&lt;/td&gt;  &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;4&lt;/td&gt;   &lt;td&gt;Marco Gomes&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.pt&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;5&lt;/td&gt;   &lt;td&gt;Jos van Dongen aka Grumpy&lt;/td&gt;   &lt;td&gt;@josvandongen&lt;/td&gt; &lt;td&gt;.nl&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td 
align="right"&gt;6&lt;/td&gt;   &lt;td&gt;Carlos Amorim&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.es&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;7&lt;/td&gt;   &lt;td&gt;Jens Bleuel&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.de&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;8&lt;/td&gt;   &lt;td&gt;Pedro Alves&lt;/td&gt;   &lt;td&gt;@pmalves&lt;/td&gt; &lt;td&gt;.pt&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;9&lt;/td&gt;   &lt;td&gt;Nikolai Sandved&lt;/td&gt;   &lt;td&gt;@NikolaiSandved&lt;/td&gt; &lt;td&gt;.no&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;10&lt;/td&gt;   &lt;td&gt;Gunter Rombauts&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.be&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;11&lt;/td&gt;   &lt;td&gt;Jochen Olejnik&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.de&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;12&lt;/td&gt;   &lt;td&gt;Tom Barber&lt;/td&gt;   &lt;td&gt;@magicaltrout&lt;/td&gt; &lt;td&gt;.uk&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;13&lt;/td&gt;   &lt;td&gt;Nuno Brites&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.pt&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;14&lt;/td&gt;   &lt;td&gt;David Duque&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.pt&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;15&lt;/td&gt;   &lt;td&gt;Slawomir Chodnicki&lt;/td&gt;   &lt;td&gt;@slawo_ch&lt;/td&gt; &lt;td&gt;.de&lt;/td&gt;  &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;16&lt;/td&gt;   &lt;td&gt;Pedro Pinheiro&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.pt&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;17&lt;/td&gt;   &lt;td&gt;Nelson Sousa&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.pt&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;18&lt;/td&gt;   &lt;td&gt;Nuno Severo&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.pt&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;19&lt;/td&gt;   
&lt;td&gt;Paula Clemente&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.pt&lt;/td&gt;  &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;20&lt;/td&gt;   &lt;td&gt;Pedro Martins&lt;/td&gt;   &lt;td&gt;&lt;/td&gt;  &lt;td&gt;.pt&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;21&lt;/td&gt;   &lt;td&gt;Samatar Hassan&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.fr&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;22&lt;/td&gt;   &lt;td&gt;André Simões&lt;/td&gt;   &lt;td&gt;@ITXpander&lt;/td&gt; &lt;td&gt;.pt&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;23&lt;/td&gt;   &lt;td&gt;Rui Gonçalves&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.es&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;24&lt;/td&gt;   &lt;td&gt;Dan Keeley&lt;/td&gt;   &lt;td&gt;@codek1&lt;/td&gt; &lt;td&gt;.uk&lt;/td&gt;  &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;25&lt;/td&gt;   &lt;td&gt;Anthony Carter&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.ir&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;26&lt;/td&gt;   &lt;td&gt;Julian Hyde&lt;/td&gt;   &lt;td&gt;@julianhyde&lt;/td&gt; &lt;td&gt;.us&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;27&lt;/td&gt;   &lt;td&gt;Rob van Winden&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.nl&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;28&lt;/td&gt;   &lt;td&gt;Pompei Popescu&lt;/td&gt;   &lt;td&gt;&lt;/td&gt;  &lt;td&gt;.ro&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;29&lt;/td&gt;   &lt;td&gt;Jan Aertsen&lt;/td&gt;   &lt;td&gt;@jan_aertsen&lt;/td&gt; &lt;td&gt;.be&lt;/td&gt;  &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;30&lt;/td&gt;   &lt;td&gt;Ingo Klose&lt;/td&gt;   &lt;td&gt;@i_klose&lt;/td&gt; &lt;td&gt;.de&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;31&lt;/td&gt;   &lt;td&gt;Sergio Ramazzina&lt;/td&gt;   &lt;td&gt;@serasoftitaly&lt;/td&gt; &lt;td&gt;.it&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;32&lt;/td&gt;   
&lt;td&gt;Martin Stangeland&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.no&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;33&lt;/td&gt;   &lt;td&gt;Dragos Matea&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.ro&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;34&lt;/td&gt;   &lt;td&gt;Juan José Ortilles&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.es&lt;/td&gt;  &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;35&lt;/td&gt;   &lt;td&gt;Paul Stoellberger&lt;/td&gt;   &lt;td&gt;@pstoellberger&lt;/td&gt; &lt;td&gt;.at&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;36&lt;/td&gt;   &lt;td&gt;Doug Moran aka Caveman&lt;/td&gt;   &lt;td&gt;@doug_moran&lt;/td&gt; &lt;td&gt;.us&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;37&lt;/td&gt;   &lt;td&gt;Thomas Morgner&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.us&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;38&lt;/td&gt;   &lt;td&gt;Kees Romijn&lt;/td&gt;   &lt;td&gt;&lt;/td&gt; &lt;td&gt;.nl&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td align="right"&gt;39&lt;/td&gt;   &lt;td&gt;Aaron Phillips&lt;/td&gt;   &lt;td&gt;@phytodata&lt;/td&gt; &lt;td&gt;.us&lt;/td&gt; &lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;... and the following are people that were present, but somehow dropped out of the group picture. Maybe they were on the beach? 
&lt;br /&gt;&lt;br /&gt;&lt;table border="1" style="width: 600px;"&gt;&lt;tbody&gt;&lt;tr&gt; &lt;td&gt;Picture&lt;/td&gt; &lt;td&gt;Name (First name - Last name)&lt;/td&gt; &lt;td&gt;Twitter&lt;/td&gt; &lt;td&gt;from&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt; &lt;td&gt;&lt;img src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ4Wak7SyPI/AAAAAAAAAcA/RAF_HymqRg8/s1600/IMAG0658.jpg" width="90" /&gt;&lt;/td&gt; &lt;td&gt;Nuno Moreira&lt;/td&gt; &lt;td&gt;@webdetails&lt;/td&gt; &lt;td&gt;.pt&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td&gt;&lt;img src="http://a0.twimg.com/profile_images/263768676/6a00d10a7de6c58bfa00cdf3af4aeccb8f-50si.jpg" width="90" /&gt;&lt;/td&gt;   &lt;td&gt;Bart Maertens&lt;/td&gt;   &lt;td&gt;@bartmaer&lt;/td&gt; &lt;td&gt;.be&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;&lt;img src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ42OeJvGfI/AAAAAAAAAcY/svPBsyuc7Wc/s1600/IMAG0662.jpg" width="90" /&gt;&lt;/td&gt; &lt;td&gt;Juliana Alves&lt;/td&gt; &lt;td&gt;&lt;/td&gt; &lt;td&gt;.pt&lt;/td&gt; &lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-3120626528693787745?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/3120626528693787745/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/09/pcg10-participants.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3120626528693787745'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3120626528693787745'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/09/pcg10-participants.html' title='PCG10 participants'/><author><name>Jan 
Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ4Wak7SyPI/AAAAAAAAAcA/RAF_HymqRg8/s72-c/IMAG0658.jpg' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-1040732150817183043</id><published>2010-09-28T12:33:00.002+02:00</published><updated>2010-09-30T21:33:58.838+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence - Pentaho'/><title type='text'>PCG10 in pictures</title><content type='html'>After blogging PCG10, Doug kindly asked me whether I could also host the event pictures. So I've quickly added a picture gallery on the &lt;a href="http://pictures.kjube.be/"&gt;kJube website&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;There are many more participants out there who took pictures. May I ask everyone to mail me their pics, or preferably a link to an archive or so? I'll make sure it all ends up in the gallery. &lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I've created the following categories:&lt;b&gt;&amp;nbsp;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;1) The day before&lt;/b&gt;: Most people arrived in Cascais the day or evening before PCG10. 
I've added those pictures here.&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;object data="http://iloapp.kjube.be/gallery/swf/embedFlashGallery.swf?albumId=1&amp;amp;galleryLocation=pictures&amp;amp;domainName=kjube.be" height="348" name="embedFlashGallery" type="application/x-shockwave-flash" width="464"&gt;&lt;param name="movie" value="http://iloapp.kjube.be/gallery/swf/embedFlashGallery.swf?albumId=1&amp;galleryLocation=pictures&amp;domainName=kjube.be"/&gt;&lt;param name="quality" value="high"/&gt;&lt;param name="bgcolor" value="#000000"/&gt;&lt;param name="allowScriptAccess" value="always"/&gt;&lt;param name="allowFullScreen" value="true"/&gt;&lt;a href="http://pictures.kjube.be/#1"&gt;http://pictures.kjube.be/#1&lt;/a&gt;&lt;/object&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;2) PCG10&lt;/b&gt;: All pictures from the Pentaho Community Gathering in Hotel Albatroz, including some pictures of the lazy 2-hour lunch break. &lt;b&gt;&amp;nbsp;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;object data="http://iloapp.kjube.be/gallery/swf/embedFlashGallery.swf?albumId=2&amp;amp;galleryLocation=pictures&amp;amp;domainName=kjube.be" height="348" name="embedFlashGallery" type="application/x-shockwave-flash" width="464"&gt;&lt;param name="movie" value="http://iloapp.kjube.be/gallery/swf/embedFlashGallery.swf?albumId=2&amp;galleryLocation=pictures&amp;domainName=kjube.be"/&gt;&lt;param name="quality" value="high"/&gt;&lt;param name="bgcolor" value="#000000"/&gt;&lt;param name="allowScriptAccess" value="always"/&gt;&lt;param name="allowFullScreen" value="true"/&gt;&lt;a href="http://pictures.kjube.be/#2"&gt;http://pictures.kjube.be/#2&lt;/a&gt;&lt;/object&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;3) Saturday night&lt;/b&gt;: After being exposed to a vast number of presentations, PCG participants break loose. 
&lt;b&gt;&amp;nbsp;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;object data="http://iloapp.kjube.be/gallery/swf/embedFlashGallery.swf?albumId=3&amp;amp;galleryLocation=pictures&amp;amp;domainName=kjube.be" height="348" name="embedFlashGallery" type="application/x-shockwave-flash" width="464"&gt;&lt;param name="movie" value="http://iloapp.kjube.be/gallery/swf/embedFlashGallery.swf?albumId=3&amp;galleryLocation=pictures&amp;domainName=kjube.be"/&gt;&lt;param name="quality" value="high"/&gt;&lt;param name="bgcolor" value="#000000"/&gt;&lt;param name="allowScriptAccess" value="always"/&gt;&lt;param name="allowFullScreen" value="true"/&gt;&lt;a href="http://pictures.kjube.be/#3"&gt;http://pictures.kjube.be/#3&lt;/a&gt;&lt;/object&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;4) Bowling and later&lt;/b&gt;: A bowling event was planned for Sunday morning (where large amounts of coffee were consumed). After that, many people drifted off to Sintra or just hung out in Cascais. &lt;b&gt;&amp;nbsp;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;object data="http://iloapp.kjube.be/gallery/swf/embedFlashGallery.swf?albumId=4&amp;amp;galleryLocation=pictures&amp;amp;domainName=kjube.be" height="348" name="embedFlashGallery" type="application/x-shockwave-flash" width="464"&gt;&lt;param name="movie" value="http://iloapp.kjube.be/gallery/swf/embedFlashGallery.swf?albumId=4&amp;galleryLocation=pictures&amp;domainName=kjube.be"/&gt;&lt;param name="quality" value="high"/&gt;&lt;param name="bgcolor" value="#000000"/&gt;&lt;param name="allowScriptAccess" value="always"/&gt;&lt;param name="allowFullScreen" value="true"/&gt;&lt;a href="http://pictures.kjube.be/#4"&gt;http://pictures.kjube.be/#4&lt;/a&gt;&lt;/object&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;5) Cascais and surroundings&lt;/b&gt;: Scenery of Cascais. 
An amazing location for PCG10.&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;object data="http://iloapp.kjube.be/gallery/swf/embedFlashGallery.swf?albumId=0&amp;amp;galleryLocation=pictures&amp;amp;domainName=kjube.be" height="348" name="embedFlashGallery" type="application/x-shockwave-flash" width="464"&gt;&lt;param name="movie" value="http://iloapp.kjube.be/gallery/swf/embedFlashGallery.swf?albumId=0&amp;galleryLocation=pictures&amp;domainName=kjube.be"/&gt;&lt;param name="quality" value="high"/&gt;&lt;param name="bgcolor" value="#000000"/&gt;&lt;param name="allowScriptAccess" value="always"/&gt;&lt;param name="allowFullScreen" value="true"/&gt;&lt;a href="http://pictures.kjube.be/#0"&gt;http://pictures.kjube.be/#0&lt;/a&gt;&lt;/object&gt;&lt;/div&gt;&lt;br /&gt;With many thanks to the current contributors:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Kees Romijn&lt;/li&gt;&lt;li&gt;Jens Bleuel&lt;/li&gt;&lt;li&gt;Jan Aertsen&lt;/li&gt;&lt;/ul&gt;&lt;div style="color: #990000;"&gt;&lt;b&gt;There are many more participants out there who took pictures. May I ask everyone to mail me their pics, or preferably a link to an archive or so? I'll make sure it all ends up in the gallery. 
&lt;/b&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-1040732150817183043?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/1040732150817183043/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/09/pcg10-in-pictures.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/1040732150817183043'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/1040732150817183043'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/09/pcg10-in-pictures.html' title='PCG10 in pictures'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-3266163109359866946</id><published>2010-09-25T12:05:00.094+02:00</published><updated>2010-09-30T21:34:19.483+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence - Pentaho'/><title type='text'>Pentaho Community Gathering (Live)</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;It's September 25th, and I'm sitting in Hotel Albatroz in Cascais, where the Pentaho Community Gathering 2010 is happening. 
I'll add some stuff 'live' to our blog as presentations happen.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Remarks:&amp;nbsp;&lt;/i&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;i&gt;September 26th: I revisited the post, cleaned up a bit, and added the missing video and some slide presentations that came in late. I tried to leave the 'live' feeling though.&lt;/i&gt;&lt;/li&gt;&lt;li&gt;&lt;i&gt;September 28th: after I sobered up, I fixed some dead links (presentations of Roland and André are now correctly linked), added Nuno's presentation, and added some links and thank-yous to the organizers of this great event. &lt;/i&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;The agenda is pretty crammed, so we'll have to see whether we manage to stay on track.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/images/PCG10_program.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="286" src="http://www.kjube.be/images/PCG10_program.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Another thought that crosses my mind is whether people will actually be able to refrain from running outside to catch some sun. 
The view from the meeting room says it all.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TJ3OBbbQbYI/AAAAAAAAAbg/mLaWVrO-MUI/s1600/IMAG0645.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="301" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TJ3OBbbQbYI/AAAAAAAAAbg/mLaWVrO-MUI/s400/IMAG0645.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;10h15 - Doug Moran&lt;/b&gt;&lt;br /&gt;&lt;a href="http://www.pentaho.com/team/bio.php?bio=doug_moran"&gt;Doug Moran&lt;/a&gt; kicked off the meeting by introducing everyone to one another and thanking &lt;a href="http://webdetails.pt/"&gt;WebDetails&lt;/a&gt; and &lt;a href="http://www.xpand-it.com/"&gt;Xpand-IT&lt;/a&gt; for organizing the event.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.xpand-it.com/images/stories/logo.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" src="http://www.xpand-it.com/images/stories/logo.png" /&gt;&lt;/a&gt;&lt;a href="http://webdetails.pt/conteudos/logo.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://webdetails.pt/conteudos/logo.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Oh yeah, T-shirts to be distributed later.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TJ4XusWZNaI/AAAAAAAAAcI/Y55lfTLeK0g/s1600/IMAG0652.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="301" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TJ4XusWZNaI/AAAAAAAAAcI/Y55lfTLeK0g/s400/IMAG0652.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;With this, Jos van Dongen, industry analyst, informs the world that the 
conference has kicked off.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TJ8oidpfxAI/AAAAAAAAAc4/WJg9zH3qJ1o/s1600/pic202.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="71" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TJ8oidpfxAI/AAAAAAAAAc4/WJg9zH3qJ1o/s400/pic202.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;10h30 - Pedro Pinheiro - CDA &amp;nbsp;&lt;/b&gt;[presentation coming up]&lt;br /&gt;Pedro explains CDA, community data access, a server side solution for data access usable for dashboarding and reporting.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/images/IMAG0641.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="300" src="http://www.kjube.be/images/IMAG0641.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande', sans-serif; font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 14px; line-height: 16px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;Since it's a server side solution it's a bit hard to "show" what it "looks like", but more will be revealed during the later presentations (by WebDetails) where dashboarding/reporting tools will use CDA.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ8o6ri6E4I/AAAAAAAAAc8/IFCdkKNfNy4/s1600/pic201.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="57" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ8o6ri6E4I/AAAAAAAAAc8/IFCdkKNfNy4/s400/pic201.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;10h54 - Julian Hyde - Mondrian stuff &amp;nbsp;&lt;/b&gt;[&lt;a 
href="http://www.kjube.be/presentations/PCG10_JulianHyde.ppt"&gt;presentation&lt;/a&gt;]&lt;br /&gt;Hej, we are ahead of schedule? Can Julian keep it so? I'm not sure, as he's calmly taking off with some slides from &lt;a href="http://julianhyde.blogspot.com/2010/08/september-world-tour.html"&gt;previous community meetings&lt;/a&gt; as well as of his kid.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_BVv0WTpeWTs/TGsxGm0ZB_I/AAAAAAAAAEk/rFPvY1l7rrM/s1600/P2009918_ji_IMG_0894.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="300" src="http://1.bp.blogspot.com/_BVv0WTpeWTs/TGsxGm0ZB_I/AAAAAAAAAEk/rFPvY1l7rrM/s400/P2009918_ji_IMG_0894.JPG" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Anyhow, Mondrian is undergoing a full rewrite. Some of the code goes back 9 years now, so rewriting all that involves A LOT of stuff. Currently Julian seems to wonder if the code will ever build again - see his &lt;a href="http://julianhyde.blogspot.com/2010/07/mondrian-heading-for-40.html"&gt;previous blog post&lt;/a&gt; on that - but he's confident he'll get things running (if his youngest doesn't turn off his PC too often).&lt;br /&gt;&lt;br /&gt;So what can we expect?&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Attribute oriented analysis&lt;/li&gt;&lt;li&gt;Physical models&lt;/li&gt;&lt;li&gt;Composite keys&lt;/li&gt;&lt;li&gt;Measure groups: cubes with multiple fact tables eliminating the need for virtual cubes&lt;/li&gt;&lt;li&gt;Improved schema validation&lt;/li&gt;&lt;/ul&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ3KYQ4xHII/AAAAAAAAAbc/S6Ptkoyn2-4/s1600/IMAG0643.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ3KYQ4xHII/AAAAAAAAAbc/S6Ptkoyn2-4/s400/IMAG0643.jpg" width="301" 
/&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div&gt;As far as the transition to Mondrian 4.0 is concerned, Julian says it won't be easy, but Mondrian will remain backwards compatible with version 3. Workbench will need a rework due to the modifications in Mondrian, that is, if Pentaho wants to keep Workbench. But there are other options. Agile BI or the Metadata Editor might be extended to serve the purpose. The decision hasn't been made yet.&amp;nbsp;A long beta process is foreseen.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;Short coffee break now, next is Matt Casters.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;11h39 - Matt Casters - Dynamic ETL / Metadata Injection &lt;/b&gt;[no presentation, check demo below]&lt;br /&gt;Matt goes over the evolution ETL tools have undergone, from quick hacks, through frameworks and code generators, to real data integration engines as we know them. This presentation is about "what is next?"&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ3Rw0cnmkI/AAAAAAAAAbk/lWQK_MQAugU/s1600/IMAG0646.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ3Rw0cnmkI/AAAAAAAAAbk/lWQK_MQAugU/s400/IMAG0646.jpg" width="301" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Matt shows the example of dynamically loading a csv file into a table. 
In this use case you don't know the .csv file name upfront, nor do you know the field names, data types etc.&amp;nbsp;What the metadata injector does is pass all the right information to your transformation.&lt;br /&gt;&lt;br /&gt;Anyhow, to say it in Matt's words: cut the talk, just show us the demo.&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;object height="385" width="480"&gt;&lt;param name="movie" value="http://www.youtube.com/v/EjzgzOanq1o?fs=1&amp;amp;hl=en_US"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/EjzgzOanq1o?fs=1&amp;amp;hl=en_US" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/div&gt;&lt;br /&gt;In order to enable this kind of metadata injection, a rework of steps is needed, so it'll take some time before this functionality is available throughout PDI. Also, some kind of lightweight UI will probably be needed for the design of these dynamic ETL solutions.&lt;br /&gt;&lt;br /&gt;The call to the community is: please provide use cases for dynamic ETL.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;12h00 - Aaron Phillips (&lt;/b&gt;&lt;span class="Apple-style-span" style="color: #333333; font-family: 'Lucida Grande', sans-serif; font-size: 14px; line-height: 16px;"&gt;@&lt;a class="tweet-url username" href="http://twitter.com/phytodata" rel="nofollow" style="color: #2276bb; margin: 0px; padding: 0px; text-decoration: underline;"&gt;phytodata&lt;/a&gt;&lt;/span&gt;&lt;b&gt;) - Plug-ins and extension points&amp;nbsp;&lt;/b&gt;[&lt;a href="http://www.kjube.be/presentations/PCG10_AaronPhillips.pdf"&gt;presentation&lt;/a&gt;]&lt;br /&gt;The BI server is becoming a business intelligence oriented application server rather than just a BI solution server. E.g. 
CDA (presented earlier on) has been developed as a plug-in that runs as an application on the BI server.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TJ8ltmH7F_I/AAAAAAAAAck/tA2QFitv11E/s1600/pic199.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="57" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TJ8ltmH7F_I/AAAAAAAAAck/tA2QFitv11E/s400/pic199.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;i&gt;(Aaron's presentation seems extremely well written out, so I guess it'll be self-explanatory once it is published later on. We'll add links as soon as all presentations are online.)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ3XsvXksiI/AAAAAAAAAbo/2f45E1k_AhM/s1600/IMAG0649.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ3XsvXksiI/AAAAAAAAAbo/2f45E1k_AhM/s320/IMAG0649.jpg" width="241" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;A very interesting idea presented as an illustration of a BI server extension is an alternative to xactions (Yeah!), namely a GroovyEngine plugin for the BI server. This triggers interesting remarks from the community though. 
We already have a job scheduling mechanism, namely PDI, so why aren't we using it?&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ8l2F_7lZI/AAAAAAAAAco/nJoJ6o7Fb7k/s1600/pic198.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="57" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ8l2F_7lZI/AAAAAAAAAco/nJoJ6o7Fb7k/s400/pic198.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TJ8mNmS_7pI/AAAAAAAAAcw/4RMJ8QixVVk/s1600/pic197.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="70" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TJ8mNmS_7pI/AAAAAAAAAcw/4RMJ8QixVVk/s400/pic197.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;a href="http://www.pentaho.com/team/bio.php?bio=doug_moran"&gt;Doug&lt;/a&gt;'s reply to the matter is that both options are open. The platform will offer possibilities for plugins and you can go one way or the other. &lt;a href="http://julianhyde.blogspot.com/"&gt;Julian&lt;/a&gt; wonders why we need the same functionality twice. It seems the whole discussion revolves around whether Pentaho wants to offer a BI server or a BI application server. In the first case Pentaho would offer BI functionality, while in the second case they would offer a platform to run BI applications on, even external ones like BIRT reports, Jaspersoft reports, ... &amp;nbsp;Interesting discussions.&lt;br /&gt;&lt;br /&gt;A remarkable fact to add is that Aaron's presence at PCG10 is at the specific request of the community. 
Some months ago community members launched a poll to make clear that they wanted Pentaho developers with an ear for the needs of the community to attend. The results of the poll were clear: "Ship Aaron to PCG10". Will other Pentaho developers score better next year? Or will Aaron remain the uncrowned community hero? To be continued.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TKSShRYyKCI/AAAAAAAAAdU/zbRPcdvOMIg/s1600/pic213.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TKSShRYyKCI/AAAAAAAAAdU/zbRPcdvOMIg/s400/pic213.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The presentation finished at 12h32, so we are still on our challenging schedule.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;12h30 - Nelson Sousa - CDE (Community Dashboard Editor)&amp;nbsp;&lt;/b&gt;[presentation coming up]&lt;br /&gt;&lt;div&gt;Nelson kicks off wildly - but claims he has done wilder things - with a CDE demo. 
It shows clearly how you can click together your dashboard (row after column after row after column after row ...), based on CDF components, CDA elements, ...&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ3h3GmR4QI/AAAAAAAAAbw/OG0GSBdnpZY/s1600/IMAG0653.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="320" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ3h3GmR4QI/AAAAAAAAAbw/OG0GSBdnpZY/s320/IMAG0653.jpg" width="240" /&gt;&lt;/a&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ3gMFq8CaI/AAAAAAAAAbs/NzuWXaemxyo/s1600/IMAG0651.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ3gMFq8CaI/AAAAAAAAAbs/NzuWXaemxyo/s320/IMAG0651.jpg" width="240" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;/div&gt;(The demo itself is pretty interesting. 
It's a dashboard showing Tweet statistics.)&lt;br /&gt;The dashboard editor generates .html, .js, .css files which go into the BI server.&lt;br /&gt;&lt;br /&gt;For more information on CDE:&amp;nbsp;&lt;a href="http://webdetails.pt/"&gt;http://webdetails.pt/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="margin: 0px;"&gt;&lt;b&gt;Lunch break&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TJ8mabN3kkI/AAAAAAAAAc0/KBGKnun2h8s/s1600/pic196.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="55" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TJ8mabN3kkI/AAAAAAAAAc0/KBGKnun2h8s/s400/pic196.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;While all presenters have been respecting the timetable, it seems that most of the community couldn't resist staying out a bit longer for lunch. 
So in the end we picked up the agenda with a half hour delay.&lt;/div&gt;&lt;div style="margin: 0px;"&gt;&lt;div style="margin: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin: 0px;"&gt;&lt;b&gt;14h30 - Tom Barber and Paul Stoellberger &amp;nbsp;- PAT (Pentaho Analysis Tool) &lt;/b&gt;[&lt;a href="http://bit.ly/b9OIZ7"&gt;presentation&lt;/a&gt; / &lt;a href="http://www.wamonline.org.uk/tomsamazingcommunitybiserverpresentation-1.odp"&gt;presentation&lt;/a&gt;]&lt;/div&gt;&lt;/div&gt;&lt;div style="margin: 0px;"&gt;&lt;/div&gt;&lt;br /&gt;Paul Stoellberger kicked off the PAT presentation with two slides and dived immediately into the demo part, demoing all the slice &amp;amp; dice, drill down/across, filter etc. functionalities of PAT, both in stand-alone mode and as part of the Pentaho BI server.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TJ39Z24g-iI/AAAAAAAAAb4/87krKXWQVzE/s1600/IMAG0654.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="320" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TJ39Z24g-iI/AAAAAAAAAb4/87krKXWQVzE/s320/IMAG0654.jpg" width="240" /&gt;&lt;/a&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ39TS-UOgI/AAAAAAAAAb0/lchTD0gmy3c/s1600/IMAG0657.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ39TS-UOgI/AAAAAAAAAb0/lchTD0gmy3c/s320/IMAG0657.jpg" width="240" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;For the moment Paul and Tom aren't adding new features because they want to focus on getting a stable 1.0 out there. Obviously there are some interesting ideas for PAT's future, such as predictive analytics (including WEKA) or new charting options (using protovis). 
But for now, community feedback on stability and bug reports are highest on the request list.&lt;br /&gt;&lt;br /&gt;After Paul concluded, Tom presented PAT ideas on modular and collaborative BI with OSGi. The original idea was to work on collaborative BI only, but the ideas have expanded, ... and remain mostly only ideas for now. However, the starting point is that Pentaho currently offers hardly any support for collaboration. The CDF has the possibility to insert some comments, and of course you can mail report links etc., but that is about where it ends. So the idea is to build this in using OSGi, a module system for Java that allows you to install new modules without stopping or rebooting the server.&amp;nbsp;Next, Tom starts off a demo on some of the basic features of making 'PAT RESTful'.&lt;br /&gt;&lt;br /&gt;&lt;div style="margin: 0px;"&gt;&lt;b&gt;14h30 - Jan Aertsen and Matt Casters - KFF (Kettle Kitchen Factory) &lt;/b&gt;[&lt;a href="http://www.kjube.be/presentations/PCG10_JanAertsen.pdf"&gt;presentation&lt;/a&gt;]&lt;/div&gt;&lt;div style="margin: 0px;"&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ4PZpWfDrI/AAAAAAAAAb8/c_-qkiW7Svc/s1600/pic191.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="57" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ4PZpWfDrI/AAAAAAAAAb8/c_-qkiW7Svc/s400/pic191.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;I agree Julian, it's hard to blog your own presentation!&lt;br /&gt;(&lt;a href="http://kjube.blogspot.com/2010/09/pcg10-kff-presentation.html"&gt;30/09 but I finally added the full walk-through here&lt;/a&gt;)&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="margin: 0px;"&gt;&lt;b&gt;15h00 - Nuno Moreira - Pentaho Dashboards, breaking barriers&amp;nbsp;&lt;/b&gt;[&lt;a 
href="http://www.kjube.be/presentations/PCG10_NunoMoreira.pdf"&gt;presentation&lt;/a&gt;]&lt;/div&gt;&lt;div style="margin: 0px;"&gt;&lt;/div&gt;Nuno shows a visually stunning presentation (no images as below text might suggest though) and is asking us to think of dashboards as:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;a sexy stripper&lt;/li&gt;&lt;li&gt;a sexy stripper doing a lap dance&lt;/li&gt;&lt;li&gt;a sexy stripper doing a lap dance which you can disassemble (?) [not my words]&lt;/li&gt;&lt;li&gt;a sexy stripper doing a lap dance which you can disassemble and talk to also.&amp;nbsp;&lt;/li&gt;&lt;li&gt;a sexy stripper doing a lap dance&amp;nbsp;which you can disassemble and talk to also, and who allows you to squeeze her&lt;/li&gt;&lt;li&gt;a sexy stripper doing a lap dance&amp;nbsp;which you can disassemble and talk to also, and who allows you to squeeze her and who doesn't mind you share her with your friends.&lt;/li&gt;&lt;/ol&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ4Wak7SyPI/AAAAAAAAAcA/RAF_HymqRg8/s1600/IMAG0658.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ4Wak7SyPI/AAAAAAAAAcA/RAF_HymqRg8/s320/IMAG0658.jpg" width="241" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TJ8mGhd2oKI/AAAAAAAAAcs/DgyNy_o9Z1E/s1600/pic195.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="58" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TJ8mGhd2oKI/AAAAAAAAAcs/DgyNy_o9Z1E/s400/pic195.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;No need to say that this way of looking at dashboards really captured the attention of the Pentaho crowd. 
Obviously Nuno was using a&amp;nbsp;metaphor to&amp;nbsp;talk about different levels of customizability of and interaction with dashboards.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Short coffee break&lt;/b&gt;&lt;br /&gt;... with wonderful cup cakes: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; ... made by the organizer of the whole event&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;b&gt;(A big big thank you for that!)&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ42OeJvGfI/AAAAAAAAAcY/svPBsyuc7Wc/s1600/IMAG0662.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="320" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ42OeJvGfI/AAAAAAAAAcY/svPBsyuc7Wc/s320/IMAG0662.jpg" width="241" /&gt;&lt;/a&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TJ4qhOSX6cI/AAAAAAAAAcM/PV2ve2lQBNc/s1600/IMAG0660.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TJ4qhOSX6cI/AAAAAAAAAcM/PV2ve2lQBNc/s320/IMAG0660.jpg" width="241" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div style="margin: 0px;"&gt;&lt;b&gt;15h30 - Jos van Dongen (aka Jos von Dongen, aka Grumpy) - Data mining 
&lt;/b&gt;[&lt;a href="http://www.kjube.be/presentations/PCG10_JosvanDongen.pdf"&gt;presentation&lt;/a&gt;]&lt;/div&gt;&lt;div style="margin: 0px;"&gt;&lt;/div&gt;"Is data mining the newest piece of shit? ... I don't think so.", says Jos, and kicks off his presentation.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TJ4XckfymoI/AAAAAAAAAcE/57dBRyqEyVw/s1600/IMAG0659.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TJ4XckfymoI/AAAAAAAAAcE/57dBRyqEyVw/s320/IMAG0659.jpg" width="241" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Jos walks us through the different data mining tools and techniques (decision trees, neural networks, regression analysis, ...) and explains the difference between supervised and unsupervised learning, how to split your data sets, etc.&lt;br /&gt;&lt;br /&gt;He worked out the examples in Weka / Kettle and showed them in a short live demo.&amp;nbsp;Personally I believe that the Weka / Kettle integration is an extremely powerful feature (which many commercial data mining / ETL tools don't even offer today). I really liked the demo and hope to start using this type of functionality soon.&lt;br /&gt;&lt;br /&gt;&lt;div style="margin: 0px;"&gt;&lt;b&gt;16h00 - Dan [Codek] - Approaches to implementations and methodology &lt;/b&gt;[no presentation]&lt;/div&gt;&lt;div style="margin: 0px;"&gt;&lt;/div&gt;Dan is trying to get us back on schedule by keeping his presentation short. No slides prepared, just a short series of ideas sketched on paper as a guideline for his talk. Since he moved from consulting to being responsible for business intelligence in a real company, he's interested in how people manage their projects.&lt;br /&gt;&lt;br /&gt;Basically his talk started with the question of who is using SCRUM and ... 
I filmed the rest.&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;object height="385" width="480"&gt;&lt;param name="movie" value="http://www.youtube.com/v/1tyQNAcQYgo?fs=1&amp;amp;hl=en_US"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/1tyQNAcQYgo?fs=1&amp;amp;hl=en_US" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/div&gt;&lt;br /&gt;&lt;div style="margin: 0px;"&gt;&lt;div style="margin: 0px;"&gt;&lt;b&gt;16h30 - André Simões - PDI job/transformation framework&amp;nbsp;&lt;/b&gt;[&lt;u&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_AndreSimoes.pdf"&gt;presentation&lt;/a&gt;&lt;/u&gt;]&lt;/div&gt;&lt;/div&gt;&lt;div style="margin: 0px;"&gt;&lt;/div&gt;André Simões, aka ITXpander, aka 'The useless guy on IRC', talks about an ETL framework including ETL chaining, ETL scheduling, building in check points and making self-contained ETL processes to ensure restartability, etc. Great stuff. &lt;b&gt;A merger between this and KFF has been decided on the spot. 
A clear indication that there is a need for this kind of utility.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ5BncGfpmI/AAAAAAAAAcg/x1yVzuRarZQ/s1600/IMAG0673.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ5BncGfpmI/AAAAAAAAAcg/x1yVzuRarZQ/s320/IMAG0673.jpg" width="241" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Another addition to the presentation: Pentaho Reporting in Confluence.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;(Sorry for the limited notes. As this was close to KFF, I was too interested in understanding the presentation.)&lt;/span&gt;&lt;/i&gt;&lt;br /&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/i&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;In the meantime the heavy schedule starts to weigh on the participants. 
Having a sunny beach right outside the room doesn't make it easier.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ8qKb4aQKI/AAAAAAAAAdA/tifKKXHw97c/s1600/pic200.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="70" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ8qKb4aQKI/AAAAAAAAAdA/tifKKXHw97c/s400/pic200.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;But the final presenters are known for being able to keep their presentations juicy and spicy, so no doubt the audience will remain present.&lt;br /&gt;&lt;br /&gt;&lt;div style="margin: 0px;"&gt;&lt;div style="margin: 0px;"&gt;&lt;b&gt;17h00 - Pedro Alves - CCC (Community Charting Components) &lt;/b&gt;[&lt;a href="http://www.kjube.be/presentations/PCG10_PedroAlves.pdf"&gt;presentation&lt;/a&gt;]&lt;/div&gt;&lt;/div&gt;&lt;div style="margin: 0px;"&gt;&lt;/div&gt;Pedro explored 20 charting libraries to see which one was the best to add to Pentaho as the existing charting is crap. He toyed with the idea of writing a charting metadata layer that would let you plug in all existing charting layers, but that idea was quickly tossed aside as it would add too many layers of complexity.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ40EeEiIfI/AAAAAAAAAcU/n8QzJLZnbpo/s1600/IMAG0661.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ40EeEiIfI/AAAAAAAAAcU/n8QzJLZnbpo/s320/IMAG0661.jpg" width="241" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;So he stepped back and thought about what users want. 
Users don't care about the library you use for charting; they just want you to be able to create the visualizations they need. So he looked for a visualization library rather than a charting library, and settled on &lt;a href="http://vis.stanford.edu/protovis/"&gt;protovis&lt;/a&gt;.&amp;nbsp;On top of this, Pedro started developing CCC (Community Charting Components), a charting library based on protovis. This allows you to always go back to the visualization library and make/adjust your chart as you want.&lt;br /&gt;&lt;br /&gt;Next, Pedro did a demo of how CCC fits in with CDA, CDF and CDE.&lt;br /&gt;&lt;br /&gt;&lt;div style="margin: 0px;"&gt;&lt;b&gt;Group picture session&lt;/b&gt;&lt;/div&gt;&lt;div style="margin: 0px;"&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://iloapp.kjube.be/gallery/pictures?Download&amp;amp;album=2&amp;amp;image=41" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://iloapp.kjube.be/gallery/pictures?Download&amp;amp;album=2&amp;amp;image=41" width="580" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;a href="http://kjube.blogspot.com/2010/09/pcg10-participants.html"&gt;who's who&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div style="margin: 0px;"&gt;&lt;div style="margin: 0px;"&gt;&lt;b&gt;17h30 - Roland Bouman - Kettle Cookbook&lt;/b&gt; [&lt;u&gt;&lt;a href="http://www.kjube.be/presentations/PCG10_RolandBouman.pdf"&gt;presentation&lt;/a&gt;&lt;/u&gt;]&lt;/div&gt;&lt;/div&gt;&lt;div style="margin: 0px;"&gt;&lt;/div&gt;&lt;a href="http://rpbouman.blogspot.com/"&gt;Roland&lt;/a&gt; elaborates on dominant users, positive eating experiences, having the guts, communism and Mao's manual, which brings him straight to the &lt;a href="http://code.google.com/p/kettle-cookbook/"&gt;kettle-cookbook&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ4ysmqmC0I/AAAAAAAAAcQ/k3-b0VQ4CGE/s1600/IMAG0664.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ4ysmqmC0I/AAAAAAAAAcQ/k3-b0VQ4CGE/s320/IMAG0664.jpg" width="241" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Since Roland is a huge fan of the RTFM&amp;nbsp;theorem, he decided it was time to generate documentation automatically from the ETL code (actually the ultimate documentation, why can't the users read that ...), because who actually wants to write documentation? It's even more boring than reading it.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ8q-NT4YoI/AAAAAAAAAdE/rmH8qeX5n48/s1600/pic203.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="58" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ8q-NT4YoI/AAAAAAAAAdE/rmH8qeX5n48/s400/pic203.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The kettle-cookbook auto documentation tool is developed in kettle under the LGPL license. 
It will scan a directory of kettle code (.ktr / .kjb) and will generate cross-linked .html pages with a TOC, including diagrams and an overview of all variables, connections, fields, ...&lt;br /&gt;&lt;br /&gt;&lt;b&gt;18h00 - Jens Bleuel - Concept and realization for a PDI watchdog&lt;/b&gt;&amp;nbsp;[presentation]&lt;br /&gt;Jens has the hard task of kicking some life back into a crowd that has been hit with tons of slides and demos throughout the day.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TJ479fczyuI/AAAAAAAAAcc/4-_QhbiI6nY/s1600/IMAG0672.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TJ479fczyuI/AAAAAAAAAcc/4-_QhbiI6nY/s320/IMAG0672.jpg" width="241" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Basically the watchdog checks whether .ktr/.kjb are alive. Jens walked us through the code he wrote for this.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="color: #4c1130;"&gt;... and this concludes the presentations. &amp;nbsp;Over and out. 
I'm wasted from blogging all day.&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="color: #4c1130;"&gt;Maybe I'll add some more pictures later about the evening part of the event.&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TJ8rSMjg6zI/AAAAAAAAAdI/V-6sI_ThgYU/s1600/pic204.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="58" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TJ8rSMjg6zI/AAAAAAAAAdI/V-6sI_ThgYU/s400/pic204.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ8rdi4qjXI/AAAAAAAAAdM/FHW_imkhBOE/s1600/pic192.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="70" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TJ8rdi4qjXI/AAAAAAAAAdM/FHW_imkhBOE/s400/pic192.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ8uouApOsI/AAAAAAAAAdQ/HeE701MpyZg/s1600/pic205.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="70" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJ8uouApOsI/AAAAAAAAAdQ/HeE701MpyZg/s400/pic205.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;I heard that quite a few people are actually reading the blog post as it grows&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&amp;nbsp;... which I didn't really expect (honestly). But while you guys are at it, please&lt;/b&gt;&lt;br /&gt;&lt;b&gt;leave a few comments on this way of 'reporting' on the event&amp;nbsp;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;(so we can continue&amp;nbsp;to improve). 
Thank you !!!&lt;/b&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-3266163109359866946?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/3266163109359866946/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/09/pentaho-community-gathering-live.html#comment-form' title='15 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3266163109359866946'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3266163109359866946'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/09/pentaho-community-gathering-live.html' title='Pentaho Community Gathering (Live)'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_C0PnWJwDRZY/TJ3OBbbQbYI/AAAAAAAAAbg/mLaWVrO-MUI/s72-c/IMAG0645.jpg' height='72' width='72'/><thr:total>15</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-321096479468910293</id><published>2010-09-21T15:53:00.001+02:00</published><updated>2010-09-30T21:22:02.584+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - KFF'/><title type='text'>KFF - (slowly) coming out of the kitchen closet</title><content type='html'>Last week we released a new version of kff-plugins.jar. 
This contains the following plug-ins:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJi2hjnJYjI/AAAAAAAAAbU/4OwbHeVSEs4/s1600/pic189.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="327" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJi2hjnJYjI/AAAAAAAAAbU/4OwbHeVSEs4/s400/pic189.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;In short:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;b&gt;Data grid&lt;/b&gt;: a plug-in that allows you to enter data in a spreadsheet-like grid which you can then use in your transformations. Very handy for ETL configuration data which you would have to put in a table or csv otherwise. &lt;b&gt;&lt;a href="http://kjube.blogspot.com/2010/08/data-grid-step.html"&gt;See this blogpost for more info.&lt;/a&gt;&lt;/b&gt;&lt;/li&gt;&lt;li&gt;&lt;b&gt;Date/time calculator&lt;/b&gt;: It assembles a date/time field based on 7 input fields: century, year, month, date, hour, minutes, seconds. Very useful when reading in AS400 tables created in the context of RPG programming. &lt;a href="http://kjube.blogspot.com/2010/07/converting-as400-rpg-dates-using-kettle.html"&gt;&lt;b&gt;More information here.&lt;/b&gt;&lt;/a&gt;&amp;nbsp;&amp;nbsp;The Season ID calculator does something similar.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Rejects&lt;/b&gt;: &amp;nbsp;A generic component for error handling. It will converge all error records into one common format so all your rejects fit in one and the same output file or table. We'll elaborate on this one soon in an extra blogpost. Documentation will be added to the &lt;b&gt;&lt;a href="http://code.kjube.be/"&gt;KFF pages&lt;/a&gt;&lt;/b&gt;.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Table compare&lt;/b&gt;: This thing does what it says. It compares the data from two tables with the same layout. It'll find differences between the data in the two tables and log them. 
Very handy for acceptance tests. Again, this deserves a separate blog post. Probably we'll make a project template for this one too.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Trim strings&lt;/b&gt;: Allows you to trim strings, all fields at once, or a field list of your choosing. A bit more complex and versatile than the standard trim functionality in kettle.&lt;/li&gt;&lt;li&gt;&lt;b&gt;kJube decoder&lt;/b&gt;: A piece of machinery that needs 20 pages to describe. Until we get to that, I won't try here.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;We hope to get documentation online and we'll provide some blog posts to draw your attention to the new functionalities.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Enjoy !&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-321096479468910293?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/321096479468910293/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/09/kff-slowly-coming-out-of-kitchen-closet.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/321096479468910293'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/321096479468910293'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/09/kff-slowly-coming-out-of-kitchen-closet.html' title='KFF - (slowly) coming out of the kitchen closet'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' 
src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_C0PnWJwDRZY/TJi2hjnJYjI/AAAAAAAAAbU/4OwbHeVSEs4/s72-c/pic189.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-8958179965988800634</id><published>2010-09-19T13:05:00.001+02:00</published><updated>2010-09-30T21:22:14.352+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><title type='text'>Vista</title><content type='html'>I just found back an old picture from a visit to MediaMarkt, the local electronics retailer.&amp;nbsp;It dates from the release of Vista. For some reason Microsoft wanted to publicize the release of Vista on all the lockers at the entrance of the shop.&lt;br /&gt;&lt;br /&gt;I'm not sure if there is some Mac or Linux fan working there, but somehow this publicity kind of turned out wrong.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJXuRuhMOpI/AAAAAAAAAbM/098-L3Tpv5o/s1600/20080806+Mediamarket+about+Vista+001.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="300" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TJXuRuhMOpI/AAAAAAAAAbM/098-L3Tpv5o/s400/20080806+Mediamarket+about+Vista+001.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-8958179965988800634?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/8958179965988800634/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/09/vista.html#comment-form' title='0 Comments'/><link 
rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8958179965988800634'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8958179965988800634'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/09/vista.html' title='Vista'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_C0PnWJwDRZY/TJXuRuhMOpI/AAAAAAAAAbM/098-L3Tpv5o/s72-c/20080806+Mediamarket+about+Vista+001.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-8739334453038912159</id><published>2010-09-13T12:15:00.002+02:00</published><updated>2010-09-30T21:22:30.726+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Modelling'/><title type='text'>ER models and the EAV model</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: left;"&gt;The last two years&amp;nbsp;I've been involved in setting up&amp;nbsp;a (generic) data model for an insurance company. The data model covers (or rather should cover) customers, products, due payments and benefits, financial reserves, actual payments (in/out), etc ...&amp;nbsp;for&amp;nbsp;all possible&amp;nbsp;&lt;a href="http://en.wikipedia.org/wiki/Life_insurance"&gt;life insurance products&lt;/a&gt; which are maintained in several different systems. 
This model is intended to be a 'hub' or &lt;a href="http://www.inmoncif.com/library/cif/"&gt;global operational data store&lt;/a&gt; (as defined by &lt;a href="http://www.inmoncif.com/home/"&gt;Inmon&lt;/a&gt;&amp;nbsp;in "The Corporate Information Factory").&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;The challenge lies in the 'all possible'. This insurer does a lot of &lt;a href="http://en.wikipedia.org/wiki/Group_insurance"&gt;group insurances&lt;/a&gt;. Because most companies these days 'shop around' for their group insurances, these policies tend to be highly customized. So much so that the systems in which they administer these insurance policies are often modified to include one more insurance policy. In other words, any new customer may lead to a new or changed product, including a change in the data model of the operational system. Often the capability of adapting the operational system is key to getting the new customer. That may seem strange, but if you know that these customers are companies with thousands of employees with significant life insurance plans, well, it makes sense from a cost-benefit point of view.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;And that is exactly where the data modelling challenge starts. How do you model a product that is nearly endlessly customizable? Before you know it, you end up with an &lt;a href="http://www.databaseanswers.org/data_models/father_of_all_models/index.htm"&gt;entity-attribute-value data model&lt;/a&gt; that looks like this. 
(Original:&amp;nbsp;&lt;a href="http://www.databaseanswers.org/data_models/father_of_all_models/index.htm"&gt;Barry Williams, databaseanswers.org&lt;/a&gt;)&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TI3q04C8mQI/AAAAAAAAAaA/sSqyWfH67w8/s1600/father_of_all_data_models_with_data.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" ox="true" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TI3q04C8mQI/AAAAAAAAAaA/sSqyWfH67w8/s400/father_of_all_data_models_with_data.gif" width="373" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;However, the entity-attribute-value data model is intended for sparse attributes only. Obviously. If this model worked outside of that scope, everyone would use it for any application. More generic is hardly possible. Or did I hear someone suggesting "meta-thing" as an entity? &amp;nbsp;:-)&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;Anyhow, to model the correct solution, the first step is to analyse the sparsity of entities/attributes and then decide what data goes into the standard ER-model and what part of the data goes into the EAV-model. The next step, however, is to see how you can set up a&amp;nbsp;successful&amp;nbsp;marriage between your ER and EAV model. I found some good information on how to handle this for &lt;a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2110957/"&gt;biomedical databases&lt;/a&gt;, which helps to understand&amp;nbsp;the strengths and weaknesses of both models and how to combine them. 
Building on that, I sifted through all attributes and decided whether to put them in my ER-model or in my EAV-kind-of-model, and that seems to work out so far.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;Next is the question of how to report on this, as I fear a &lt;a href="http://www.bi-verdict.com/fileadmin/FreeAnalyses/DatabaseExplosion.htm"&gt;database explosion&lt;/a&gt; if I turn this kind of data into a &lt;a href="http://www.stanford.edu/dept/itss/docs/oracle/10g/olap.101/b10333/multimodel.htm"&gt;multi-dimensional model&lt;/a&gt;. I'll post my findings ... when I find them.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-8739334453038912159?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/8739334453038912159/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/09/father-of-all-data-models.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8739334453038912159'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8739334453038912159'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/09/father-of-all-data-models.html' title='ER models and the EAV 
model'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_C0PnWJwDRZY/TI3q04C8mQI/AAAAAAAAAaA/sSqyWfH67w8/s72-c/father_of_all_data_models_with_data.gif' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-171191058393379115</id><published>2010-09-03T12:00:00.001+02:00</published><updated>2010-09-30T21:22:41.852+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><title type='text'>kettle history lesson</title><content type='html'>A VERY TRUE tale about the real origin of kettle a.k.a. Pentaho Data Integration.&lt;br /&gt;&lt;div style="margin: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin: 0px;"&gt;It is the year 2000. Most of the IT world is wondering what to do after the &lt;a href="http://en.wikipedia.org/wiki/Year_2000_problem"&gt;&lt;b&gt;Y2K bug&lt;/b&gt;&lt;/a&gt; (except for counting the $$ wasted or earned). In Europe, the CEE reacts by introducing '&lt;a href="http://en.wikipedia.org/wiki/Euro"&gt;&lt;b&gt;the euro&lt;/b&gt;&lt;/a&gt;'. The switch from national currencies to the unified euro currency again generates work for the whole of the IT sector. In the US &lt;a href="http://en.wikipedia.org/wiki/Bush_v._Gore"&gt;&lt;b&gt;votes are counted and recounted&lt;/b&gt;&lt;/a&gt; to ensure that &lt;a href="http://www.whitehouse.gov/about/presidents/georgewbush"&gt;&lt;b&gt;Bush Jr can become president&lt;/b&gt;&lt;/a&gt;. 
&lt;a href="http://en.wikipedia.org/wiki/Slobodan_Milo%C3%85%C2%A1evi%C3%84%E2%80%A1"&gt;&lt;b&gt;Slobodan Milošević&lt;/b&gt;&lt;/a&gt; on the other hand disappears from the stage. In Russia &lt;a href="http://www.vladimirputin.com/"&gt;&lt;strong&gt;Vladimir Putin&lt;/strong&gt;&lt;/a&gt; is the new man.&amp;nbsp;Hopes grow that maybe the &lt;a href="http://www.ornl.gov/sci/techresources/Human_Genome/project/clinton1.shtml"&gt;human genome mapping&lt;/a&gt; will explain why these people actually come to power.&amp;nbsp;&lt;a href="http://uncyclopedia.wikia.com/wiki/Microsoft"&gt;&lt;b&gt;Microsoft&lt;/b&gt;&lt;/a&gt;&amp;nbsp;is judged a monopoly by District Judge &lt;b&gt;&lt;a href="http://en.wikipedia.org/wiki/Thomas_Penfield_Jackson"&gt;Thomas Penfield Jackson&lt;/a&gt;; &lt;/b&gt;&amp;nbsp;he orders &amp;nbsp;the company to be split up. (&lt;a href="http://twitter.com/billgates"&gt;&lt;strong&gt;Bill Gates&lt;/strong&gt;&lt;/a&gt; and Microsoft will go to the Supreme Court). In the meantime Bill starts touting the idea that &lt;a href="http://www.inquisitr.com/14906/windows-powered-coffee-maker-dont-laugh-its-real/"&gt;&lt;b&gt;your kitchen will become Microsoft powered&lt;/b&gt;&lt;/a&gt;. In Belgium &lt;a href="http://www.ibridge.be/?page_id=2"&gt;&lt;b&gt;a business intelligence specialist&lt;/b&gt;&lt;/a&gt;&amp;nbsp;wants to snatch that idea away from Bill and starts to hack in his basement in an attempt to out-code Microsoft. In less than no time, &lt;strong&gt;&lt;u&gt;he launches a series of products named kitchen, pan, spoon, kettle, menu, waiter&amp;nbsp;... that will shake the world.&lt;/u&gt;&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Yes, &lt;a href="http://kettle.pentaho.org/"&gt;&lt;strong&gt;kettle&lt;/strong&gt;&lt;/a&gt;, now known as &lt;a href="http://kettle.pentaho.com/"&gt;&lt;strong&gt;Pentaho Data Integration&lt;/strong&gt;&lt;/a&gt;, was originally intended as a kitchen automation system. 
Or did anyone actually believe that those kitchen terms were really intended as a metaphor for data integration technology? Come on, did you seriously buy that story? No no no, the real goal of kettle 0.1.3 - the first version I came across - was to be able to code &lt;a href="http://www.cdkitchen.com/recipes/"&gt;&lt;strong&gt;kitchen recipes&lt;/strong&gt;&lt;/a&gt; using spoon, and have kitchen execute them. A fully automated kitchen process, the perfect support for the modern working woman!&lt;br /&gt;&lt;br /&gt;For those who find this hard to believe, I have found an original .ktr from that period, known as the cake mix! This shows clearly what the original intentions for kettle were. The screenshot of the .ktr is (as is usually the case with a kettle transformation) completely self-explanatory.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TG7bNuyb6CI/AAAAAAAAAZY/duhfn5VzhxI/s1600/pic157.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="412" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TG7bNuyb6CI/AAAAAAAAAZY/duhfn5VzhxI/s640/pic157.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I've opened it using kettle 3.2.3. As amazing as it may seem, backward compatibility for PDI seems to go all the way back to that version 0.1.3. I even tried to add a piece of functionality at the end to invite the &lt;a href="http://pedroalves-bi.blogspot.com/2010/08/hey-pentaho-community-how-are-you-doing.html"&gt;&lt;b&gt;&lt;i&gt;very active&lt;/i&gt;&lt;/b&gt;&lt;/a&gt; &lt;a href="http://community.pentaho.com/"&gt;&lt;b&gt;Pentaho Community&lt;/b&gt;&lt;/a&gt; members to the cake party. It integrates perfectly. 
&lt;br /&gt;&lt;br /&gt;An interesting aspect to which I want&amp;nbsp;to draw your attention&amp;nbsp;is KQL, the Kitchen Query Language.&amp;nbsp;I had completely forgotten about this feature until I rediscovered this piece of code. KQL was a kind of GIS implementation at house level, or rather at kitchen level, allowing you to specify every shelf in your kitchen closets, fridge and yes, even basement. Ladies and gentlemen, we are (or were) talking 3-dimensional Kitchen GIS (BTW, the project name was KIS) here, and that more than&amp;nbsp;10 years ago. This query language actually unlocked all physical objects in your kitchen through a syntax very similar to SQL. Example screenshot below.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/THTrJ0qHhKI/AAAAAAAAAZ4/PvrKVLcXrq8/s1600/pic170.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="372" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/THTrJ0qHhKI/AAAAAAAAAZ4/PvrKVLcXrq8/s400/pic170.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;In retrospect, I don't know what most readers think about this, but I find the whole set-up of this project mind-blowing and extremely far ahead of its time. But unfortunately the project was aborted ...&amp;nbsp; &lt;br /&gt;&lt;br /&gt;So what happened to the original plans for kettle? Well, apparently there were a few issues with the business plan for kitchen automation, which made our &lt;strong&gt;&lt;a href="http://www.ibridge.be/?page_id=2"&gt;business intelligence specialist&lt;/a&gt;&lt;/strong&gt; rapidly change his plans. 
(For your information, Microsoft is also slowly discovering that &lt;a href="http://msftkitchen.com/"&gt;&lt;strong&gt;MS in every kitchen machine is not turning out to be a huge success&lt;/strong&gt;&lt;/a&gt;.)&lt;br /&gt;&lt;ul&gt;&lt;li&gt;One major issue with the concept was that by automating kitchen tasks you free up time for the people who do the kitchen work. No statistics are needed to show that (even today) it is mostly women that do the kitchen work. A consequence of all this free time was that more time was spent on shopping. This made the investment in the kettle kitchen automation system a very expensive adventure instead of the intended saving. Once&amp;nbsp;male investors understood this negative (perverse) effect of the kettle kitchen automation on their household income they massively abandoned the project. &lt;/li&gt;&lt;li&gt;&lt;a href="http://www.enotes.com/how-products-encyclopedia/refrigerator"&gt;&lt;strong&gt;Fridge APIs&lt;/strong&gt;&lt;/a&gt; turned out to be a true headache. Most of the electronics producers were investing a lot in the reduction of &lt;a href="http://ukqna.com/health/2912-health-ukqna.html"&gt;&lt;strong&gt;CFC&lt;/strong&gt;&lt;/a&gt; emissions, among other things through intelligent leak detection systems.&amp;nbsp;Because of the competitive edge&amp;nbsp;such systems could give them, most producers were not willing to open up access to the internals of their systems.&amp;nbsp;&amp;nbsp;&lt;/li&gt;&lt;/ul&gt;What happened finally is history. &lt;a href="http://www.ibridge.be/?page_id=2"&gt;&lt;strong&gt;Our guy&lt;/strong&gt;&lt;/a&gt; re-applied his coding to the field he was most familiar with, namely &lt;a href="http://www.kjube.be/"&gt;&lt;strong&gt;business intelligence and data integration&lt;/strong&gt;&lt;/a&gt;. He open sourced his efforts and &lt;a href="http://www.pentaho.com/"&gt;&lt;strong&gt;Pentaho&lt;/strong&gt;&lt;/a&gt; recognized his genius. 
Today every data integration specialist is moving&amp;nbsp;over from &lt;a href="http://www-01.ibm.com/software/data/infosphere/datastage/"&gt;&lt;strong&gt;Datastage&lt;/strong&gt;&lt;/a&gt;, &lt;a href="http://www.informatica.com/Pages/index.aspx"&gt;&lt;strong&gt;Informatica&lt;/strong&gt;&lt;/a&gt;, &lt;a href="http://www.sap.com/solutions/sapbusinessobjects/information-management/data-integration/dataintegrator/index.epx"&gt;&lt;strong&gt;BODI&lt;/strong&gt;&lt;/a&gt;, &lt;a href="http://www.oracle.com/technetwork/developer-tools/warehouse/overview/index.html"&gt;&lt;strong&gt;OWB&lt;/strong&gt;&lt;/a&gt;, ... to Pentaho Data Integration.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-171191058393379115?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/171191058393379115/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/09/kettle-history-lesson.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/171191058393379115'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/171191058393379115'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/09/kettle-history-lesson.html' title='kettle history lesson'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://2.bp.blogspot.com/_C0PnWJwDRZY/TG7bNuyb6CI/AAAAAAAAAZY/duhfn5VzhxI/s72-c/pic157.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-712435348058929337</id><published>2010-08-25T11:32:00.001+02:00</published><updated>2010-09-30T21:22:54.270+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - KFF'/><title type='text'>KFF environment configurator</title><content type='html'>&lt;strong&gt;While running the (parametrized) data integration code for different customers/environments, you need to carefully keep track&amp;nbsp;of which kettle.properties file you are using. The environment configurator&amp;nbsp;solves that&amp;nbsp;problem.&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;&lt;strong&gt;About kettle variables&lt;/strong&gt;&lt;/u&gt;&lt;br /&gt;The &lt;strong&gt;kettle.properties&lt;/strong&gt; file is well known to all kettle developers. For those of you who don't fit into that category (and for some strange reason still read this post): the file contains all environment variables that you want to use in jobs/transformations you execute through&amp;nbsp;either spoon or kitchen. The file sits in the .kettle directory under your home directory, that is ${HOME}/.kettle, and you can add whatever variables you want.&lt;br /&gt;&lt;br /&gt;How would you typically use these environment variables? A simple example will illustrate this. &lt;br /&gt;&lt;br /&gt;Suppose you use a standard directory for incoming flat file data you need to process.&amp;nbsp;In that case you can create a variable in the kettle.properties file for this directory.&lt;br /&gt;&lt;div&gt;&lt;blockquote&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;IN_FILE_DIR=/kff/projects/my_customer/my_application/data/in/&lt;/span&gt;&lt;/blockquote&gt;&lt;/div&gt;The variable would then be available in all kettle jobs and transformations you develop. 
In this case, for instance, in the &lt;a href="http://wiki.pentaho.com/display/EAI/Text+File+Input"&gt;&lt;b&gt;Text file input&lt;/b&gt;&lt;/a&gt;&amp;nbsp;step.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/THTg2BYvmfI/AAAAAAAAAZw/z56swCtvwUs/s1600/pic169.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="80" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/THTg2BYvmfI/AAAAAAAAAZw/z56swCtvwUs/s400/pic169.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Now if you are running projects for multiple customers, or you are running the same code on different environments (as in development, unit test, acceptance test, production ...) the input directory for your flat files will most likely be different. In order to keep your code generic you would obviously continue to work with the same variable IN_FILE_DIR, but there is only room for one kettle.properties file on your system.&lt;br /&gt;&lt;br /&gt;Some might say that you can solve this issue by using &lt;a href="http://wiki.pentaho.com/display/EAI/Named+Parameters"&gt;&lt;b&gt;named parameters&lt;/b&gt;&lt;/a&gt; which you can pass from the kitchen command line to your jobs/transformations. However, if the number of variables starts to grow, that becomes quite cumbersome to manage.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;u&gt;Environment configurator&lt;/u&gt;&lt;/strong&gt;&lt;br /&gt;In order to avoid manual switching between kettle.properties files&amp;nbsp;for the above scenarios&amp;nbsp;- as I assume many of you are doing - we implemented a little step which we have called the environment configurator. 
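&lt;br /&gt;&lt;br /&gt;To make the problem concrete, here is a minimal, purely illustrative sketch of two environment-specific properties files that define the same variables with different values. The file locations in the comment lines are hypothetical and assumed for illustration only; the values are borrowed from the example grid further down in this post.&lt;br /&gt;&lt;blockquote&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;# hypothetical file: /kff/projects/my_customer/my_application/config/DEV/kettle.properties&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;IN_FILE_DIR=/DEV/data/in/&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;DWH_SERVER=localhost&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;DWH_DATABASE=MySQL_DWH_DEV&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;# hypothetical file: /kff/projects/my_customer/my_application/config/PRD/kettle.properties&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;IN_FILE_DIR=/PRD/data/in/&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;DWH_SERVER=DWH_PRD&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;DWH_DATABASE=MySQL_DWH&lt;/span&gt;&lt;/blockquote&gt;Jobs and transformations keep referring to ${IN_FILE_DIR}; the only thing that changes per environment is which of these files gets loaded.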
&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/images/pic164.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="141" src="http://www.kjube.be/images/pic164.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Basically this step will read a kettle.properties file (which you can place in the location of your choice). It will&amp;nbsp;read the .properties file, parse it and set the appropriate environment variables. Any values that are located in your $HOME/.kettle/kettle.properties will be overwritten. &lt;br /&gt;&lt;br /&gt;Now if you insert the environment configurator as the first step in your data integration job, as we've done in the KFF-, you can control perfectly which variables will be used to execute your code.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.kjube.be/images/pic165.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://www.kjube.be/images/pic165.png" width="283" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Basically the environment configurator step does expect three input variables which are used throughout the &lt;a href="http://kff.kjube.be/"&gt;kettle franchising factory,&lt;/a&gt; our framework for management and&amp;nbsp;rapid deployment of (multiple) data integration solutions across customers and lifecycles (see more on &lt;a href="http://code.kjube.be/"&gt;Google Code&lt;/a&gt; about this framework)&lt;br /&gt;&lt;blockquote&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;KFF_CUSTOMER&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; The customer for which you are runing&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span 
class="Apple-style-span" style="font-size: small;"&gt;KFF_APPLICATION&amp;nbsp;&amp;nbsp; The application code you are running&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;KFF_LIFECYCLE&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; The environment (DEV, TST, UAT, PRD) you are runnnig&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&amp;nbsp;or you can specify the path to your kettle.properties file. &amp;nbsp;(Full details in the &lt;a href="http://code.google.com/p/kettle-franchise/wiki/PluginEnvironmentConfigurator?ts=1282727733&amp;amp;updated=PluginEnvironmentConfigurator#whb"&gt;&lt;b&gt;documentation&lt;/b&gt;&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;u&gt;Vision&lt;/u&gt;&lt;/strong&gt;&lt;br /&gt;For the moment we've released this kettle plugin as a quick hack for our own projects, to simplify our multi-customer/multi-environment set-up.&amp;nbsp;However&amp;nbsp;a lot of improvements and extension are&amp;nbsp;imaginable.&amp;nbsp;We would love to get the discussion going.&lt;br /&gt;&lt;br /&gt;Some food for thought:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;How many environments do you need? And what naming conventions could we use? Out of the top of my head I can think of the following, but we've only "implemented" the first 4.&lt;/li&gt;&lt;ul&gt;&lt;li&gt;DEV: development&lt;/li&gt;&lt;li&gt;TST:&amp;nbsp; unit test&lt;/li&gt;&lt;li&gt;UAI: user acceptance and integration&amp;nbsp;test&lt;/li&gt;&lt;li&gt;PRD: production&lt;/li&gt;&lt;li&gt;DRP: mirror of production for performance testing&lt;/li&gt;&lt;li&gt;...&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Shouldn't kettle slowly evolve in such a way that the&amp;nbsp;kettle.properties file&amp;nbsp;becomes part of your project code? 
Or maybe two levels of kettle.properties could exist: the original kettle.properties that contains variables for all your projects, and a project-specific one that contains project-specific variables. &lt;/li&gt;&lt;li&gt;If variables become project-specific, wouldn't it make sense to be able to edit them inside kettle, in a grid that shows the different lifecycle environments and the value you are using in each environment? As illustrated below:&lt;/li&gt;&lt;/ul&gt;&lt;div style="text-align: center;"&gt;&lt;b&gt;&lt;/b&gt;&lt;br /&gt;&lt;table border="1" style="text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;strong&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;Variable&lt;/span&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;strong&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;DEV&lt;/span&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;strong&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;TST&lt;/span&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;strong&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;UAI&lt;/span&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;strong&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;PRD&lt;/span&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span 
style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;IN_FILE_DIR&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;/DEV/data/in/&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;/TST/data/in/&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;/UAI/data/in/&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;/PRD/data/in/&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;DWH_SERVER&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;localhost&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;localhost&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span 
class="Apple-style-span" style="font-size: small;"&gt;DWH_UAI&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;DWH_PRD&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;DWH_DATABASE&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;MySQL_DWH_DEV&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;MySQL_DWH_TST&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;MySQL_DWH&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;MySQL_DWH&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;...&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: 
left;"&gt;&lt;/div&gt;&lt;/td&gt;&lt;td&gt;&lt;div style="text-align: left;"&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;strong&gt;&lt;u&gt;Get it&lt;/u&gt;&lt;/strong&gt;&lt;br /&gt;You can get your copy of the environment configurator&amp;nbsp;on &lt;b&gt;&lt;a href="http://code.kjube.be/"&gt;Google Code&lt;/a&gt;,&lt;/b&gt; in the downloads section.&amp;nbsp;&amp;nbsp;All feedback is appreciated. We are very curious to see which use cases you will find for this plug-in.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-712435348058929337?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/712435348058929337/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/08/kff-environment-configurator.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/712435348058929337'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/712435348058929337'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/08/kff-environment-configurator.html' title='KFF environment configurator'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_C0PnWJwDRZY/THTg2BYvmfI/AAAAAAAAAZw/z56swCtvwUs/s72-c/pic169.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-5094780693982117293</id><published>2010-08-25T08:16:00.002+02:00</published><updated>2010-09-30T21:23:13.727+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - Kettle'/><title type='text'>What's cooking backwards compatible?</title><content type='html'>&lt;a href="http://rpbouman.blogspot.com/2010/08/back-to-blogging.html"&gt;&lt;b&gt;Roland Bouman just released the "kettle-cookbook"&lt;/b&gt;&lt;/a&gt; on &lt;a href="http://code.google.com/p/kettle-cookbook"&gt;&lt;b&gt;Google Code&lt;/b&gt;&lt;/a&gt;. At first, if you look at the name, and are a bit used to the kettle kitchen terminology (kettle, kitchen, pan, spoon, chef, ...), you might expect this to be some manual on how to cook up the best data integration jobs and transformations using kettle. But it is something different. The cookbook is an &lt;b&gt;auto documentation tool for kettle&lt;/b&gt; jobs and transformations.&lt;br /&gt;&lt;br /&gt;Since &lt;a href="http://code.kjube.be/"&gt;&lt;b&gt;we&lt;/b&gt;&lt;/a&gt; have just released a new project with over 300 jobs and transformations, we had been looking into documentation ourselves, but it seems Roland beat us to it. Time to give the cookbook a try (and see whether we can contribute).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Step 1: Installing the cookbook&lt;/b&gt;&lt;br /&gt;... equals unzipping the code into a directory.&amp;nbsp;Since our standard set-up is to have all the re-usables under the same directory, namely /kff/reusable, I unzipped the cookbook under /kff/reusable/cookbook. 
No pain here.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Step 2: Running the cookbook against a directory of transformations/jobs&lt;/b&gt;&lt;br /&gt;I have a &lt;a href="http://code.google.com/p/kettle-franchise/wiki/TemplateDatawarehouseProject"&gt;data warehousing project template&lt;/a&gt; which uses the following root folder:&lt;br /&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;/kff/projects/templates/datawarehouse/code/ &amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;with the following subdirectory structure and dummy jobs.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;/kff/p../code/pre &amp;nbsp; pre-processing&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;/kff/p../code/stg &amp;nbsp; jobs to load staging area of the DWH&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;/kff/p../code/ods &amp;nbsp; jobs to load ODS area of the DWH&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;/kff/p../code/dwh &amp;nbsp; jobs to load multidimensional DWH&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;/kff/p../code/pst &amp;nbsp; post-processing&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;div&gt;So I pointed the cookbook at the &lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;"&lt;span class="Apple-style-span" style="font-size: small;"&gt;/kff/p../code/&lt;/span&gt;"&lt;/span&gt;&lt;/span&gt; directory to get a full documentation of my template project.&lt;br /&gt;&lt;blockquote&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;jaertsen@Jaybox:/kff/software/pdi/3.2.3$ sh kitchen.sh -file:/kff/reusable/cookbook/pdi/document-all.kjb -param:"INPUT_DIR"=/kff/projects/templates/datawarehouse/code/ -param:"OUTPUT_DIR"=/kff/projects/templates/datawarehouse/doc/&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;/blockquote&gt;What did I get? Only documentation for the jobs and transformations in the root folder?&amp;nbsp;What happened here?&lt;br /&gt;&lt;br /&gt;After some digging it seems that the first step in the "get-kettle-job-and-trans-files-from-directory.ktr" transformation is a "&lt;a href="http://wiki.pentaho.com/display/EAI/Get+File+Names"&gt;Get File Names&lt;/a&gt;" step. That step differs between the 3.2.x version - which most of our customers are still on - and the 4.0.0 version, released recently. Basically, there is a flag "include subdirectories" in 4.0.0 version, which wasn't there before. 
So much for backwards compatibility, but that explains my issue.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/TGr_qkhLb6I/AAAAAAAAAY4/ACiw2qe1bfo/s1600/pic153.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="247" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/TGr_qkhLb6I/AAAAAAAAAY4/ACiw2qe1bfo/s400/pic153.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;So, even though I have no intention of upgrading any of my customers yet, I ran the cookbook with version 4.0.0.&lt;br /&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;jaertsen@Jaybox:/kff/software/pdi/4.0.0$ sh kitchen.sh -file:/kff/reusable/cookbook/pdi/document-all.kjb -param:"INPUT_DIR"=/kff/projects/templates/datawarehouse/code/ -param:"OUTPUT_DIR"=/kff/projects/templates/datawarehouse/doc/&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;... and problem solved.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;With great thanks to Roland for a wonderful documentation tool. 
I'll suggest a modification on Google Code for backwards compatibility.&lt;/b&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-5094780693982117293?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/5094780693982117293/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/08/whats-cooking-and-backwards.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/5094780693982117293'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/5094780693982117293'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/08/whats-cooking-and-backwards.html' title='What&apos;s cooking backwards compatible?'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_C0PnWJwDRZY/TGr_qkhLb6I/AAAAAAAAAY4/ACiw2qe1bfo/s72-c/pic153.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-3992599386323267632</id><published>2010-08-13T11:19:00.016+02:00</published><updated>2010-09-30T21:23:27.550+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - KFF'/><title type='text'>The Data Grid Step</title><content type='html'>&lt;div&gt;Half a year ago I was happy to be involved in helping the folks from kJube get a project on the road.  
As the head of data integration at Pentaho I'm very happy to very occasionally be involved in the die-hard work that goes along with the implementation of BI projects because it allows me to touch base with the real world beyond the world of case tracking systems and software architectural dreams.  If you can help a friend out at the same time it makes the occasion even better.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;At one point it became clear that defining small data sets in Pentaho Data Integration (Kettle) was not as easy as it should be. Obviously it's nice to sit in an ivory tower and say that all information should be defined externally in a file or database table. However, setting up an XML file for a few lines of data is in a lot of cases over the top.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The question also is this: do you want to have ETL related information close to the ETL user or inside some external data source? Is it easier to manage the few lines of data in a user interface like Spoon or in an XML file somewhere? Answering those questions reveals your preference. In any case, as a result of the discussions and idea exchanges with kJube we decided that there was a need for a new "Data Grid" step in Kettle that would allow the ETL user to enter a few lines of constant data in an easy and clear way.  Since ideas are often more valuable than a few lines of code, the step emerged and was thrown into the project quickly thereafter.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The use-case is the following: you have a number of company subsidiaries that all use the same database structure on different systems.  Because all database structures are the same for these companies we want to create a loop in Kettle.  
Here is a screenshot from the actual project:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://4.bp.blogspot.com/_cHg9rXowGtw/TGUSk5TVOUI/AAAAAAAAAWo/KaaJxskkhjU/s1600/loop-job.png"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5504826544596793666" src="http://4.bp.blogspot.com/_cHg9rXowGtw/TGUSk5TVOUI/AAAAAAAAAWo/KaaJxskkhjU/s400/loop-job.png" style="cursor: pointer; display: block; height: 217px; margin-bottom: 10px; margin-left: auto; margin-right: auto; margin-top: 0px; text-align: left; width: 400px;" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;The transformation under "SMORDERCUSLOOP" generates a number of result rows.  These rows drive the loop in the subsequent jobs, which are executed in parallel.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So we need a semi-fixed list of companies and systems, and we can choose to define these in an external file or database or, as is the case here, in a "Data Grid" step:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_cHg9rXowGtw/TGUTMJwQYJI/AAAAAAAAAWw/MQc_1YjI4zc/s1600/data-grid.png"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5504827219027976338" src="http://4.bp.blogspot.com/_cHg9rXowGtw/TGUTMJwQYJI/AAAAAAAAAWw/MQc_1YjI4zc/s400/data-grid.png" style="cursor: pointer; height: 151px; width: 400px;" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As you can see, defining the set of constant rows is very easy with the new "Data Grid" step.  In fact, everyone involved liked this step so much that it was contributed to the Kettle project.  It is available in version 4 and upward in the "Input" category.   
For version 3.2 projects we have made the source code available under the &lt;a href="http://code.google.com/p/kettle-franchise/source/browse/#svn/trunk/plugins/v3"&gt;&lt;b&gt;Kettle Franchise Factory&lt;/b&gt;&lt;/a&gt; (KFF) project umbrella.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Jan &amp;amp; I are going to release more software that resulted from our adventures under the KFF umbrella.  Follow this blog to learn more about &lt;a href="http://code.kjube.be/"&gt;&lt;b&gt;KFF&lt;/b&gt;&lt;/a&gt; and/or more plugins in the near future.  We'll also be more than happy to share project experiences at the upcoming &lt;a href="http://forums.pentaho.com/showthread.php?77165-CGP-Community-Gathering-in-Portugal-2010-Event-Details"&gt;&lt;b&gt;Pentaho Community Event&lt;/b&gt;&lt;/a&gt; in Lisbon, Portugal.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Matt&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-3992599386323267632?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/3992599386323267632/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/08/data-grid-step.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3992599386323267632'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3992599386323267632'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/08/data-grid-step.html' title='The Data Grid Step'/><author><name>Matt Casters</name><uri>http://www.blogger.com/profile/12263548900215476529</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp2.blogger.com/_cHg9rXowGtw/R6DPxIh4iEI/AAAAAAAAAAQ/a2PKY2431ys/S220/MattGardenSmall.png'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_cHg9rXowGtw/TGUSk5TVOUI/AAAAAAAAAWo/KaaJxskkhjU/s72-c/loop-job.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-8282633151617404355</id><published>2010-08-12T17:59:00.002+02:00</published><updated>2010-09-30T21:20:18.509+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - Kettle'/><title type='text'>Time for a time-out</title><content type='html'>When you open kettle (sorry, Pentaho Data Integration), it politely re-opens the transformations and jobs you had open the last time you were working. A nice feature. However, when kettle tries to open jobs/transformations with connections to a server that isn't available anymore (because I shut down my VPN or switched location), it will try to find that server. And that can take a long time.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Example&lt;/i&gt;: Below I opened kettle with 3 jobs 'remembered'. Those jobs each contain 4 Oracle connections and 1 AS/400 connection. All connections are shared. The time that passes between launching kettle and getting the three error boxes is about &lt;a href="http://en.wikipedia.org/wiki/Mississippi"&gt;54 Mississippi&lt;/a&gt;&amp;nbsp;&amp;nbsp;:-). In the meantime you cannot do anything in the kettle interface. It doesn't react to anything until all error messages are on the screen. 
Knowing that I sometimes work on over 10 jobs/transformations, that sometimes makes for a &lt;a href="http://en.wikipedia.org/wiki/Mississippi_River"&gt;very long Mississippi&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;In case you are confronted with this issue from time to time, the trick to avoid this kind of behaviour is to ensure that you have no network connection at all when you start up PDI. Unplug the cable or disable your wifi for a second, and that'll solve the issue. No more frustrated waiting.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TGQP1rMzt2I/AAAAAAAAAYY/bIAj8fmm-tM/s1600/pic144.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TGQP1rMzt2I/AAAAAAAAAYY/bIAj8fmm-tM/s640/pic144.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;In the meantime I have posted a &lt;a href="http://jira.pentaho.com/browse/PDI-4434"&gt;&lt;b&gt;Jira&lt;/b&gt;&lt;/a&gt; to ask the folks at &lt;b&gt;&lt;a href="http://www.pentaho.com/"&gt;Pentaho&lt;/a&gt;&lt;/b&gt; to make sure the time-out happens a little bit sooner. 
A minor bug, in &lt;a href="http://www.gartner.com/technology/media-products/reprints/oracle/article109/article109.html"&gt;one of the major data integration tools&lt;/a&gt;&amp;nbsp;[I tend to disagree with the Gartner boyz]:-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-8282633151617404355?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/8282633151617404355/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/08/time-for-time-out.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8282633151617404355'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8282633151617404355'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/08/time-for-time-out.html' title='Time for a time-out'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_C0PnWJwDRZY/TGQP1rMzt2I/AAAAAAAAAYY/bIAj8fmm-tM/s72-c/pic144.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-6465358288601389292</id><published>2010-08-10T15:39:00.005+02:00</published><updated>2010-09-30T21:41:14.165+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration'/><title type='text'>Release 4.666 (aka bitter pain edition)</title><content type='html'>As software grows, 
&lt;b&gt;software grows old&lt;/b&gt;. Time has an impact on everything. It is inevitable, even for something as intangible as software. Sooner or later the beautiful piece of software you are working with today will start to fall apart. Maybe the aging will start to show in the user interface. Little wrinkles will show that the &lt;b&gt;GUI&lt;/b&gt; isn't able to keep up with 'fashion'. Maybe you'll notice the product getting a little slow in understanding, having a hard time grasping the&lt;b&gt; latest&amp;nbsp;&lt;/b&gt; &lt;b&gt;standards, trends and concepts&lt;/b&gt;. &lt;br /&gt;&lt;br /&gt;But the real aging of software lies under the skin, in &lt;b&gt;the code&lt;/b&gt; that gets clogged up with more and more lines of extra code where maybe a re-write was in order. Quick fixes, not exactly written as they should have been, but that no one dares to touch again afterward. Little modifications in the code to close that sales deal with customer x. Some extra lines to make everything run smoothly on platform y. An extra call to avoid a crash if z occurs. If you have been involved in software development, you know this is all inevitable.&lt;br /&gt;&lt;br /&gt;So sooner or later any software needs to be replaced. There is no way around that. &lt;br /&gt;&lt;br /&gt;I believe it is clear that &lt;b&gt;many of the existing (proprietary) BI/DWH tools on the market have reached the point where their product needs (or needed) a rewrite&lt;/b&gt;. Some examples include the following.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;b&gt;Business Objects&lt;/b&gt; have rewritten their immensely popular reporting product from the ground up. Business Objects 6.5 was the last "old release", Business Objects XI is the new release.&lt;/li&gt;&lt;li&gt;IBM have (finally) rewritten Ascential's &lt;b&gt;Datastage&lt;/b&gt; (after their acquisition of the product at the beginning of this century). They have integrated it into the IBM Websphere family. 
Datastage 7.5 was the last release of the old breed.&amp;nbsp; Infosphere Datastage 8.0 is the new product release. &lt;/li&gt;&lt;li&gt;&lt;b&gt;Oracle&lt;/b&gt; is coming up soon with a new version of their data integration tool. &lt;b&gt;Oracle Warehouse Builder&lt;/b&gt; will cease to exist after 11g and will be replaced with Oracle Data Integrator (a cross over between OWB and Sunopsis).&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;There are probably a few more examples out there, but already these three share some striking similarities:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;All of the existing products have been around since before the year 2000, became very popular, even &lt;b&gt;market leaders&lt;/b&gt; in the BI or data integration market segment, and gained a large customer base;&lt;/li&gt;&lt;li&gt;In each case the vendor has done a complete or significant product rewrite in order to ensure they have a product that can keep up with market demands;&lt;/li&gt;&lt;li&gt;Marketing wise, none of the vendors really chose to position the new product as such on the market; they all positioned the new software as an &lt;b&gt;upgrade&lt;/b&gt; of the existing product. IBM chose to maintain the product name and even the version numbering, Business Objects kept the name and went from version 6.5 to XI. Oracle positions the rewrite of OWB as a merging of the best features of OWB and Sunopsis, while in reality not much will be left of OWB;&lt;/li&gt;&lt;li&gt;Last but not least, in all cases the 'upgrade' scenario doesn't boil down to an actual upgrade, but rather results in a &lt;b&gt;costly (and painful) migration scenario&lt;/b&gt; to new software;&lt;/li&gt;&lt;li&gt;In all too many cases &lt;b&gt;customers blindly accepted&lt;/b&gt; the upgrade and paid the costs.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Indeed, many customers have invested so much into these technologies that they cannot or dare not imagine a life without this software. 
But &lt;b style="color: #4c1130;"&gt;are these customers truly locked in&lt;/b&gt; by the years and years of investment in this software, IF the upgrade really is a migration? If you need to migrate, why not migrate to another platform? &lt;br /&gt;&lt;br /&gt;Almost 2 years ago we published a white paper (that is still &lt;i&gt;&lt;b&gt;&lt;a href="https://docs.google.com/fileview?id=0B7pEch_luF0xZWYwMTIxYmItYzZkMS00NzFiLTg5YzAtNWYwNjQ0YjFhNjQz&amp;amp;hl=en_GB&amp;amp;utm_source=kJube&amp;amp;utm_medium=Blog&amp;amp;utm_campaign=Open%2BSource%2BData%2BIntegration%2BChallenge"&gt;available on our website&lt;/a&gt;&lt;/b&gt;&lt;/i&gt;). This paper showed the cost differences between proprietary (Oracle, Microsoft) and open source (Pentaho) data integration software. (( I believe we were slightly ahead of &lt;i&gt;&lt;b&gt;&lt;a href="http://kjube.blogspot.com/2010/05/lower-costs-with-open-source-bi.html"&gt;Mark Madsen&lt;/a&gt;&lt;/b&gt;&lt;/i&gt;, although our study was only a three pager of course&amp;nbsp; ;) )).&amp;nbsp; Anyhow, in this study we also discussed the migration scenario. (Read the &lt;i&gt;&lt;b&gt;&lt;a href="https://docs.google.com/fileview?id=0B7pEch_luF0xZWYwMTIxYmItYzZkMS00NzFiLTg5YzAtNWYwNjQ0YjFhNjQz&amp;amp;hl=en_GB&amp;amp;utm_source=kJube&amp;amp;utm_medium=Website&amp;amp;utm_campaign=Open%2BSource%2BData%2BIntegration%2BChallenge&amp;amp;authkey=CNG6utAF"&gt;white paper&lt;/a&gt;&lt;/b&gt;&lt;/i&gt; if you want the details.)&lt;br /&gt;&lt;br /&gt;Basically we discussed three scenarios:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Scenario 1: &lt;b&gt;Remain closed source&lt;/b&gt;, in other words, stay with your proprietary vendor and pay for the extra/new licenses you might need in the future. That means no need to rework anything. Pay license costs where needed, and build new functionality as you need it.&lt;/li&gt;&lt;li&gt;Scenario 2: &lt;b&gt;Go open source&lt;/b&gt;. Rip and replace! 
Remove your proprietary solution and replace it completely with open source. Throw away your old licenses (kind of a cost saver) but rewrite everything you have (or still need).&lt;/li&gt;&lt;li&gt;Scenario 3: &lt;b&gt;Have both&lt;/b&gt;. Put open source next to the proprietary solution for the extra/new functionality and go for a slow migration. You are stuck with your license cost, but will not buy new functionality (you remain on the old version of the software). In the meantime you deploy open source next to it.&lt;/li&gt;&lt;/ul&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TGEhyNRAUbI/AAAAAAAAAYQ/CoYOA6cvSrc/s1600/chart2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="207" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TGEhyNRAUbI/AAAAAAAAAYQ/CoYOA6cvSrc/s400/chart2.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Basically the conclusion back then was that often &lt;b&gt;the best scenario was to go for open source alongside closed source because a rip and replace was too expensive&lt;/b&gt;. The migration cost for rewriting your ETL jobs (or reports) - that is, the effort of your IT resources (or even off-shore resources) -&amp;nbsp; makes a business case for rip and replace very expensive (in the short term). In the long run, rip and replace is the cheapest, but where's the CIO who will go for a project that has the best ROI only after 5 years?&lt;br /&gt;&lt;br /&gt;I believe it is clear to everyone that &lt;b&gt;in the case of the "painful upgrade", the above charts change a bit&lt;/b&gt;. Because the upgrade cost really is a migration cost, we'll see the red curve as well as the purple&amp;nbsp; shifting up in the first two years. Remaining with your proprietary vendor will cost a big effort from your IT resources, just as migrating to new software would. 
&lt;b&gt;In that case, clearly option number two, rip and replace, becomes the best scenario.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="color: #4c1130;"&gt;So, please, if you are one of those customers who feel cornered by a vendor proposing a product upgrade that is actually the release from hell, consider a real migration as a serious alternative.&lt;/span&gt;&lt;/b&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-6465358288601389292?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/6465358288601389292/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/08/release-4666-aka-bitter-pain-edition.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6465358288601389292'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6465358288601389292'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/08/release-4666-aka-bitter-pain-edition.html' title='Release 4.666 (aka bitter pain edition)'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_C0PnWJwDRZY/TGEhyNRAUbI/AAAAAAAAAYQ/CoYOA6cvSrc/s72-c/chart2.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-7596855007906079071</id><published>2010-08-09T18:01:00.002+02:00</published><updated>2010-09-30T21:34:31.071+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence'/><title type='text'>The one billion dollar query</title><content type='html'>... or &lt;strong&gt;the cost of bad data warehouse design.&lt;/strong&gt;&lt;br /&gt;&lt;span style="font-size: x-small;"&gt;(last week an IT resource of one of our customers told me this story.)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;For some time now we have been working with a customer to improve their data warehouse architecture. The top management of this company has been ignoring the need for data warehouse solutions for at least 10 years. Sure, investments have been made, but always on a departmental or project basis. No one ever bothered looking at the big picture. Different data warehouse architects were present (or not)&amp;nbsp;over time, each using different design techniques. All initiatives&amp;nbsp;have resulted in so-called "information silos". The consequence is a heterogeneous mix of copies of operational tables, the thing called 'operational data store', data warehouses&amp;nbsp;of different types and OLAP cubes. The whole of this runs on an IBM mainframe combined with a Windows Server running SAS 9.1.&lt;br /&gt;&lt;br /&gt;The result of this &lt;strong&gt;&lt;span style="color: #4c1130;"&gt;data warehouse back-end fiasco&lt;/span&gt;&lt;/strong&gt; is that end-users within the company have taken things into their own hands and started building their own solutions. Some use Excel to elaborate complex analytical constructions, others have gained near-admin access to the SAS reporting server(s) so they can actually build their own OLAP cubes and reports. 
IT has lost control of what is going on and currently IT hardly has any feel for what the real business intelligence needs are.&lt;br /&gt;&lt;br /&gt;For some time now we've been trying to inventory all the problems and issues that exist&amp;nbsp;today,&amp;nbsp;and what they cost the company. A hard task, as many of the costs are hidden. The following&amp;nbsp;"cost example" was a little gem that showed up just because one of the IT guys (girl actually) - who has put a lot of effort&amp;nbsp;into&amp;nbsp;listing all the issues and costs -&amp;nbsp;bothered to&amp;nbsp;manually monitor user activity for some time.&lt;br /&gt;&lt;br /&gt;Apparently what happened is that a&amp;nbsp;user had been running a&amp;nbsp;query, through QMF,&amp;nbsp;on a part of the data warehouse with a particularly bad data model design.&amp;nbsp;The result was &lt;span style="color: #7f6000;"&gt;&lt;strong&gt;&lt;span style="color: #4c1130;"&gt;a query that used up around 30.000 CPU seconds, running in a job class with top priority&lt;/span&gt;&lt;/strong&gt; &lt;/span&gt;&lt;span style="color: black;"&gt;and no limitations (in other words, the mainframe administrator&lt;/span&gt; believed&amp;nbsp;that that job actually had the right to consume all those resources), so if she hadn't stopped it, it would have continued to run (until the user bothered to end it).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"&gt;JOBNAME&amp;nbsp;&amp;nbsp; STEPNAME&amp;nbsp; PROCSTEP&amp;nbsp; JOBID&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; OWNER&amp;nbsp;&amp;nbsp;&amp;nbsp; C&amp;nbsp; SIO&amp;nbsp;&amp;nbsp;&amp;nbsp; CPU%&amp;nbsp;&amp;nbsp;&amp;nbsp; CPU-TIME &lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"&gt;D003359Q&amp;nbsp; DB2CQMFB&amp;nbsp; DB2CQMFB&amp;nbsp; JOB12332&amp;nbsp; D003359&amp;nbsp; R&amp;nbsp; 0.00&amp;nbsp;&amp;nbsp; 
15.71&amp;nbsp;&amp;nbsp;&amp;nbsp;29112.05&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I guess most of you, even those who&amp;nbsp;never work on a mainframe,&amp;nbsp;realize that mainframe computing costs are high. So does the customer. They already moved from an "individual mainframe" to a 'hosted mode' in order to&amp;nbsp;save&amp;nbsp;costs. And still their infrastructure costs are high. &lt;br /&gt;&amp;nbsp; &lt;br /&gt;Anyhow, to make a long story short, I called the hosting company to inquire what the approximate&amp;nbsp;cost of this particular&amp;nbsp;query might be -&amp;nbsp;a query which ironically never returned any result because we had it&amp;nbsp;killed. I'm not naming any figures, but the number I was given was as high as &lt;strong&gt;&lt;u&gt;&lt;span style="color: #4c1130;"&gt;the net monthly salary of the IT analyst&lt;/span&gt;&lt;/u&gt;&lt;/strong&gt; who discovered the issue. &lt;br /&gt;&amp;nbsp; &lt;br /&gt;Some inquiries with the users taught us that they actually launch this (type of) query regularly at the end of the month. They estimated somewhere between 5 and 10 of these queries per month. I leave the maths up to you! 
Anyhow, I believe this to be good material to go and talk to the management of this company and tell them a story about &lt;strong&gt;&lt;span style="color: #741b47;"&gt;&lt;a href="http://www.opensourcebusinessintelligence.be/sit/index.php?section=12"&gt;the benefits of good data modeling and architecture design&lt;/a&gt;&lt;/span&gt;&lt;/strong&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-7596855007906079071?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/7596855007906079071/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/08/one-billion-dollar-query.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7596855007906079071'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7596855007906079071'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/08/one-billion-dollar-query.html' title='The one billion dollar query'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-9043417995203471956</id><published>2010-08-06T13:05:00.002+02:00</published><updated>2010-09-30T21:23:50.090+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><title type='text'>Visiting customers ...</title><content type='html'>&lt;div class="mobile-photo"&gt;&lt;a 
href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TFvsalvN00I/AAAAAAAAAYI/hTKeu4FE5Ao/s1600/IMAG0413-713889.jpg"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5502251311314096962" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TFvsalvN00I/AAAAAAAAAYI/hTKeu4FE5Ao/s320/IMAG0413-713889.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;... should not exclude fun.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-9043417995203471956?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/9043417995203471956/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/08/visiting-customers.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/9043417995203471956'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/9043417995203471956'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/08/visiting-customers.html' title='Visiting customers ...'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_C0PnWJwDRZY/TFvsalvN00I/AAAAAAAAAYI/hTKeu4FE5Ao/s72-c/IMAG0413-713889.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-1099889499202220374</id><published>2010-07-31T00:16:00.001+02:00</published><updated>2010-09-30T21:19:54.930+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun 
and fail'/><title type='text'>The holy war: iPhone vs Android</title><content type='html'>&lt;div style="text-align: justify;"&gt;While I'm still in Italy, enjoying the sun, I couldn't resist copying Umberto Eco's famous article on &lt;a href="http://www.themodernword.com/eco/eco_mac_vs_pc.html"&gt;Mac vs Dos&lt;/a&gt;. Not very original of me. Even more so: after modifying some of the first lines to make the whole thing a little more current, I just gave up trying to be as witty as Umberto Eco. He out-intellectuals me 100.000 times even while he sleeps. So if you don't like this, just conclude I took too much sun.&lt;br /&gt;&lt;br /&gt;--&lt;br /&gt;&lt;br /&gt;Dear friends, earthlings, gadget lovers, nerds and freaks, I guess it's finally time to reach a final decision on what has kept us busy for many days, weeks and, for some of you, months. Is the earth flat, almost flat, a bit round, kind of a ball or a perfect &lt;a href="http://en.wikipedia.org/wiki/Sphere"&gt;sphere&lt;/a&gt;? Are we better off without a government, as &lt;a href="http://en.wikipedia.org/wiki/Thomas_Hobbes"&gt;Hobbes&lt;/a&gt; said (and many &lt;a href="http://en.wikipedia.org/wiki/2007%E2%80%932010_Belgian_political_crisis"&gt;Belgians experience daily&lt;/a&gt;), or is Hobbes just a &lt;a href="http://en.wikipedia.org/wiki/Calvin_and_Hobbes"&gt;tiger&lt;/a&gt; that comes alive when another imaginary character thinks he does? Who's really the &lt;a href="http://en.wikipedia.org/wiki/Vladimir_Putin"&gt;president of Russia&lt;/a&gt;, has he been gone, or &lt;a href="http://en.wikipedia.org/wiki/I%27ll_be_back"&gt;will he be back&lt;/a&gt; and is anyone really waiting for him? Do the iPod and/or kindle kill iBooks and video on demand, or do they rather fuel them? Will Twitter replace the phone? 
Do computers kill inspiration, or do they just inspire you to copy copy copy &lt;a href="http://umbertoecoreaders.blogspot.com/2007/11/holy-war-mac-vs-dos.html"&gt;what Umberto wrote&lt;/a&gt;?&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;One can continue with: whether &lt;a href="http://fi.wikipedia.org/wiki/Nostradamus"&gt;Nostradamus&lt;/a&gt; was a terrorist; whether &lt;a href="http://bs.wikipedia.org/wiki/Barack_Obama"&gt;Obama&lt;/a&gt; will start driving an electric car or whether America will introduce&amp;nbsp;genetically modified,&amp;nbsp;petrol&amp;nbsp;fueled fish and seabirds. Will more &lt;a href="http://en.wikipedia.org/wiki/Italian_people#Autochthonous_communities_outside_Italy"&gt;Italians migrate to Belgium&lt;/a&gt; now that not only the &lt;a href="http://it.wikipedia.org/wiki/Paola_Ruffo_di_Calabria"&gt;queen&lt;/a&gt; but also the &lt;a href="http://fr.wikipedia.org/wiki/Elio_Di_Rupo"&gt;premier&lt;/a&gt; will be of Italian origin?&amp;nbsp;Insufficient consideration has been given to the new underground religious war which is modifying the modern world. It's an old idea of mine, but I find that whenever I tell people about it they immediately agree with me.&lt;/div&gt;&lt;br /&gt;The fact is that the world is starting to get divided between users of the &lt;a href="http://www.apple.com/iphone/"&gt;iPhone&lt;/a&gt; and users of &lt;a href="http://www.android.com/"&gt;Android&lt;/a&gt; phones. I am firmly of the opinion that the iPhone is Catholic and that Android is Protestant.&amp;nbsp;Indeed, the iPhone is counter-reformist and has been influenced by the&amp;nbsp;&lt;i&gt;&lt;a href="http://www.blogger.com/goog_2032346641"&gt;ratio studiorum&amp;nbsp;&lt;/a&gt;&lt;/i&gt;&lt;a href="http://en.wikipedia.org/wiki/Ratio_Studiorum"&gt;of the Jesuits&lt;/a&gt;. 
It is cheerful, friendly, conciliatory; it tells the faithful how they must proceed step by step to reach -- if not the kingdom of Heaven -- the moment in which their document is printed. It is catechistic: The essence of revelation is dealt with via simple formulae and sumptuous (although beautiful) icons. Everyone has a right to salvation.&lt;br /&gt;&lt;br /&gt;The Linux powered phone is Protestant, or even Calvinistic. It allows free interpretation of scripture, demands difficult personal decisions, imposes a subtle hermeneutics upon the user, and takes for granted the idea that not all can achieve salvation. To make the system work you need to interpret the program yourself: Far away from the baroque community of revelers, the user is closed within the loneliness of his own inner torment.&lt;br /&gt;&lt;br /&gt;You may object that, with the passage to Android, the Linux power phone universe has come to resemble more closely the counter-reformist tolerance of the iPhone. It's true: Android represents an Anglican-style schism, big ceremonies in the cathedral, but there is always the possibility of a return to Linux to change things in accordance with bizarre decisions: When it comes down to it, you can decide to ordain women and gays if you want to.&lt;br /&gt;&lt;br /&gt;Naturally, the Catholicism and Protestantism of the two systems have nothing to do with the cultural and religious positions of their users. One may wonder whether, as time goes by, the use of one system rather than another leads to profound inner changes. Can you use Android and still be a fan of &lt;a href="http://www.geeksugar.com/Megan-Fox-Wears-Star-Wars-T-Shirt-Carries-White-iPhone-1791588"&gt;Megan Fox&lt;/a&gt;? And more: Would Cicero have communicated using &lt;a href="http://seesmic.com/"&gt;Seesmic&lt;/a&gt;, &lt;a href="http://www.hootsuite.com/"&gt;HootSuite&lt;/a&gt; or&amp;nbsp;&lt;a href="http://twidroyd.com/"&gt;Twitdroid&lt;/a&gt;? 
Would Descartes have programmed for the iPhone store or for the Android market?&lt;br /&gt;&lt;br /&gt;And machine code, which lies beneath and decides the destiny of both systems (or environments, if you prefer)? Ah, that belongs to the Old Testament, and is talmudic and cabalistic. The Jewish lobby, as always....&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-1099889499202220374?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/1099889499202220374/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/07/holy-war-iphone-vs-android.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/1099889499202220374'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/1099889499202220374'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/07/holy-war-iphone-vs-android.html' title='The holy war: iPhone vs Android'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-5285219069114084001</id><published>2010-07-30T07:11:00.016+02:00</published><updated>2010-09-30T21:24:03.846+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - KFF'/><title type='text'>Converting AS/400 (RPG) dates using kettle</title><content type='html'>I'm not an RPG programmer. 
I don't even have a basic understanding of RPG; however, as a BI/DWH architect I have come across a few AS/400 applications written in RPG. A recurring pattern in them is to split date and date/time fields into separate numeric fields.&lt;br /&gt;&lt;br /&gt;To store a date/time, a common practice in RPG seems to be to create 7 numeric fields of at most 2 digits each. Example:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: small;"&gt;DTCRCA8: Record creation date/time - century part&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: small;"&gt;DTCRYA8: Record creation date/time - year part&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: small;"&gt;DTCRMA8: Record creation date/time - month part&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: small;"&gt;DTCRDA8: Record creation date/time - day part&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: small;"&gt;HRCRHA8: Record creation date/time - hour part&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: small;"&gt;HRCRMA8: Record creation date/time - minutes part&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace; font-size: small;"&gt;HRCRSA8: Record creation date/time - seconds part&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;I came across this type of date structure in an AS/400 database and needed to read those columns and transform them into a date format that could be stored in a MySQL or Oracle database.&lt;br 
/&gt;&lt;br /&gt;Reading out this information isn't very hard using kettle. Connecting to an AS/400 system is standard connectivity in PDI, and a JavaScript step with some simple functions will take care of merging the seven fields into one date field. However, a number of data quality issues can arise with this type of date structure on AS/400, and that is where the coding becomes tedious.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Obviously, with the database fields being integers, any value could occur in these fields.&lt;/li&gt;&lt;ul&gt;&lt;li&gt;You could have values in the century field ranging anywhere between 0 and 99. Most likely only the values 19 and 20 are valid for you, unless you are reading a database for archaeological purposes.&lt;/li&gt;&lt;li&gt;The hour field could contain values like 24, 25, 26, ... and the minutes or seconds fields values of 60 and above. What about the 67th month of the year? Catch my drift?&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;One or more of the fields could also be blank. How would you translate this into a useful date? Century: 20 - Year: 3 - Month: null - Day: 15. January 15th 2003? February 15th 2003? Take your pick. Obviously the field shouldn't be null. A plain zero poses the same challenge.&lt;/li&gt;&lt;li&gt;What do you do when the date part is correct, but the time part isn't? Or vice versa?&lt;/li&gt;&lt;li&gt;Sometimes the programmer who wrote the RPG program thought it enough to put in 9 and 0 for the century. I've seen RPG programs where only one digit was dedicated to the century, so what the program writes depends on which RPG specialist happened to pass by. So don't be amazed to find both the values 20 and 0 in the century field, meaning the same thing.&lt;/li&gt;&lt;li&gt;Is 24 a valid hour? 
If so, should you add a day to the date part?&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Once you've gone down the road of handling the date conversion with some JavaScript in PDI, you risk having to modify that JavaScript in a growing number of transformations that use dates. And since dates are pretty common in most applications, you are bound to do conversions in many of your transformations. If a concept such as "record creation date/time" is used in the database design, you'll run into at least one date conversion for every single table you extract.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As I wrote, we (&lt;a href="http://www.kjube.be/"&gt;kJube&lt;/a&gt;) have come across the problem in some projects. We tackled it by writing a custom plugin for PDI, which handles the conversion including all data quality checks. It checks the value ranges and defines a standard way of handling exceptions.&amp;nbsp;The result looks extremely simple.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TFGep2HtBfI/AAAAAAAAAYA/GIagVG7pZc0/s1600/pic140.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="175" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TFGep2HtBfI/AAAAAAAAAYA/GIagVG7pZc0/s400/pic140.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The immediate advantages of the plugin lie in this simplicity:&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Anyone can use this logic without any understanding of coding, not even the simplest JavaScript.&lt;/li&gt;&lt;li&gt;You can even use drop-down lists to select the incoming fields, eliminating the risk of typos.&lt;/li&gt;&lt;li&gt;All date/time conversions will be done in exactly the same way.&lt;/li&gt;&lt;li&gt;Writing date/time conversions with this plugin 
is a time saver. If you need to extract 200 archives (tables) with an average of 2 date fields per archive, you have just saved yourself from writing the same formula 400 times.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Additionally, I'd guess you also gain some performance by doing this conversion in Java rather than JavaScript. Especially with large volumes and complex logic to check and correct the data quality of the dates, that could make a difference.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I believe this to be a clear demonstration of the value of the plugin system that PDI offers for simplifying data integration work. It is this type of feature that lowers project cost as well as system maintenance cost.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For those interested in the plugin (or any extension or modified version of it), don't hesitate to &lt;a href="http://www.kjube.be/"&gt;contact us&lt;/a&gt;. &amp;nbsp;&amp;nbsp;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-5285219069114084001?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/5285219069114084001/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/07/converting-as400-rpg-dates-using-kettle.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/5285219069114084001'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/5285219069114084001'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/07/converting-as400-rpg-dates-using-kettle.html' title='Converting AS/400 (RPG) dates using kettle'/><author><name>Jan 
Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_C0PnWJwDRZY/TFGep2HtBfI/AAAAAAAAAYA/GIagVG7pZc0/s72-c/pic140.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-2163419335762609518</id><published>2010-07-29T13:14:00.006+02:00</published><updated>2010-09-30T21:41:26.397+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration'/><title type='text'>Datastage vs Pentaho popularity</title><content type='html'>Over the last few months we've been working hard on a rip and replace project to migrate a customer from IBM Datastage to Pentaho Data Integration. Hard work, but it was interesting to see that it can be done, even with a business case that shows payback within a year. (More about that later.)&lt;br /&gt;&lt;br /&gt;Anyhow, having found a customer that wants to leave behind Datastage (a solid tool that I've used on multiple projects in the past) and switch to an open source alternative such as Pentaho Data Integration (which continues to gain "followers"), this project made me wonder how both tools compare in popularity.&amp;nbsp;Roland Bouman wrote a &lt;a href="http://rpbouman.blogspot.com/2010/05/mysql-oracle-and-nosql-in-grand-scheme.html"&gt;blog-post&lt;/a&gt;&amp;nbsp;a few days ago comparing Oracle to MySQL (as well as a few other databases) using Google Trends. 
I did the same thing, which turned up these results.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/TFFc3BRlveI/AAAAAAAAAXg/Iy4G3dWUDRE/s1600/datastage.versus.pentaho.popularity.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="177" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/TFFc3BRlveI/AAAAAAAAAXg/Iy4G3dWUDRE/s400/datastage.versus.pentaho.popularity.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;It would seem that Pentaho Data Integration (or rather Pentaho as a whole, because PDI isn't really marketed separately) overtook Datastage in search volume somewhere in late 2007 or early 2008, actually much earlier than I would have thought.&lt;br /&gt;&lt;br /&gt;Since the result surprised me, I went back to Roland's blog and checked out the comments. Many people suggested that there were better statistics.&lt;br /&gt;&lt;br /&gt;So I checked Google Insights for Search with the following query:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TFFg8wYUAdI/AAAAAAAAAXo/i0H0_T6yjxg/s1600/pic137.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="89" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TFFg8wYUAdI/AAAAAAAAAXo/i0H0_T6yjxg/s640/pic137.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The results were even more pronounced:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/TFFhVZZfiCI/AAAAAAAAAXw/v_RuUyhVrss/s1600/pic138.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="232" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/TFFhVZZfiCI/AAAAAAAAAXw/v_RuUyhVrss/s400/pic138.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Finally I checked out &lt;a 
href="http://odata.stackexchange.com/stackoverflow/q/8541/etl-vendors-by-popularity"&gt;StackExchange&lt;/a&gt;, (even including Informatica this time) again with striking results on the popularity of &lt;a href="http://www.pentaho.com/"&gt;Pentaho&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/TFFh7wGQjNI/AAAAAAAAAX4/rAtSE54LPwI/s1600/pic139.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="137" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/TFFh7wGQjNI/AAAAAAAAAX4/rAtSE54LPwI/s400/pic139.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;I guess &lt;a href="http://www.kjube.be/"&gt;kJube&lt;/a&gt; has been on track with the new trends, doing all those Pentaho projects over the last 5 years.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-2163419335762609518?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/2163419335762609518/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/07/datastage-vs-pentaho-popularity_29.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/2163419335762609518'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/2163419335762609518'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/07/datastage-vs-pentaho-popularity_29.html' title='Datastage vs Pentaho popularity'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' 
src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_C0PnWJwDRZY/TFFc3BRlveI/AAAAAAAAAXg/Iy4G3dWUDRE/s72-c/datastage.versus.pentaho.popularity.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-1095840566497870549</id><published>2010-06-11T12:38:00.003+02:00</published><updated>2010-09-30T21:27:51.449+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><title type='text'>New template</title><content type='html'>All,&lt;br /&gt;As you can see, we updated our blog look.&lt;br /&gt;With thanks to Abu Farhan for the nice &lt;a href="http://www.abu-farhan.com/2010/05/accordion-template-for-blogger/"&gt;accordion template&lt;/a&gt;.&lt;br /&gt;Jan&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-1095840566497870549?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/1095840566497870549/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/06/new-template.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/1095840566497870549'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/1095840566497870549'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/06/new-template.html' title='New template'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' 
src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-8338162464279295207</id><published>2010-05-27T18:22:00.004+02:00</published><updated>2010-09-30T21:27:11.893+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence'/><title type='text'>Lower Costs With Open Source BI</title><content type='html'>&lt;div style="border: medium none;"&gt;Industry analyst Mark Madsen just released a very interesting report on the cost of open source business intelligence, based on in-depth research into the license costs of the main BI vendors.&lt;/div&gt;&lt;div style="border: medium none;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="border: medium none; clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/S_6eL3DlMoI/AAAAAAAAAMM/tNAudPpL-vM/s1600/MarkMaddson1.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" gu="true" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/S_6eL3DlMoI/AAAAAAAAAMM/tNAudPpL-vM/s320/MarkMaddson1.PNG" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div align="left" class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="" style="border: medium none; clear: both; text-align: left;"&gt;How about this for cost savings per user?&lt;/div&gt;&lt;div align="left" class="separator" style="border: medium none; clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/S_6eQMVomAI/AAAAAAAAAMU/eN2oCdYlyP8/s1600/MarkMaddson.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" gu="true" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/S_6eQMVomAI/AAAAAAAAAMU/eN2oCdYlyP8/s320/MarkMaddson.PNG" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div align="left" class="separator" style="border: medium none; clear: both; text-align: center;"&gt;&lt;br 
/&gt;&lt;/div&gt;&lt;div class="separator" style="border: medium none; clear: both; text-align: left;"&gt;Full report &lt;a href="http://docs.google.com/fileview?id=1cg03q5uhKIhrpPKXyPjTxbZZAwPY1iA1NL46-6g_d5w4d3Em-sWBibu7FPpX&amp;amp;hl=en_GB"&gt;here&lt;/a&gt;.&lt;/div&gt;&lt;div style="border: medium none;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-8338162464279295207?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/8338162464279295207/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/05/lower-costs-with-open-source-bi.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8338162464279295207'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8338162464279295207'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/05/lower-costs-with-open-source-bi.html' title='Lower Costs With Open Source BI'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_C0PnWJwDRZY/S_6eL3DlMoI/AAAAAAAAAMM/tNAudPpL-vM/s72-c/MarkMaddson1.PNG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-8672540710853696033</id><published>2010-05-26T09:21:00.005+02:00</published><updated>2010-09-30T21:38:08.834+02:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='Data Integration - Kettle'/><title type='text'>kettle evolution</title><content type='html'>kettle has surely evolved since it sprang into existence out of the mind of Matt Casters.&lt;br /&gt;&lt;br /&gt;Just how much, you can see here  :-)&lt;br /&gt;&lt;br /&gt;&lt;object height="505" width="640"&gt;&lt;param name="movie" value="http://www.youtube.com/v/OVPqEY3d70s&amp;hl=en_US&amp;fs=1&amp;rel=0"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/OVPqEY3d70s&amp;hl=en_US&amp;fs=1&amp;rel=0" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="640" height="505"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;Many thanks to all of the contributors. Since I work with kettle on an almost daily basis, it's great to see so much involvement and enthusiasm from so many people.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-8672540710853696033?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/8672540710853696033/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/05/kettle-evolution.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8672540710853696033'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8672540710853696033'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/05/kettle-evolution.html' title='kettle evolution'/><author><name>Jan 
Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-7862023022866246628</id><published>2010-05-25T12:06:00.001+02:00</published><updated>2010-06-30T16:02:03.570+02:00</updated><title type='text'>Motivational economics</title><content type='html'>A tale about motivation and open source.&lt;br /&gt;&lt;br /&gt;&lt;object height="505" width="853"&gt;&lt;param name="movie" value="http://www.youtube.com/v/u6XAPnuFjJc&amp;hl=en_US&amp;fs=1&amp;rel=0&amp;color1=0x402061&amp;color2=0x9461ca"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/u6XAPnuFjJc&amp;hl=en_US&amp;fs=1&amp;rel=0&amp;color1=0x402061&amp;color2=0x9461ca" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="853" height="505"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-7862023022866246628?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/7862023022866246628/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/05/motivational-economics.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7862023022866246628'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7862023022866246628'/><link 
rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/05/motivational-economics.html' title='Motivational economics'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-1311188902450233841</id><published>2010-05-25T10:28:00.002+02:00</published><updated>2010-05-25T11:51:23.946+02:00</updated><title type='text'>my Dantesque travel through Roma Termini Metro Hell</title><content type='html'>&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;Due to works on the &lt;a href="http://en.wikipedia.org/wiki/Roma_Termini_railway_station"&gt;Termini metro station&lt;/a&gt;, travellers have to confront a great many stairs (at least 300 by my count, none of them escalators) to reach the A line. Since there are only 2 (!!) metro lines in Rome, A and B, a quick calculation tells me that 50% of Rome's metro has become hostile territory for the physically disabled, the elderly, mothers with children, etc.&lt;br /&gt;&lt;br /&gt;For the young, healthy gym members among us it means a lot of sweating and cursing, that is, if you are not used to a 30 degrees Celsius, damp, poorly ventilated and overcrowded metro. And to my knowledge only Romans fall into the category of people who actually are used to this kind of torture.&lt;br /&gt;&lt;br /&gt;Your investment in public transport is naturally prolonged by the general popularity (read: lack of affordable alternatives) of this very Italian contemporary adaptation of &lt;a href="http://en.wikipedia.org/wiki/Divine_Comedy"&gt;Dante's Inferno&lt;/a&gt;. 
There are at least 3 or 4 circles of the damned to pass before your gentle guide allows you to enter the vehicle.&lt;br /&gt;&lt;br /&gt;&lt;div class="mobile-photo" style="text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/S_uLD2DNm8I/AAAAAAAAAL8/4VRr3rpTtxY/s1600/RomaTERMINImetroHELL-735238.jpg"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5475122670163827650" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/S_uLD2DNm8I/AAAAAAAAAL8/4VRr3rpTtxY/s320/RomaTERMINImetroHELL-735238.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;With the summer coming and temperatures rising the experience will become all the more dazzling.&lt;br /&gt;I cannot wait to soak up more Roman culture. Stay tuned for more ... unless someone pushes me onto the rails in the next few minutes.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div style="text-align: center;"&gt;Nel mezzo del cammin verso lavoro &lt;/div&gt;&lt;div style="text-align: center;"&gt;mi ritrovai in una metropolitina scura&lt;/div&gt;&lt;/blockquote&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-1311188902450233841?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/1311188902450233841/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/05/my-dantesque-travel-through-roma.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/1311188902450233841'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/1311188902450233841'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/05/my-dantesque-travel-through-roma.html' title='my Dantesque travel through Roma Termini Metro Hell'/><author><name>Jan 
Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_C0PnWJwDRZY/S_uLD2DNm8I/AAAAAAAAAL8/4VRr3rpTtxY/s72-c/RomaTERMINImetroHELL-735238.jpg' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-8841742297353733214</id><published>2010-05-24T06:35:00.002+02:00</published><updated>2010-09-30T21:27:29.522+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - Kettle'/><title type='text'>kettle vs BODI - 'out of the box' performance comparison</title><content type='html'>I was cleaning up &lt;a href="http://www.pclaunches.com/entry_images/1107/27/dell_xps-m1530-1-thumb-450x313.jpg"&gt;my laptop&lt;/a&gt; this weekend and ran into a forgotten file with a quick performance comparison between &lt;a href="http://kettle.pentaho.org/"&gt;PDI&lt;/a&gt; and &lt;a href="http://www.sap.com/solutions/sapbusinessobjects/information-management/data-integration/dataintegrator/index.epx"&gt;BODI&lt;/a&gt; I did for a customer.&amp;nbsp; Now if I say "performance comparison", please don't think about a laboratory like test with fully documented results and full control over all variables. On the contrary, our approach to this performance comparison was extremely lean, for the simple reason that we were doing a PDI POC on functionality, not on performance. 
So the performance test was something we were allowed to spend at most 2 hours on.&lt;br /&gt;&lt;br /&gt;Anyway, the set-up was the following:&lt;br /&gt;1) PDI and BODI installed on the same machine&lt;br /&gt;2) Reading/writing from/to the same database server&lt;br /&gt;3) Take 3 existing (simple) BODI jobs and convert them (without thinking) into PDI jobs&lt;br /&gt;&lt;br /&gt;I guess 1) and 2) don't need much comment. Running both tools on the same machine makes the test results reasonably comparable; if that doesn't, what does? And since we were reading/writing data from/to the same database server, I believe we largely excluded network or I/O differences from the comparison. Point 3) deserves a quick word.&lt;br /&gt;&lt;br /&gt;We wanted to work on simple everyday jobs without spending time on them, because that is what a real-world scenario looks like. Most ETL developers I know just grab the ETL tool and start bashing away. Many of them haven't really mastered all the tricks of performance tuning. So if you are looking for a tool that performs well, I guess what you mean is a tool that performs well 'out of the box', in a scenario where no product expert is invited to spend 3 days fine-tuning your code and infrastructure. Depending on your needs, you might agree or not, but that was our &lt;a href="http://kettle.pentaho.org/"&gt;philosophy&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Although the executed code doesn't matter much, here is a bit of background on the type of jobs we ran.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Job/Transformation 1: Read 20 million rows, split the stream in 2, sort each stream on different fields, count the number of resulting records in both streams and write the output (+/- 20 lines) to an output table. 
&lt;/li&gt;&lt;li&gt;Job/Transformation 2: Read 20 million rows, perform an in-memory lookup for one of the columns against a table with approximately 10,000 rows and write the results to a table.&lt;/li&gt;&lt;li&gt;Job/Transformation 3: Read 20 million rows, denormalize them and write to disk&lt;/li&gt;&lt;/ul&gt;Anyway these were the results.&lt;br /&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;table border="1" style="margin-left: auto; margin-right: auto; text-align: left;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Transformation&lt;/td&gt;     &lt;td&gt;BODI (sec)&lt;/td&gt;     &lt;td&gt;PDI (sec)&lt;/td&gt;     &lt;td&gt;Difference&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transformation 1&lt;/td&gt;     &lt;td&gt;4260&lt;/td&gt;     &lt;td&gt;1501&lt;/td&gt;     &lt;td&gt;184% faster&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transformation 2&lt;/td&gt;     &lt;td&gt;1563&lt;/td&gt;     &lt;td&gt;1035&lt;/td&gt;     &lt;td&gt;51% faster&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transformation 3&lt;/td&gt;     &lt;td&gt;5048&lt;/td&gt;     &lt;td&gt;1054&lt;/td&gt;     &lt;td&gt;379% faster&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;Or in other words, even in the "worst case" PDI was 50% quicker than Business Objects Data Integrator. 
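&lt;br /&gt;&lt;br /&gt;The "Difference" column can be reproduced from the two timing columns. What follows is a minimal Python sketch, assuming that "X% faster" means (BODI time / PDI time - 1) x 100, rounded to a whole percent; the helper name percent_faster is purely illustrative and was not part of the original test.&lt;br /&gt;&lt;pre&gt;
```python
# Timings in seconds, copied from the results table.
timings = {
    "Transformation 1": (4260, 1501),
    "Transformation 2": (1563, 1035),
    "Transformation 3": (5048, 1054),
}


def percent_faster(bodi_seconds, pdi_seconds):
    """Return how much faster PDI was, as a whole percentage.

    Assumption: "X% faster" means (BODI time / PDI time - 1) * 100.
    """
    return round((bodi_seconds / pdi_seconds - 1) * 100)


# Recompute the last column of the table for every transformation.
results = {name: percent_faster(bodi, pdi) for name, (bodi, pdi) in timings.items()}

for name, pct in results.items():
    print(f"{name}: PDI {pct}% faster")
```
&lt;/pre&gt;Under that reading the three rows come out at 184%, 51% and 379%, which matches the table; put differently, PDI needed roughly 35%, 66% and 21% of the BODI run time.&lt;br /&gt;&lt;br /&gt;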
All of this was measured out of the box, without any tuning.&lt;br /&gt;&lt;br /&gt;Want more information: &lt;a href="http://www.kjube.be/"&gt;contact kJube&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-8841742297353733214?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/8841742297353733214/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/05/kettle-vs-bodi-out-of-box-performance.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8841742297353733214'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8841742297353733214'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/05/kettle-vs-bodi-out-of-box-performance.html' title='kettle vs BODI - &apos;out of the box&apos; performance comparison'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-6938136991534537376</id><published>2010-05-22T11:41:00.005+02:00</published><updated>2010-09-30T21:37:53.747+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - Kettle'/><title type='text'>Mass editing kettle transformations</title><content type='html'>Hi,&lt;br /&gt;&lt;br /&gt;Quickly wanted to share this. 
It ain't rocket science, but it's pretty useful.&lt;br /&gt;&lt;br /&gt;This morning, I quickly needed to point 50 transformations at a new server. Since I hadn't parameterized the hostname of the server, I found myself with 50 transformations containing this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/S_eRAb5zNrI/AAAAAAAAALk/oL_T1LtJ-4E/s1600/pic123.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="92" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/S_eRAb5zNrI/AAAAAAAAALk/oL_T1LtJ-4E/s320/pic123.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;rather than this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/S_eR3lB-XxI/AAAAAAAAAL0/-VaXz2sr7dk/s1600/pic125.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/S_eR3lB-XxI/AAAAAAAAAL0/-VaXz2sr7dk/s320/pic125.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;br /&gt;Luckily, all Kettle transformations are human-readable, XML-formatted text files. 
No proprietary format like some commercial vendors prefer, but plain text.&lt;br /&gt;&lt;br /&gt;So this little command solved my problems in milliseconds.&lt;br /&gt;&lt;br /&gt;&lt;div style="float: right; padding: 1em;"&gt;&lt;/div&gt;&lt;blockquote style="color: #4c1130;"&gt;&lt;code&gt;perl -pi -w -e 's/10\.89\.0\.191/\$\{HOSTNAME\}/g;' *.ktr&lt;/code&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;blockquote&gt;&lt;div style="color: #4c1130;"&gt;&lt;code&gt; -e means execute the following line of code.&lt;/code&gt;&lt;/div&gt;&lt;div style="color: #4c1130;"&gt;&lt;code&gt; -i means edit the files in place&lt;/code&gt;&lt;/div&gt;&lt;div style="color: #4c1130;"&gt;&lt;code&gt; -w means enable warnings&lt;/code&gt;&lt;/div&gt;&lt;div style="color: #4c1130;"&gt;&lt;code&gt; -p means loop over every input line and print it&lt;/code&gt;&lt;/div&gt;&lt;div style="color: black;"&gt;&lt;code&gt; &lt;/code&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;/blockquote&gt;&lt;div style="color: black;"&gt;&lt;br /&gt;The next thing to do is make sure that the HOSTNAME variable is defined in the kettle.properties file, and all worries are over.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;a href="http://www.liamdelahunty.com/tips/linux_search_and_replace_multiple_files.php"&gt;With thanks to ...&lt;/a&gt; &lt;/code&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-6938136991534537376?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/6938136991534537376/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/05/mass-editing-kettle-transformations.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6938136991534537376'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/22251486/posts/default/6938136991534537376'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/05/mass-editing-kettle-transformations.html' title='Mass editing kettle transformations'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_C0PnWJwDRZY/S_eRAb5zNrI/AAAAAAAAALk/oL_T1LtJ-4E/s72-c/pic123.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-2640153503180073567</id><published>2010-05-21T21:42:00.049+02:00</published><updated>2010-09-30T21:37:25.489+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - Kettle'/><title type='text'>big blue is watching you</title><content type='html'>I often mix business and pleasure trips. This weekend I had to visit my parents-in-law in Rome, and I took the opportunity to drop by one of our customers.&lt;br /&gt;&lt;br /&gt;Now this customer has asked us to migrate 300+ ETL jobs written in &lt;a href="http://en.wikipedia.org/wiki/IBM_InfoSphere_DataStage"&gt;IBM DataStage&lt;/a&gt; (the product IBM bought from Ascential, remember) to Pentaho Data Integration. 
In other words, the customer didn't want to pay a &lt;a href="https://www-112.ibm.com/software/howtobuy/buyingtools/paexpress/Express?P0=E1&amp;amp;part_number=D5982LL,D5988LL,D5987LL,D599ULL,D598CLL,D5975LL,D597FLL,D598ILL,D598PLL,D5978LL,D59AXLL,D597NLL,D59PDLL,D59PQLL,D59PSLL,D59PFLL,D59PULL,D59PELL,D03TVLL,D03TTLL,D03SPLL,D03SYLL,D03SGLL,D03U6LL,D03T8LL,D03SQLL,D03U5LL,D03P8LL,D03PFLL,D03PHLL,D03P9LL,D03PNLL,D03SELL,D03PDLL&amp;amp;catalogLocale=en_US&amp;amp;locale=en_US&amp;amp;country=USA&amp;amp;PT=html"&gt;&amp;gt;100k€ license cost&lt;/a&gt; if they could get the same (or more :-), if you ask me ) functionality for free (or at &lt;a href="http://kjube.blogspot.com/2010/03/stop-getting-burned.html"&gt;10% of the cost&lt;/a&gt; when adding the &lt;a href="http://www.pentaho.com/"&gt;Pentaho&lt;/a&gt; support subscription) from open source ETL software.&lt;br /&gt;&lt;br /&gt;So I'm on a late flight, with a hotel booked somewhere between the airport and our customer. What's the first thing I see when I enter the &lt;a href="http://www.marriott.com/hotels/travel/romau-rome-marriott-park-hotel/"&gt;Marriott hotel&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;div class="mobile-photo" style="text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/S_biSt-so4I/AAAAAAAAALU/xcaLaBaNS6k/s1600/BigBlue01-786704.jpg"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5473811208323507074" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/S_biSt-so4I/AAAAAAAAALU/xcaLaBaNS6k/s320/BigBlue01-786704.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: left;"&gt;Now can someone tell me what the chances are that someone who's on his way to discuss the replacement of IBM DataStage for a customer ends up in the middle of a &lt;a href="http://www-01.ibm.com/software/uk/data/conf/"&gt;BI convention of IBM&lt;/a&gt;, full of DataStage experts and nerds?&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="mobile-photo" 
style="text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/S_biS247fGI/AAAAAAAAALc/dNCzj5xA4PM/s1600/BigBlue02-787938.jpg"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5473811210715233378" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/S_biS247fGI/AAAAAAAAALc/dNCzj5xA4PM/s320/BigBlue02-787938.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;For a moment I thought they were on to us. I mean, really, what are the chances?? But then again, we all know Big Blue isn't watching. They are spending most of their BI revenue on expensive sales conferences to keep those customers hooked.&lt;br /&gt;&lt;br /&gt;A pity I arrived late and was tired; I might have gained a few customers :-)))&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-2640153503180073567?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/2640153503180073567/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/05/big-blue-is-watching-you.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/2640153503180073567'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/2640153503180073567'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/05/big-blue-is-watching-you.html' title='big blue is watching you'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://3.bp.blogspot.com/_C0PnWJwDRZY/S_biSt-so4I/AAAAAAAAALU/xcaLaBaNS6k/s72-c/BigBlue01-786704.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-4190867909179170660</id><published>2010-05-12T02:06:00.004+02:00</published><updated>2010-09-30T21:38:29.019+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - Kettle'/><title type='text'>And that's why they call it open source</title><content type='html'>A friend asked me why, when he launched PDI (Spoon), he could actually see the code running on his machine :-)&amp;nbsp;&amp;nbsp; If you run ps -aux on your machine while running Spoon, you do indeed see an impressive list of jar files that are being used.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/S-nwze9ritI/AAAAAAAAALM/-tKUvtPyBc0/s1600/pic112.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="513" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/S-nwze9ritI/AAAAAAAAALM/-tKUvtPyBc0/s640/pic112.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Not yet the full source code, but still ...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-4190867909179170660?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/4190867909179170660/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/05/and-thats-why-call-it-open-source.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/4190867909179170660'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/22251486/posts/default/4190867909179170660'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/05/and-thats-why-call-it-open-source.html' title='And that&apos;s why they call it open source'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_C0PnWJwDRZY/S-nwze9ritI/AAAAAAAAALM/-tKUvtPyBc0/s72-c/pic112.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-6565235127594187160</id><published>2010-05-11T23:04:00.002+02:00</published><updated>2010-09-30T21:34:46.413+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence'/><title type='text'>The pen - the ultimate report designer</title><content type='html'>What we IT guys with all our fancy tools and software tend to forget, is that sometimes the pen is mightier than the laptop. 
When you are meeting with a customer and trying to capture those requirements for a report or dashboard he wants to see, the very best tool to work with remains the pen.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/S33UCwS2kEI/AAAAAAAAAIo/gTPdhf-77fQ/s1600-h/onenote002.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="640" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/S33UCwS2kEI/AAAAAAAAAIo/gTPdhf-77fQ/s640/onenote002.jpg" width="491" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Of course, back in the days when I owned a nice Fujitsu tablet PC, most customers were extremely impressed when I was just drawing/drafting their dashboard on my tablet, hooked up to the projector.&lt;br /&gt;&lt;br /&gt;But since I moved over to Linux some years ago, I've left the tablet experience for what it is and gone back to paper and pencil. And believe me, it works just as well. &lt;br /&gt;&lt;br /&gt;So do yourself a favour. 
If you need to do an analysis to design some reports and dashboards for your customers, go out and make that investment in pencils.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/S33UZyFNWMI/AAAAAAAAAIw/Bbi8ScGQSVE/s1600-h/kleurdoos.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/S33UZyFNWMI/AAAAAAAAAIw/Bbi8ScGQSVE/s320/kleurdoos.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-6565235127594187160?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/6565235127594187160/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/05/pen-ultimate-report-designer.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6565235127594187160'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6565235127594187160'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/05/pen-ultimate-report-designer.html' title='The pen - the ultimate report designer'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_C0PnWJwDRZY/S33UCwS2kEI/AAAAAAAAAIo/gTPdhf-77fQ/s72-c/onenote002.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-8592373258833508023</id><published>2010-04-15T23:09:00.004+02:00</published><updated>2010-09-30T21:38:57.042+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - Kettle'/><title type='text'>.spoonrc</title><content type='html'>Yesterday I resized the left window pane of spoon in such a way I couldn't enlarge it anymore.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/S8TeGlp-meI/AAAAAAAAAK4/aUoGZALf8Pg/s1600/Picture+4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="238" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/S8TeGlp-meI/AAAAAAAAAK4/aUoGZALf8Pg/s400/Picture+4.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Extremely annoying. Adding a new step becomes an excruciating experience.&lt;br /&gt;&lt;br /&gt;But the solution was simple. 
Just remove .spoonrc in the .kettle directory and all your settings are back to basics.&lt;br /&gt;&lt;br /&gt;It proved to be a &lt;a href="http://jira.pentaho.com/browse/PDI-2149"&gt;known bug&lt;/a&gt; too.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-8592373258833508023?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='related' href='http://jira.pentaho.com/browse/PDI-2149' title='.spoonrc'/><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/8592373258833508023/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/04/spoonrc.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8592373258833508023'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8592373258833508023'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/04/spoonrc.html' title='.spoonrc'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_C0PnWJwDRZY/S8TeGlp-meI/AAAAAAAAAK4/aUoGZALf8Pg/s72-c/Picture+4.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-6588052364379958082</id><published>2010-04-14T00:58:00.001+02:00</published><updated>2010-09-30T21:37:40.828+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><title type='text'>ORA-12155: TNS:received bad datatype in NSWMARKER 
packet</title><content type='html'>&lt;table align="center" border="0" cellpadding="5"&gt;&lt;tbody&gt;&lt;tr valign="top"&gt;&lt;td nowrap="nowrap"&gt;You just have to love Oracle error messages. Actually, they sometimes make me think of fortune cookies. You never know what you'll find.&lt;/td&gt;&lt;td nowrap="nowrap"&gt;&lt;br /&gt;&lt;/td&gt;&lt;td nowrap="nowrap"&gt;&lt;br /&gt;&lt;/td&gt;&lt;td nowrap="nowrap"&gt;&lt;br /&gt;&lt;/td&gt;&lt;td nowrap="nowrap"&gt;&lt;br /&gt;&lt;/td&gt;  &lt;td style="font-size: 16px;"&gt;&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr valign="top"&gt;  &lt;td style="font-size: 18px;"&gt;&lt;/td&gt;  &lt;td style="font-size: 16px; line-height: 1.5;"&gt;&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr valign="top"&gt;  &lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;blockquote&gt;&lt;blockquote&gt;ORA-12155: TNS:received bad datatype in NSWMARKER packet&lt;/blockquote&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;blockquote&gt;Cause: Internal error during break handling.&lt;/blockquote&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;blockquote&gt;Action: Not normally visible to the user. For further details, turn on tracing and reexecute the operation. 
If error persists, contact Worldwide Customer Support.&lt;/blockquote&gt;&lt;/blockquote&gt;&lt;table align="center" border="0" cellpadding="5"&gt;&lt;tbody&gt;&lt;tr valign="top"&gt;  &lt;td style="font-size: 16px; line-height: 1.5;"&gt;Mmmmmmmmmm&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-6588052364379958082?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/6588052364379958082/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/04/ora-12155-tnsreceived-bad-datatype-in.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6588052364379958082'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6588052364379958082'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/04/ora-12155-tnsreceived-bad-datatype-in.html' title='ORA-12155: TNS:received bad datatype in NSWMARKER packet'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-3346915310801548429</id><published>2010-04-13T22:20:00.000+02:00</published><updated>2010-04-13T22:20:37.328+02:00</updated><title type='text'>I'll miss the intgeration</title><content type='html'>PDI 4.0 is reaching GA status soon according to the posts on the Pentaho forums.&lt;br /&gt;&lt;blockquote&gt;Around the second or third week of 
April we'll release our release  candidate and the stable release of PDI is scheduled to hit the market  in mid-may.&lt;/blockquote&gt;That means I'll upgrade my machine and I'll have to miss the mythical 3.2.3 Intgeration splash screen that has been amusing me for the last few months.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/S8TR2ON5O_I/AAAAAAAAAKw/zyVFk1o_vjU/s1600/Picture+8.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="278" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/S8TR2ON5O_I/AAAAAAAAAKw/zyVFk1o_vjU/s400/Picture+8.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Or should I post a jira to keep this feature?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-3346915310801548429?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='related' href='http://forums.pentaho.org/showthread.php?t=75371' title='I&apos;ll miss the intgeration'/><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/3346915310801548429/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/04/ill-miss-intgeration.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3346915310801548429'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3346915310801548429'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/04/ill-miss-intgeration.html' title='I&apos;ll miss the intgeration'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' 
src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_C0PnWJwDRZY/S8TR2ON5O_I/AAAAAAAAAKw/zyVFk1o_vjU/s72-c/Picture+8.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-3044170683168937540</id><published>2010-04-06T01:10:00.002+02:00</published><updated>2010-09-30T21:39:11.121+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - Kettle'/><title type='text'>Merge step - watch out for database sort</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;The merge step in Kettle provides great functionality for identifying new, changed, identical and deleted records when comparing 2 different tables. 
This allows you to split your incoming data in such a way that you can handle inserts, updates and (logical) deletes in the most efficient way: you can filter out all 'identical' records and so avoid the extra round trips to the database that an insert/update would create.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;I used the step in an ETL flow where I needed to compare information from an Oracle and a DB2/400 database. The DB2/400 database holds the original data, while I keep a copy of that same data in our operational datastore on Oracle. So in order to find what has changed since my last batch load, I compare both tables, cross platform, using the merge step. &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;img border="0" height="259" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/S7pqqxD3oEI/AAAAAAAAAKY/R9hl9Tc_gt4/s640/Picture+3.png" width="640" /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;When testing the transformation, however, I noticed some funny behaviour. Even though both data flows feeding the merge step were sorted on the same fields (a requirement for the merge step to work), the refresh of the target table still didn't run correctly.&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;After some digging I figured out that the two databases used different sort settings, as shown below. The DB2 on AS/400 sorted character fields differently than the Oracle database. 
Now of course I could have done the sort in the ETL tool, which would have solved my sort problems, but then again sorting and aggregating are better left to the database(s). They are just better at that.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;img border="0" height="400" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/S7pq2bXTbKI/AAAAAAAAAKg/wyoH_bmj1VA/s400/Picture+1.png" width="252" /&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Still, it was a little 'bug' that took me some time to figure out. Once I figured out the issue, the solution was easy: I added the NLS_SORT parameter to the Oracle connection, which set the sort settings on Oracle to the same as on the AS/400, at least for the duration of that session. Enough to solve my problem.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/S7prC-R4MMI/AAAAAAAAAKo/scpbfPoFIF0/s1600/Picture+2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/S7prC-R4MMI/AAAAAAAAAKo/scpbfPoFIF0/s320/Picture+2.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;This allowed me to keep the sort on the database (more efficient than having the ETL tool solve that) and still have correct results. 
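&lt;br /&gt;&lt;br /&gt;Why a sort mismatch trips up a sorted comparison can be shown in a few lines. This is a minimal Python sketch, assuming (the post itself does not say so) that the difference resembles the classic binary EBCDIC versus ASCII ordering; the sample keys, the cp037 code page and the helper names ebcdic_key and is_sorted are illustrative only and do not come from the actual databases or from Kettle.&lt;br /&gt;&lt;pre&gt;
```python
# Hypothetical business keys; not taken from the real tables.
keys = ["apple", "Banana", "42-X", "cherry"]


def ebcdic_key(value):
    """Sort key mimicking a binary EBCDIC ordering (code page 037 is an assumption)."""
    return value.encode("cp037")


def is_sorted(values, key=None):
    """Return True when values are already in ascending order for the given key."""
    return list(values) == sorted(values, key=key)


# Binary ASCII order: digits, then uppercase, then lowercase.
ascii_order = sorted(keys)
# Binary EBCDIC order: lowercase, then uppercase, then digits.
ebcdic_order = sorted(keys, key=ebcdic_key)

print("ASCII :", ascii_order)
print("EBCDIC:", ebcdic_order)
# A comparison that expects ASCII order sees the EBCDIC stream as unsorted.
print("EBCDIC stream sorted by ASCII rules?", is_sorted(ebcdic_order))
```
&lt;/pre&gt;Both lists hold the same four keys, yet they arrive in a different order, so any step that walks two streams side by side and expects one common ordering can no longer line the rows up. Making both databases sort the same way, as the NLS_SORT setting did here, removes that mismatch at the source.&lt;br /&gt;&lt;br /&gt;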
So for those of you who use this functionality cross database: check your sort results carefully before using the merge step!&lt;br /&gt;&lt;br /&gt;J&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-3044170683168937540?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='related' href='http://kettle.pentaho.org' title='Merge step - watch out for database sort'/><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/3044170683168937540/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/04/merge-step-watch-out-for-database-sort.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3044170683168937540'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/3044170683168937540'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/04/merge-step-watch-out-for-database-sort.html' title='Merge step - watch out for database sort'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_C0PnWJwDRZY/S7pqqxD3oEI/AAAAAAAAAKY/R9hl9Tc_gt4/s72-c/Picture+3.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-7487839185056792682</id><published>2010-03-13T10:14:00.002+01:00</published><updated>2010-09-30T21:35:45.867+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><title 
type='text'>bye bye fly</title><content type='html'>&lt;div class="mobile-photo"&gt;&lt;a href="http://1.bp.blogspot.com/_C0PnWJwDRZY/S5tXo_6cWoI/AAAAAAAAAKQ/uYEHBYHZk6w/s1600-h/CustomerDIB%28016%29-715423.jpg"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5448044536097692290" src="http://1.bp.blogspot.com/_C0PnWJwDRZY/S5tXo_6cWoI/AAAAAAAAAKQ/uYEHBYHZk6w/s320/CustomerDIB%28016%29-715423.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;Finally plane commuting ends. After 4 months of flying up and down between bxl and fco I'm on 'my last flight'. The flight's delayed of course, in keeping with great Brussels airline tradition, but no one is talking about cancellation so far. &lt;br /&gt;I'm looking forward to being with my wife and daughter every day again. &lt;br /&gt;It was nice flying but trop is te veel en te veel is trop (too much is too much).&lt;br /&gt;J&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-7487839185056792682?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/7487839185056792682/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/03/bye-bye-fly.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7487839185056792682'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7487839185056792682'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/03/bye-bye-fly.html' title='bye bye fly'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' 
src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_C0PnWJwDRZY/S5tXo_6cWoI/AAAAAAAAAKQ/uYEHBYHZk6w/s72-c/CustomerDIB%28016%29-715423.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-6148389928682922488</id><published>2010-03-11T02:02:00.001+01:00</published><updated>2010-09-30T21:35:59.279+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><title type='text'>kubuntu kde globe wallpaper</title><content type='html'>Hi,&lt;br /&gt;&lt;br /&gt;While I was cleaning up the desktop of my laptop, I noticed (toying with the screensaver) that under wallpapers the option 'globe' had appeared.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/S5g__XLclGI/AAAAAAAAAKA/nqmE215jW5Y/s1600-h/pic78.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="301" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/S5g__XLclGI/AAAAAAAAAKA/nqmE215jW5Y/s400/pic78.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&amp;nbsp; &lt;br /&gt;&lt;br /&gt;After enabling it, I noticed that a Google Earth-like map had appeared as my desktop wallpaper. Not just an image, but a 'live' view (or simulated live view) of the planet. 
Cool.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/S5hAcemTwsI/AAAAAAAAAKI/i5yYL-XYyxo/s1600-h/pic79.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/S5hAcemTwsI/AAAAAAAAAKI/i5yYL-XYyxo/s640/pic79.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;So now, while I'm hacking my way through the night, I can actually follow where on the planet it is night, and where it is daytime. Maybe this will help me to get back to a normal day rhythm?&lt;br /&gt;&lt;br /&gt;Anyway, it is tremendously pleasant to stumble upon such nice features in kde. I know it's eye candy. But it's very well done, for free and 'just there', without need for installing. I'm not entirely sure how easily a Windows 7 user can get something similar. But then again, I haven't touched Windows for some time :-)&lt;br /&gt;&lt;br /&gt;J&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-6148389928682922488?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='related' href='http://www.kde.org/' title='kubuntu kde globe wallpaper'/><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/6148389928682922488/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/03/kubuntu-kde-globe-wallpaper.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6148389928682922488'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/6148389928682922488'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/03/kubuntu-kde-globe-wallpaper.html' title='kubuntu kde globe wallpaper'/><author><name>Jan 
Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_C0PnWJwDRZY/S5g__XLclGI/AAAAAAAAAKA/nqmE215jW5Y/s72-c/pic78.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-7675820219795269155</id><published>2010-03-10T23:08:00.006+01:00</published><updated>2010-09-30T21:36:34.832+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence - Pentaho'/><title type='text'>Stop getting burned</title><content type='html'>I guess this ad has been out there&amp;nbsp;for some time&amp;nbsp;already, but still, for those who haven't seen it yet and want to have a laugh, check out &lt;a href="http://www.stopgettingburned.com/"&gt;Pentaho's pop-art marketing&lt;/a&gt; on&amp;nbsp;commercial BI vendors. 
Those guys at Pentaho must have a lot of fun&amp;nbsp; :-)&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/S5ULOFCDL8I/AAAAAAAAAJo/rr2aX1Spybc/s1600-h/stop.getting.burned.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="640" kt="true" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/S5ULOFCDL8I/AAAAAAAAAJo/rr2aX1Spybc/s640/stop.getting.burned.PNG" width="596" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;J&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-7675820219795269155?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='related' href='http://www.stopgettingburned.com/' title='Stop getting burned'/><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/7675820219795269155/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/03/stop-getting-burned.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7675820219795269155'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7675820219795269155'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/03/stop-getting-burned.html' title='Stop getting burned'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' 
src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_C0PnWJwDRZY/S5ULOFCDL8I/AAAAAAAAAJo/rr2aX1Spybc/s72-c/stop.getting.burned.PNG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-8242707134179437578</id><published>2010-03-10T12:41:00.003+01:00</published><updated>2010-09-30T21:36:16.995+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun and fail'/><title type='text'>sudo to the rescue</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_C0PnWJwDRZY/S5eFOdX4quI/AAAAAAAAAJ4/9y8QkX0PwUU/s1600-h/sandwich.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://3.bp.blogspot.com/_C0PnWJwDRZY/S5eFOdX4quI/AAAAAAAAAJ4/9y8QkX0PwUU/s400/sandwich.png" vt="true" width="373" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-8242707134179437578?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/8242707134179437578/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/03/sudo-to-rescue.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8242707134179437578'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8242707134179437578'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/03/sudo-to-rescue.html' title='sudo to the rescue'/><author><name>Jan 
Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_C0PnWJwDRZY/S5eFOdX4quI/AAAAAAAAAJ4/9y8QkX0PwUU/s72-c/sandwich.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-5894387575238660741</id><published>2010-03-08T15:26:00.005+01:00</published><updated>2010-09-30T21:40:13.728+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - Kettle'/><title type='text'>user defined command line parameters in the kitchen</title><content type='html'>Hello,&lt;br /&gt;&lt;br /&gt;Just recently I stumbled upon a way to pass custom command line parameters to PDI jobs that are executed through kitchen. Somehow, this isn't documented &lt;a href="http://wiki.pentaho.com/display/EAI/Kitchen+User+Documentation"&gt;in the user documentation&amp;nbsp;&lt;/a&gt;, so I thought a few lines were in order.&lt;br /&gt;&lt;br /&gt;According to the Pentaho documentation, kitchen supports the following&amp;nbsp;series of command line parameters:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;file: xml file (job) to execute&lt;/li&gt;&lt;li&gt;log: log file&lt;/li&gt;&lt;li&gt;level: logging level (Nothing, Error, Minimal, Basic, ... 
Rowlevel)&lt;/li&gt;&lt;li&gt;rep: repository name&lt;/li&gt;&lt;li&gt;user: user to log in to repository&lt;/li&gt;&lt;li&gt;pass: password to log in to repository&lt;/li&gt;&lt;li&gt;listdir: list directories in repository&lt;/li&gt;&lt;li&gt;listjobs: list repository jobs&lt;/li&gt;&lt;li&gt;listrep: list available repositories&lt;/li&gt;&lt;li&gt;norep: don't log into repository&lt;/li&gt;&lt;li&gt;dir: set repository directory&lt;/li&gt;&lt;li&gt;job: repository job to run&lt;/li&gt;&lt;li&gt;export: file to export a job/trf to&amp;nbsp;(including&amp;nbsp;all related jobs/trf)&lt;/li&gt;&lt;/ul&gt;These are, however, all standard parameters. To pass your own parameter into the job you are launching, I&amp;nbsp;thought I was down to positional arguments; however, that turned out not to be true.&lt;br /&gt;&lt;br /&gt;An&amp;nbsp;undocumented parameter -&amp;nbsp;at least it wasn't&amp;nbsp;in any user documentation&amp;nbsp;I was able to find -&amp;nbsp;is the parameter called param, used to pass named parameters.&amp;nbsp;E.g. you&amp;nbsp;could&amp;nbsp;specify the following when executing a job using kitchen:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"&gt;sh kitchen.sh -file=/daily_run.kjb&amp;nbsp;-level=Basic &lt;b&gt;&lt;i&gt;-param:AFFILIATE=affiliate01&lt;/i&gt;&lt;/b&gt; -logfile=/daily_run.log&lt;/span&gt;&lt;/blockquote&gt;&lt;br /&gt;This would pass the value "affiliate01" to the variable AFFILIATE in the job "daily_run.kjb". 
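&lt;br /&gt;&lt;br /&gt;If you launch kitchen from a script, the same command line can be assembled programmatically. The Python sketch below is only an illustration: the helper name kitchen_command is hypothetical, and repeating -param once per named parameter is an assumption, so check it against your PDI version.&lt;br /&gt;&lt;br /&gt;

```python
import shlex


def kitchen_command(job_file, params, level="Basic", logfile=None):
    """Build the argument list for a kitchen.sh call (hypothetical helper)."""
    args = ["sh", "kitchen.sh", "-file=" + job_file, "-level=" + level]
    # One -param:NAME=value option per named parameter (assumed to be repeatable).
    for name, value in params.items():
        args.append("-param:" + name + "=" + value)
    if logfile is not None:
        args.append("-logfile=" + logfile)
    return args


cmd = kitchen_command(
    "/daily_run.kjb",
    {"AFFILIATE": "affiliate01"},
    logfile="/daily_run.log",
)
# Print the command as it would be typed in a shell; nothing is executed here.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

&lt;br /&gt;&lt;br /&gt;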
For this to work the job needs to have the input parameter defined in the parameters tab of the job properties.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/S5VWlxs1ZxI/AAAAAAAAAJw/UJ2pwTH6bUM/s1600-h/pic73.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="191" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/S5VWlxs1ZxI/AAAAAAAAAJw/UJ2pwTH6bUM/s400/pic73.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Simple as that.&lt;br /&gt;&lt;br /&gt;Cheers,&lt;br /&gt;&lt;br /&gt;J.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Some links that could have brought you there:&lt;br /&gt;- &lt;a href="http://wiki.pentaho.com/display/EAI/Kitchen+User+Documentation"&gt;Kitchen user documentation&lt;/a&gt;&lt;br /&gt;- Pentaho forum: "&lt;a href="http://forums.pentaho.org/showthread.php?t=58147"&gt;pass env variables into job via kitchen from cmd line&lt;/a&gt;"&amp;nbsp; &lt;br /&gt;- &lt;a href="http://www.ibridge.be/?p=159"&gt;Matt's blog&lt;/a&gt;&amp;nbsp;(--&amp;gt; where I found the mustard :-)&amp;nbsp; )&lt;br /&gt;- &lt;a href="http://jira.pentaho.com/browse/PDI-1381"&gt;Jira&lt;/a&gt; by Sven Boden who actually asked for this feature&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-5894387575238660741?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/5894387575238660741/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/03/user-defined-command-line-parameters-in.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/22251486/posts/default/5894387575238660741'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/5894387575238660741'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/03/user-defined-command-line-parameters-in.html' title='user defined command line parameters in the kitchen'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_C0PnWJwDRZY/S5VWlxs1ZxI/AAAAAAAAAJw/UJ2pwTH6bUM/s72-c/pic73.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-8630701060664973255</id><published>2010-03-07T12:50:00.004+01:00</published><updated>2010-09-30T21:35:01.892+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence'/><title type='text'>333 rule to keep your BI apps in check</title><content type='html'>I really loved this post of &lt;span class="storyByline"&gt;&lt;a href="http://www.information-management.com/authors/2000329.html"&gt;Boris Evelson&lt;/a&gt;&lt;/span&gt; on Information Management Blogs, March 3, 2010.&lt;br /&gt;&lt;br /&gt;The essence of the 333 rule: &lt;br /&gt;&lt;ul&gt;&lt;li&gt;Whenever there is a request for a new report that requires new data source or additional data sets from existing sources, first create manual extracts, pull them into text files and create "reports" on top of these extracts&lt;/li&gt;&lt;li&gt;If, and only If, after &lt;b&gt;3 weeks&lt;/b&gt;, the new reports are still being actively used, they then point the reports directly to the operational data 
sources.&lt;/li&gt;&lt;li&gt;If, and only If, after &lt;b&gt;3 months&lt;/b&gt;, the reports are still being actively used, they then create new DW tables and populate them with the new data sets.&lt;/li&gt;&lt;li&gt;If, and only If, after &lt;b&gt;3 quarters&lt;/b&gt;, the reports are still being actively used, they then completely productionalize&amp;nbsp; the process with QA, UAT, DR and other production controls, risk management, and operational procedures.&lt;/li&gt;&lt;/ul&gt;&amp;nbsp;Although this is a great rule of thumb for assuring that your BI environment doesn't get clogged up with junk, the rule kind of creates a difficulty in budget management. Of course you need to explain to your business users that the "cheap" report you created for them initially, based on a simple extract, will need to be "rewritten" 3 times. That is, they will have to pay several times over for the exact same functionality. Not impossible, but you'd better have that conversation upfront. And you'll have to do some user "education" to explain the why and how to them. 
&lt;br /&gt;&lt;br /&gt;Cheers,&lt;br /&gt;&lt;br /&gt;J&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-8630701060664973255?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='related' href='http://blogs.forrester.com/business_process/2010/03/333-rule-to-keep-your-bi-apps-in-check.html' title='333 rule to keep your BI apps in check'/><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/8630701060664973255/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/03/333-rule-to-keep-your-bi-apps-in-check.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8630701060664973255'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/8630701060664973255'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/03/333-rule-to-keep-your-bi-apps-in-check.html' title='333 rule to keep your BI apps in check'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-7983538835543141078</id><published>2010-02-25T12:34:00.002+01:00</published><updated>2010-09-30T21:35:15.308+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence'/><title type='text'>The Excel Piramid</title><content type='html'>Old &lt;a href="http://www.kjube.be/tnenopxe/index.php?action=view&amp;amp;id=1&amp;amp;module=weblogmodule&amp;amp;src=%40random41940a4020842"&gt;blog 
post&lt;/a&gt; from 12/11/04, as it was on our website. I don't know if that blog will continue to exist, so I've posted a copy here. Unfortunately, 5 years later, I still keep on encountering Excel construction workers on a regular basis. It just won't go away.&lt;br /&gt;&lt;br /&gt;Anyhow, here's the post.&lt;br /&gt;&lt;br /&gt;"To my utmost frustration I keep on encountering them at several client sites. Excel pyramids. Just like their Egyptian counterparts it takes years to make them, it remains a mystery how exactly they have been built, they are built for the glory of the top-man and finally that same top-man gets buried under it (or at least that's what he deserves)!&lt;br /&gt;&lt;br /&gt;To support continuous demand for reports, companies often turn to Excel. Obvious, as it remains the most available tool. But as time passes by more requests come in, time presses and one Excel sheet gets built on top of another. Chains of Excel reporting spring to life. One linked to the other, (ab)using the vlookup function to fetch data from one sheet to the other, copy pasting linked data from one sheet to the other until a highly complex spiderweb of linked Excel sheets has been built that no one dares to touch.&lt;br /&gt;&lt;br /&gt;The question I ask myself is why nobody pulls the emergency brake in time. I've seen situations where for years (&amp;gt; 10 years) companies built one Excel on top of the other, developing monthly "procedures" to accurately update the collection of sheets that have grown to gigabytes of data each month, copying that "reporting-database-folder" over and over again.&lt;br /&gt;&lt;br /&gt;The update procedures become more cumbersome every month, until in the end the company finds itself with a department of 20 people purely dedicated to relinking and recalculating Excel sheets.&lt;br /&gt;&lt;br /&gt;It's nice to see when finally they see the light shining on the outside of the pyramid and realize they need to change. 
But finding your way through the maze inside the pyramid, to actually understand how you get from chaos to an automated solution, is far from straightforward. And as usual, customers that are deepest in the sh** are the ones most eager to see results.&lt;br /&gt;&lt;br /&gt;J&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-7983538835543141078?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kjube.blogspot.com/feeds/7983538835543141078/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kjube.blogspot.com/2010/02/excel-piramid.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7983538835543141078'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22251486/posts/default/7983538835543141078'/><link rel='alternate' type='text/html' href='http://kjube.blogspot.com/2010/02/excel-piramid.html' title='The Excel Piramid'/><author><name>Jan Aertsen</name><uri>http://www.blogger.com/profile/17468629673353931466</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://3.bp.blogspot.com/_C0PnWJwDRZY/S2SrlqE-BSI/AAAAAAAAAGs/nwdFmg_oLtY/S220/jan_aertsen_foto_klein.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22251486.post-6376693423311073037</id><published>2010-02-22T12:54:00.001+01:00</published><updated>2010-09-30T21:26:27.798+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Integration - Kettle'/><title type='text'>Parallell ETL execution</title><content type='html'>In &lt;a href="http://kettle.pentaho.org/"&gt;Pentaho Data Integration&lt;/a&gt; it is extremely easy to execute jobs in parallel. 
The job below is an example of 7 jobs launched together right after the step which 'starts' the job. &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/S4JeAZl-HKI/AAAAAAAAAJA/UdwtBwy3kKI/s1600-h/pic28.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="500" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/S4JeAZl-HKI/AAAAAAAAAJA/UdwtBwy3kKI/s640/pic28.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;It's very easy to configure. (You can listen to some &lt;a href="http://www.melodygardot.com/"&gt;Melody Gardot&lt;/a&gt; while doing it, as you can see!) Just right-click on the step preceding the parallel jobs and select "Launch next entries in parallel".&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_C0PnWJwDRZY/S4JfhC7JTzI/AAAAAAAAAJI/jisnMn47zzo/s1600-h/pic29.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_C0PnWJwDRZY/S4JfhC7JTzI/AAAAAAAAAJI/jisnMn47zzo/s320/pic29.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;So kettle has made life easy for us again (as of version 3.0 if I recall correctly). However it doesn't end here. You want to run jobs in parallel in order to speed up your whole ETL process. In order to achieve the best possible results, there are a few things to study and consider.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Balance out short and long running jobs to get a good spread of system load &lt;/b&gt;&lt;br /&gt;Your ETL batch process will contain some jobs that - running standalone - would last a very long time, while others will be relatively short.&amp;nbsp; Putting the long running jobs in parallel seems like a better choice than putting the 20 very short running jobs in parallel. 
The latter will probably&amp;nbsp; just create a 5 minute peak load on your system while after that 5 minute peak you still need to wait for the longest running job. So spreading jobs intelligently over the time span of the longest running job (or chain of jobs) seems like the way to go.&lt;br /&gt;&lt;br /&gt;So rather than just launching all you've got in parallel without thinking, a good set-up would be something like the below. Obviously this means that you have an approximate knowledge of how long each job would last if run by itself. Make sure you get those statistics.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_C0PnWJwDRZY/S4Ji1D9PO8I/AAAAAAAAAJQ/2zZ9HgjNsVA/s1600-h/pic30.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="242" src="http://4.bp.blogspot.com/_C0PnWJwDRZY/S4Ji1D9PO8I/AAAAAAAAAJQ/2zZ9HgjNsVA/s400/pic30.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Next is to try. Some configurations will work better than others, so trial and error is unfortunately part of the game. &lt;br /&gt;&lt;br /&gt;&lt;b&gt;Functional dependencies&lt;/b&gt;&lt;br /&gt;Needless to say, if jobs depend on each other, it's better not to parallelize them.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Competing for resources&lt;/b&gt;&lt;br /&gt;Similarly, jobs that are competing for the same resources - not CPU - e.g. that might require reading (or updating/inserting) the same tables, are best scheduled one after the other, as running them together might drastically reduce performance.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;As time goes by&lt;/b&gt;&lt;br /&gt;&lt;a href="http://en.wikipedia.org/wiki/Heraclitus"&gt;Panta rei&lt;/a&gt;. Everything changes all the time. Therefore what is today your longest running job, may be just an average job tomorrow. 
And a well performing job might become the worst pupil in the class. Data volumes and complexity of the processing will change over time. This will affect your parallelization. So re-evaluate your set-up from time to time.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;In other words, scheduling jobs in parallel isn't really a technical issue. To start with, it all boils down to knowing the performance of your individual jobs and having a good functional understanding of your code.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Remark: Obviously things can get more technical. Once you get to the point where you have one single job that still takes too much time, you might need to spread that job over different machines. That's a different kind of parallelization than what I describe above. How to do this has been &lt;a href="http://www.ibridge.be/?s=cluster&amp;amp;submit=Go"&gt;described already&lt;/a&gt;.&lt;br /&gt;&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22251486-6376693423311073037?l=kjube.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replie
