The Excel Writer step
Some time ago Slawomir Chodnicki (aka Slawo) released the Excel Writer Step version 1.2 for Pentaho Data Integration. Not only did he deliver a wonderful piece of extra PDI functionality, but he also accompanied its release with a great blogpost as well as a sheer endless list of samples.
Since the Excel Writer is donation ware, and Slawo is a big kettle contributor, it should come a no surprise that the Excel Writer will be standard included in kettle 4.2. A great contribution, which have been eager to use in a real life situation.
The use case
Last week, I used the Excel Writer for an assignment. The reason why customer wanted the results in Excel were simple. They needed to manipulate both the data as well as the look and feel of the graphs, before using them in Word documents and or Powerpoint documents. Some questions weren't even needed in the final report. So a lot of editing was needed after the creation of the initial report.
The basic requirements were the following:
- Generate Excel reports from the responses to (multiple choice) questionnaires.
- Create one Excel sheet with the questionaire, and one Sheet per question showing the answers
- Each Excel report with answers should contain both the figures in a table as well as in a graph.
- Each questionnaire can consist of somewhere between 20 to a 100 questions.
- Each question can have somewhere between 2 (yes/no) to 10 answers.
The basic code was pretty simple.
- One transformation to fetch all the questions of the questionnaire and write them to an Excel Sheet.
- One transformation to write all the answers to a sheet, to be executed per question
- + some logic to clean up the previous version and zip the new version
The output - one Excel file?
My first idea was to generate one Excel book, containing 1 sheet with an overview of all the questions in the questionnaire, and next per question 1 sheet with the graphs/tables on the answer. It was based upon two sheet templates and the result was supposed to look like this.
However when I tried this I ran into the strange issue that the graphs on my Excel template disappeared. I took me some time to figure out that this was a limitation of the Excel writer. A limitation that was even documented, but unfortunately, I don't always read tool tips.
Anyhow, Slawo confirmed me that the POI library that is used, has some limitations:
The output - many Excel files