More information from the Pentaho Kettle forum:
Here are some of the new things in this version:
- The Excel Writer step offers advanced Excel output functionality to control the look and feel of your spreadsheets.
- Graphical performance and progress feedback for transformations
- The Google Analytics step allows download of statistics from your Google Analytics account
- The Pentaho Reporting Output step makes it possible for you to run your (parameterized) Pentaho reports in a transformation. It allows for easy report bursting of personalized reports.
- The Automatic Documentation step generates (simple) documentation of your transformations and jobs using the Pentaho Reporting API.
- The Get repository names step retrieves job and transformation information from your repositories.
- The LDAP Writer step
- The Ingres VectorWise (streaming) bulk loader step
- The Greenplum (streaming) bulk loader step (for gpload)
- The Talend Job Execution job entry
- Health Level 7 (HL7): the HL7 Input step, plus the HL7 MLLP Input and HL7 MLLP Acknowledge job entries (see the MLLP framing sketch after this list)
- The PGP File Encryption, Decryption & Validation job entries facilitate encryption and decryption of files using PGP (a GnuPG-based sketch follows this list)
- The Single Threader step for parallel performance tuning of large transformations
- Allow a job to be started at a job entry of your choice (continue after fixing an error)
- The MongoDB Input step (including authentication)
- The ElasticSearch bulk loader
- The XML Input Stream (StAX) step reads huge XML files at optimal performance and flat memory usage by flattening the structure of the data (see the StAX sketch after this list)
- The Get ID from Slave Server step allows multi-host or clustered transformations to get globally unique integer IDs from a slave server: http://wiki.pentaho.com/display/EAI/...m+Slave+Server
- Carte improvements:
  - reserve next value range from a slave sequence service
  - allow parallel (simultaneous) runs of clustered transformations
  - list (reserved and free) socket reservations service
  - new options in XML for configuring slave sequences
  - allow time-out of stale objects using the environment variable KETTLE_CARTE_OBJECT_TIMEOUT_MINUTES
- Memory tuning of the logging back-end with KETTLE_MAX_LOGGING_REGISTRY_SIZE, KETTLE_MAX_JOB_ENTRIES_LOGGED, and KETTLE_MAX_JOB_TRACKER_SIZE, allowing flat memory usage for never-ending ETL in general and jobs specifically.
- Repository Import/Export:
  - Export at the repository folder level
  - Export and import with optional rule-based validations
  - The import command-line utility allows for rule-based (optional) import of lists of transformations, jobs, and repository export files: http://wiki.pentaho.com/display/EAI/...+Documentation
- ETL Metadata Injection:
  - Retrieval of rows of data from a step to the “metadata injection” step
  - Support for injection into the “Excel Input” step
  - Support for injection into the “Row normaliser” step
  - Support for injection into the “Row Denormaliser” step
- The Multiway Merge Join step (experimental) allows any number of data sources to be joined on one or more keys using an inner or full outer join algorithm (a k-way merge sketch follows this list).
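A quick aside on the HL7 MLLP entries mentioned above: MLLP is just a thin TCP framing around HL7 v2 messages, a start byte (0x0B) followed by the payload and an end sequence (0x1C 0x0D). The following is only a minimal Java sketch of that framing, not Kettle's code; the host, port, and sample message are placeholders.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class MllpSendSketch {
    private static final int START = 0x0b; // <VT> frame start
    private static final int END_1 = 0x1c; // <FS> frame end, byte 1
    private static final int END_2 = 0x0d; // <CR> frame end, byte 2

    public static void main(String[] args) throws Exception {
        // Sample ADT message; HL7 v2 segments end with a carriage return.
        String hl7 = "MSH|^~\\&|SENDER|FAC|RECEIVER|FAC|20110101120000||ADT^A01|MSG00001|P|2.3\r";
        try (Socket socket = new Socket("localhost", 2575)) { // placeholder host/port
            OutputStream out = socket.getOutputStream();
            out.write(START);                                     // frame start
            out.write(hl7.getBytes(StandardCharsets.ISO_8859_1)); // payload
            out.write(END_1);                                     // frame end...
            out.write(END_2);                                     // ...sequence
            out.flush();

            // Read the MLLP-framed HL7 ACK back until <FS><CR>.
            InputStream in = socket.getInputStream();
            StringBuilder ack = new StringBuilder();
            int prev = -1, b;
            while ((b = != -1) {
                if (prev == END_1 && b == END_2) break;          // saw <FS><CR>
                if (b != START && b != END_1) ack.append((char) b);
                prev = b;
            }
            System.out.println("ACK: " + ack);
        }
    }
}
```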
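On the PGP job entries: a common way to handle PGP files from Java-based tools is to call out to GnuPG. Here is a minimal sketch of that approach, not necessarily what Kettle does internally; it assumes a gpg binary on the PATH, and the recipient key and file names are placeholders.

```java
import java.io.IOException;

public class GpgEncryptSketch {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Encrypt report.csv for a recipient key; all names are placeholders.
        ProcessBuilder pb = new ProcessBuilder(
                "gpg", "--batch", "--yes",
                "--recipient", "",
                "--output", "report.csv.gpg",
                "--encrypt", "report.csv");
        pb.inheritIO();                        // surface gpg's own messages
        int exit = pb.start().waitFor();
        if (exit != 0) {
            throw new IOException("gpg exited with code " + exit);
        }
    }
}
```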
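And to see why the StAX-based XML Input Stream step can keep memory flat: a StAX parser pulls one event at a time instead of building a whole DOM tree, so each event can be turned into a single flat row. The sketch below uses the standard javax.xml.stream API, not the step itself; "huge.xml" is a placeholder file name.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import java.io.FileInputStream;

public class StaxFlattenSketch {
    public static void main(String[] args) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader =
                factory.createXMLStreamReader(new FileInputStream("huge.xml"));
        // Pull events one by one; memory use stays flat regardless of file size.
        while (reader.hasNext()) {
            switch ( {
                case XMLStreamConstants.START_ELEMENT:
                    System.out.println("START_ELEMENT\t" + reader.getLocalName());
                    break;
                case XMLStreamConstants.CHARACTERS:
                    if (!reader.isWhiteSpace()) {
                        System.out.println("CHARACTERS\t" + reader.getText().trim());
                    }
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    System.out.println("END_ELEMENT\t" + reader.getLocalName());
                    break;
            }
        }
        reader.close();
    }
}
```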
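Finally, the idea behind the Multiway Merge Join is a classic k-way merge over key-sorted inputs. The sketch below is my own illustration of the inner-join case with unique keys per input, not the step's implementation; duplicate keys would need row buffering, and the full outer case would also emit unmatched rows padded with nulls.

```java
import java.util.ArrayList;
import java.util.List;

public class MultiwayMergeJoinSketch {
    public static void main(String[] args) {
        // Three key-sorted inputs; each element is {joinKey, payload}.
        int[][][] inputs = {
            {{1, 10}, {3, 30}, {5, 50}},
            {{1, 11}, {2, 21}, {3, 31}, {5, 51}},
            {{3, 32}, {4, 42}, {5, 52}},
        };
        int[] pos = new int[inputs.length];    // one cursor per input

        while (true) {
            // Find the largest key among current rows; an inner join is done
            // as soon as any input runs out.
            int maxKey = Integer.MIN_VALUE;
            boolean exhausted = false;
            for (int i = 0; i < inputs.length; i++) {
                if (pos[i] >= inputs[i].length) { exhausted = true; break; }
                maxKey = Math.max(maxKey, inputs[i][pos[i]][0]);
            }
            if (exhausted) break;

            // Advance every input whose current key is behind maxKey.
            boolean allMatch = true;
            for (int i = 0; i < inputs.length; i++) {
                while (pos[i] < inputs[i].length && inputs[i][pos[i]][0] < maxKey) {
                    pos[i]++;
                }
                if (pos[i] >= inputs[i].length || inputs[i][pos[i]][0] != maxKey) {
                    allMatch = false;
                }
            }

            if (allMatch) {
                // All k inputs sit on the same key: emit one joined row.
                List<Integer> joined = new ArrayList<>();
                joined.add(maxKey);
                for (int i = 0; i < inputs.length; i++) {
                    joined.add(inputs[i][pos[i]][1]);
                    pos[i]++;
                }
                System.out.println(joined);    // e.g. [3, 30, 31, 32]
            }
        }
    }
}
```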
I like the Talend Job Execution job entry very much... it's fun to use Talend inside Kettle ;-)