Extract, Transform and Load for Data Warehousing  

How else can you title the post when the father of the data warehouse himself, Bill Inmon, writes a nostalgic note on ETL?

When data warehousing and ETL first appeared on the scene, the coders of the world felt threatened. In case after case, the coders of the world went and found their most complicated, most arcane, most convoluted program that was needed for transformation and threw that program at the ETL vendor. When the ETL technology threw up, the programmers said – “See, we can’t use ETL here – it can’t handle the XYZ program.”

Fast forward to today, and there is ETL everywhere. It is found in almost every shop that has a data warehouse. No one thinks twice about bringing in ETL technology. The drudgery of writing and maintaining transformation code has mercifully been shifted to automation.

What happened when programmers were protecting their own turf? The programmers were selecting the most impossible example of code and transformation to use as a basis for selecting or not selecting ETL processing. Behind the one difficult program are oodles of easy and much more normal transformations. Selecting the most difficult program in the company as a basis for transformations is like selecting Yao Ming as a representative of Chinese people. Yao Ming plays in the NBA and is 7’5″ tall. To draw the conclusion that most Chinese are 7’5″ tall is to make a serious error in judgment because even though Yao Ming is Chinese and is tall, not all Chinese are nearly that tall. For that matter, neither are the Americans, the Russians, the English or anyone else. But that is exactly what the early programmers did in order to keep ETL out of the shop.

Written by Guru Kirthigavasan

February 5th, 2008 at 8:02 am

