Category Archives: ETL

Getting Started with Unstructured Data

From TDWI Article -

To start down this path, you will obviously need to take a more holistic view of your organization’s information and technology architecture to learn what data is available to your end users. You also need to spend time learning what is missing today from the BI environment. Don’t be surprised if people at first cannot articulate their needs in this arena — most people do not believe current tools can support this analysis!

In conjunction with this internal fact-finding, stay abreast of the evolution of “unstructured” content software and service solutions. Although these concepts have been around for some time, some technological developments have emerged only recently to allow some of the more interesting analysis and integration opportunities in this area.

Finally, keep experimenting! The BI market has grown and matured substantially in the last several years, and this is an exciting new area where we can all stretch and investigate. As famous engineer Richard Buckminster Fuller once quipped, “There is no such thing as a failed experiment — only experiments with unexpected outcomes.”

EntropySoft announces Content ETL 2

Content ETL is designed to structure and industrialize the transfers and transformations between all content repositories. Its features ensure full traceability of all movements.

Content ETL offers:
- connection to all the company’s content repositories
- graphical design and planning of all document transfers
- document and metadata transformation management
- delegation of transfer management to end-users

The software uses EntropySoft’s exclusive portfolio of bidirectional connectors to access the information system. Content ETL’s easy-to-use interface helps put the planning and execution of the company’s document processes into end users’ hands.

Content ETL's features are delivered through two clients, Content ETL Studio and Content ETL Web. Content ETL Studio is for advanced users who want to graphically design and plan document processes. Content ETL Web is for all users to manage transfers on a day-to-day basis.

- From Press Release.

Unstructured data is pretty shabby (oxymoron). Only recently have ETL vendors like Informatica started to take it into consideration. And here’s another one, from EntropySoft.

A Brief History of ETL – Bill Inmon

It’s Bill Inmon, the father of data warehousing, talking about the history of ETL. A must-read, and an article with lots of vintage value.

On March 15, 1995, Prism Solutions went public on the NASDAQ exchange (PRSM). And then, there were even more competitors that entered the market space. One entrant was Ab Initio. Ab Initio specialized in the movement of very large amounts of data.

In a related space was enterprise application integration (EAI). EAI has many of the capabilities of a data warehouse extract, transform and load (ETL) tool when it comes to moving data about the corporation. However, EAI falls short when it comes to handling transformation and metadata management. Nevertheless, there is some degree of overlap between the worlds of ETL and EAI.

Informatica On-Demand for Salesforce

Informatica Corporation (Nasdaq: INFA), the leading independent provider of data integration software, today announced the Informatica On Demand (IOD) Data Loader Service for Salesforce, the newest addition to its Software as a Service (SaaS) offerings. The IOD Data Loader Service is a bi-directional integration offering that allows Salesforce administrators to automate many Salesforce integration processes such as synchronizing account information with other applications, creating back office orders from closed opportunities, and loading leads. In addition to eliminating manual coding efforts, the IOD Data Loader is entirely web-based, and removes the need for on-premise software or hardware appliances. The announcement was made today at Dreamforce Europe’s User and Developer Conference.

This new offering is simple and easy to use, allowing customers to increase their operational efficiencies by automating many time-consuming and error-prone integration tasks. The IOD Data Loader extends the capabilities of Salesforce’s Dataloader by:

- Adding an intuitive web-based integration wizard
- Automating the scheduling of integration jobs
- Providing direct access to relational databases
- Transforming data through a drag-and-drop web interface
- Importing and exporting Salesforce Dataloader maps

More from the Press Release.

Informatica Reports Record First Quarter Results

Revenues for the first quarter of 2008 were $103.7 million, up 19 percent from the $87.1 million recorded in the first quarter of 2007. License revenues for the first quarter were $44.2 million, up 18 percent from the $37.6 million recorded in the first quarter of 2007. Net income for the first quarter, calculated in accordance with U.S. generally accepted accounting principles (GAAP), was $11.2 million or $0.12 per diluted share, up more than 20 percent from net income of $9.1 million or $0.10 per diluted share in the first quarter of 2007.

Significant milestones achieved since January 2008 include:

- Announced definitive agreement to acquire Identity Systems.
- Signed repeat business with 178 customers.
- Added 38 new customers.
- Launched the Informatica Data Migration Suite.
- Wipro Technologies selected Informatica for its Data Migration Service.
- Launched the INFORM Partner Network.

From the Press Release.

Microsoft & Unisys Set an ETL World Record

Microsoft & Unisys ETL Architecture

Today at the launch of SQL Server 2008, you may have seen the references to world-record performance doing a load of data using SSIS. Microsoft and Unisys announced a record for loading data into a relational database using an Extract, Transform and Load (ETL) tool. Over 1 TB of TPC-H data was loaded in under 30 minutes.
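To put that figure in perspective, here is a quick back-of-the-envelope calculation of the sustained load rate it implies. It is only a sketch: it assumes a decimal terabyte (10^12 bytes) and exactly 30 minutes, so the result is a lower bound on the real rate, since the announcement says "over 1 TB" in "under 30 minutes". The function name is made up for illustration.

```python
def avg_throughput_mb_per_s(bytes_loaded: float, seconds: float) -> float:
    """Average throughput in decimal megabytes per second."""
    return bytes_loaded / seconds / 1e6


ONE_TB = 1e12            # assumption: decimal terabyte, 10**12 bytes
THIRTY_MINUTES = 30 * 60  # seconds

# Lower bound on the sustained rate implied by "over 1 TB in under 30 minutes".
rate = avg_throughput_mb_per_s(ONE_TB, THIRTY_MINUTES)
tb_per_hour = rate * 3600 / 1e6

print(f"at least {rate:.1f} MB/s")
print(f"at least {tb_per_hour:.2f} TB/hour")
```

In other words, the load had to sustain well over half a gigabyte per second end to end, which is roughly 2 TB per hour.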

Way to go, MS. For more background on this, head over to the Microsoft blog.

LogiXML Adds ETL Tool To BI Platform

From Intelligent Enterprise -

Logi ETL is a Web-based data integration application that can transfer data from diverse sources directly to multiple destinations, be invoked by predefined processes (including scheduling), or be triggered by application actions, the company said.

“The new Logi ETL product is comprehensive and rich in functionality, easy-to-use and integrate,” Arman Eshraghi, chief executive and founder of LogiXML, said in a statement. “The product embraces XML technologies and is operationally more Web-oriented than any other ETL product in the market.”

Extract, Transform and Load for Data Warehousing

How else can you title the post when the father of data warehousing himself, Bill Inmon, writes a nostalgic note on ETL?

When data warehousing and ETL first appeared on the scene, the coders of the world felt threatened. In case after case, the coders of the world went and found their most complicated, most arcane, most convoluted program that was needed for transformation and threw that program at the ETL vendor. When the ETL technology threw up, the programmers said – “See, we can’t use ETL here – it can’t handle the XYZ program.”

Fast forward to today, and there is ETL everywhere. It is found in almost every shop that has a data warehouse. No one thinks twice about bringing in ETL technology. The drudgery of writing and maintaining transformation code has mercifully been shifted to automation.

What happened when programmers were protecting their own turf? The programmers were selecting the most impossible example of code and transformation to use as a basis for selecting or not selecting ETL processing. Behind the one difficult program are oodles of easy and much more normal transformations. Selecting the most difficult program in the company as a basis for transformations is like selecting Yao Ming as a representative of Chinese people. Yao Ming plays in the NBA and is 7’5″ tall. To draw the conclusion that most Chinese are 7’5″ tall is to make a serious error in judgment because even though Yao Ming is Chinese and is tall, not all Chinese are nearly that tall. For that matter, neither are the Americans, the Russians, the English or anyone else. But that is exactly what the early programmers did in order to keep ETL out of the shop.