The Business Intelligence Blog

Slicing Business Dicing Intelligence

Archive for the ‘Data Warehousing’ tag

Vertica Moves BI Database to Amazon’s Cloud  

Pay-as-you-go pricing in the data warehousing and business intelligence space is gaining traction as Vertica moves its technology to Amazon’s Elastic Compute Cloud infrastructure.

Vertica’s database organizes data by columns, as opposed to rows. The company and others that make columnar databases, such as Sybase and ParAccel, contend the approach is faster and better for BI-related queries because only the desired columns — such as a customer’s name or location — can be read without having to parse through an entire table, saving bandwidth.
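To make the row-versus-column argument concrete, here is a minimal Python sketch. It is not Vertica's implementation: the table, the column names and the "values touched" counter are all made up for illustration, and real engines add compression, disk blocks and much more. It only shows why reading one column from a column layout touches fewer values than pulling the same field out of every row.

```python
# Hypothetical three-column customer table; names and values are made up.
ROWS = [
    ("Alice", "Boston", 120.0),
    ("Bob", "Denver", 75.5),
    ("Carol", "Boston", 210.0),
]
COLUMN_NAMES = ("name", "location", "total_spend")


def to_column_store(rows, column_names):
    """Pivot row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(column_names)}


def read_column_from_rows(rows, column_names, wanted):
    """Row layout: every full row is visited to pull out one field."""
    index = column_names.index(wanted)
    values_touched = 0
    result = []
    for row in rows:
        values_touched += len(row)  # the whole row is counted as read
        result.append(row[index])
    return result, values_touched


def read_column_from_columns(store, wanted):
    """Column layout: only the requested column is read."""
    result = list(store[wanted])
    return result, len(result)


COLUMNS = to_column_store(ROWS, COLUMN_NAMES)
row_result, row_cost = read_column_from_rows(ROWS, COLUMN_NAMES, "location")
col_result, col_cost = read_column_from_columns(COLUMNS, "location")
```

Both reads return the same locations, but the row layout counts every value in the table while the column layout counts only the values in the one requested column. The gap widens as tables get wider, which is the bandwidth saving the columnar vendors point to.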

The company also sells the database for on-premises use and in appliance form.

It sees a market for the on-demand offering due to a number of scenarios. For example, a company might want to conduct a BI project that will only last a fixed period of time, such as revising its pricing based on competitive and market data, said Andy Ellicott, senior director of marketing.

Hedge funds, which test their trading algorithms against large sets of historical stock market data, are another potential use case, because while such entities manage vast amounts of money, they seek to maintain the lowest possible overhead, which a cloud-based approach can provide, he said.

The article has no responses yet

Written by Guru Kirthigavasan

May 11th, 2008 at 9:46 am

A Brief History of ETL – Bill Inmon  

It’s Bill Inmon, the father of data warehousing, talking about the history of ETL. A must-read article with lots of vintage value.

On March 15, 1995, Prism Solutions went public on the NASDAQ exchange (PRSM). And then, there were even more competitors that entered the market space. One entrant was Ab Initio. Ab Initio specialized in the movement of very large amounts of data.

In a related space was enterprise application integration (EAI). EAI has many of the capabilities of a data warehouse extract, transform and load (ETL) tool when it comes to moving data about the corporation. However, EAI falls short when it comes to handling transformation and metadata management. Nevertheless, there is some degree of overlap between the worlds of ETL and EAI.

The article has 3 responses

Written by Guru Kirthigavasan

May 8th, 2008 at 6:51 pm

Posted in ETL

Are You a Best-in-Class Company on “Time-to-Information”?

That’s really great news for the BI world during these dull days of business. I’m sure other companies will start gaining confidence in the never-understood phenomenon of ROI. Makes my day!

According to research presented in a new report, “Data Management for Business Intelligence,” 77% of Best-in-Class companies are able to automate the integration of data from multiple sources, compared to 54% of Industry Average companies, and only 22% of Laggards. This capability was identified by respondents as being critical for solving the top business pressure — the need to reduce the time-to-information for non-technical end-users.

Best-in-Class companies are making investments in technology enablers to alleviate this pressure. 82% of Best-in-Class companies are currently utilizing data warehousing software solutions, versus 56% of all other respondents. 80% of Best-in-Class companies are currently deploying Business Intelligence query and reporting tools, versus 47% of all other respondents. Finally, 68% of Best-in-Class companies are currently implementing data warehouse appliance technology (packaged hardware and software solutions) versus only half that number (34%) of all other respondents.

The article has one response

Written by Guru Kirthigavasan

April 1st, 2008 at 9:51 am

@ctive Data Warehousing aka Closed Loop Processing  

Active Data Warehousing isn’t really a new buzzword. It’s been in the industry for a while, thanks to Teradata, which made the term popular. They called it @ctive Data Warehousing and branded the spelling.

The reason to bring this term back is the following announcement:

Teradata Corporation (NYSE: TDC), the global leader in enterprise data warehousing, announced today that Highmark Inc., the largest health insurer in Pennsylvania and one of the largest in the U.S., continues to expand its Teradata Warehouse and its multiple-terabyte information assets. The large-scale analytics expansion increases the company’s production environment and supports the shift to active data warehousing (ADW).

For newbies, here’s more about Active Data Warehousing from DM Review of April 2004 (yep, it’s four years old):

Active data warehousing is a process, not a specific technology. Teradata has popularized the term “active data warehousing,” tried to brand the spelling “@ctive data warehousing” and deserves credit for providing examples of some big, successful active data warehouses. However, if a more generic term is preferred, then “closed-loop processing” is a useful synonym, though it only partially captures the concept. Your data warehouse (DW) is active if:

It represents a single, canonical state of the business (version of the truth). Too often, companies put data into a data warehouse and also store it in a plethora of other data stores. If a data warehouse must be match-merged with dependent data marts to provide needed information, then it is a potentially useful data store, but it is not active.

It supports a mixed workload. The workload of an active data warehouse will typically consist of tactical inquiries executing concurrently with complex business intelligence (BI) queries and trickle updates. If the DW is used only for operational queries such as customer transactions or product inventory, it is not active.

Operational processing is driven by the DW. Active data warehouses do not exist in a vacuum. They exist in a processing loop. The “outbound” activity goes from the data warehouse to the operational system by means of automated system mechanisms including triggers, special purpose programming interfaces, a message broker and an extract, transform and load (ETL) tool – though the ETL tool is not often used for outbound activities. If the data warehouse doesn’t deliver information automatically to operational systems, then it is not active. Manual intervention gets the job done, but the DW is not active.

It represents a closed-loop process. In particular, the data warehouse is used to optimize processing in the upstream operational or transactional system. The operational systems feed the data warehouse which, in turn, feeds back to the operational system to optimize the relevant transactional processing. The interfaces go in both directions. The data warehouse provides operational intelligence and, as active, can properly be described as driving operational processing.
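The closed loop described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the class names, the reorder rule and the threshold are hypothetical, and a real active data warehouse would use triggers, message brokers or similar mechanisms rather than direct method calls. The point is the shape of the loop: the operational system trickles events into the warehouse, and the warehouse pushes a decision back out with no manual step.

```python
class OperationalSystem:
    """Hypothetical transactional system (for example, order entry)."""

    def __init__(self):
        self.reorder_requests = []

    def sell(self, warehouse, product, quantity):
        # Inbound leg: each sale is trickled into the warehouse as it happens.
        warehouse.load({"product": product, "quantity": quantity})

    def receive(self, message):
        # Outbound leg lands here: the warehouse drives operational processing.
        self.reorder_requests.append(message)


class ActiveWarehouse:
    """Hypothetical warehouse that reacts automatically to loaded data."""

    def __init__(self, operational_system, reorder_threshold):
        self.operational_system = operational_system
        self.reorder_threshold = reorder_threshold
        self.units_sold = {}
        self.already_flagged = set()

    def load(self, event):
        product = event["product"]
        total = self.units_sold.get(product, 0) + event["quantity"]
        self.units_sold[product] = total
        # Automated rule (an assumption for this sketch): once sales pass the
        # threshold, tell the operational system to reorder, once per product.
        if total >= self.reorder_threshold and product not in self.already_flagged:
            self.already_flagged.add(product)
            self.operational_system.receive(
                {"action": "reorder", "product": product, "units_sold": total}
            )


ops = OperationalSystem()
dw = ActiveWarehouse(ops, reorder_threshold=10)
ops.sell(dw, "widget", 4)
ops.sell(dw, "widget", 7)   # cumulative sales cross the threshold here
ops.sell(dw, "widget", 2)   # already flagged, so no second request
```

If the reorder message had to be read off a report and keyed in by hand, the same data would still be in the warehouse, but by the definition quoted above it would not be active.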

The article has 2 responses

Written by Guru Kirthigavasan

February 21st, 2008 at 7:00 am

Extract, Transform and Load for Data Warehousing  

How else can you title the post when the father of data warehousing himself, Bill Inmon, writes a nostalgic note on ETL?

When data warehousing and ETL first appeared on the scene, the coders of the world felt threatened. In case after case, the coders of the world went and found their most complicated, most arcane, most convoluted program that was needed for transformation and threw that program at the ETL vendor. When the ETL technology threw up, the programmers said – “See, we can’t use ETL here – it can’t handle the XYZ program.”

Fast forward to today, and there is ETL everywhere. It is found in almost every shop that has a data warehouse. No one thinks twice about bringing in ETL technology. The drudgery of writing and maintaining transformation code has mercifully been shifted to automation.

What happened when programmers were protecting their own turf? The programmers were selecting the most impossible example of code and transformation to use as a basis for selecting or not selecting ETL processing. Behind the one difficult program are oodles of easy and much more normal transformations. Selecting the most difficult program in the company as a basis for transformations is like selecting Yao Ming as a representative of Chinese people. Yao Ming plays in the NBA and is 7’5″ tall. To draw the conclusion that most Chinese are 7’5″ tall is to make a serious error in judgment because even though Yao Ming is Chinese and is tall, not all Chinese are nearly that tall. For that matter, neither are the Americans, the Russians, the English or anyone else. But that is exactly what the early programmers did in order to keep ETL out of the shop.

The article has one response

Written by Guru Kirthigavasan

February 5th, 2008 at 8:02 am

IBM Data Warehouse Suite Expands with Optim Technology  

From Marketwire:

IBM (NYSE: IBM) today introduced new software that allows businesses to better manage the growth of information by integrating data warehousing and archiving capabilities. The new InfoSphere Warehouse with Optim Data Retention will help companies better manage enterprise data throughout its lifecycle and deliver optimal performance and compliance with changing business policies.

InfoSphere Warehouse combines the strength of IBM data warehouse software and recently acquired Princeton Softech’s Optim Data Retention solution to more efficiently manage large amounts of data to enhance users’ competitive advantage.

Here’s more on IBM InfoSphere™ Warehouse with Optim Data Retention.

The article has no responses yet

Written by Guru Kirthigavasan

February 5th, 2008 at 6:35 am