The Business Intelligence Blog

Slicing Business Dicing Intelligence

Archive for the ‘Data Warehousing’ tag

The Petabyte BI World - Wired  

Sensors everywhere. Infinite storage. Clouds of processors. Our ability to capture, warehouse, and understand massive amounts of data is changing science, medicine, business, and technology. As our collection of facts and figures grows, so will the opportunity to find answers to fundamental questions. Because in the era of big data, more isn’t just more. More is different.

This month’s Wired magazine covers one of the scientific community’s fastest-growing concerns: the uncontrollable growth of data. Data is growing in so many directions that it is nearly killing theory, as more and more work becomes data-driven.

There is a series of articles, ranging from what data miners are digging into today, to elaborate algorithms that predict air ticket prices, to how we can monitor epidemics hour by hour.

Whether or not you are a BI enthusiast, this month’s Wired cover story will challenge all your predictions about science and technology, even if you have a petabyte of data to support them. Read it, like, right now!

The article has

no responses yet

Written by Guru Kirthigavasan

July 15th, 2008 at 5:58 am

Ireland is facing a data tsunami  

From Silicon Republic, an interesting article on the exponential growth of data -

CURRENTLY the amount of data worldwide is doubling every 11 months – by 2010 it will double every 11 hours.

Ireland stands in the direct path of this tidal wave of data, warns a senior executive with business intelligence (BI) giant SAS.

Dr John Brocklebank, director of analytic solutions at privately held software giant SAS Institute, believes Irish firms are ill-equipped to deal with this rapid growth in the world’s data.

He says the challenge for Irish companies is to capture and exploit the 1-2pc of data that is relevant to their decision-making processes and strategic objectives.

“We know that, on average, managers spend two hours a day looking for data and that more than half of this is useless to their decision-making process,” Brocklebank explains.

“Most frightening is that 42pc of managers say they accidentally use the wrong information to make a decision at least once a week. So never mind trying to deal with the data in its entirety; it needs to be made meaningful and accurate to support business decisions.”

Written by Guru Kirthigavasan

June 20th, 2008 at 7:28 am

SmartStream Banks on Informatica to Accelerate Customer ROI  

From the Press Release -

Informatica Corporation (Nasdaq: INFA), the leading independent provider of data integration software, today announced that SmartStream Technologies, a leading provider of software to the financial services industry, is OEMing the Informatica PowerCenter data integration platform as part of its flagship Transaction Lifecycle Management (TLM) solutions.

In making Informatica PowerCenter the foundation of its SmartStream TLM Business Integration (TLM BI) offering, SmartStream is empowering those customers with complex data environments to accelerate the return on investment of their TLM deployments through the high-performance and cost-effective integration of data involved in transaction cycles.

“The increasing drive to streamline global banking practices means our software needs to manage highly complex and rapid transactions across platforms and different banks. By using Informatica, rather than continually creating bespoke data interfaces, we can enable a faster ROI while freeing our professional services teams to provide more value to customers,” said Neil Vernon, head of SmartStream’s Product Management Group. “We selected Informatica to power TLM BI following an evaluation where they scored highest against our key criteria of usability, reusability and performance. In addition, it was critical that TLM BI have the focus and support of a recognized best-of-breed vendor such as Informatica.”

Written by Guru Kirthigavasan

June 17th, 2008 at 6:02 am

Teradata Improves Analytics for Business Users  

By Anshu Shrivastava of TMCNet, an article on the new Teradata Warehouse Miner.

The technology of data mining discovers patterns in customer, financial and operational data that can provide valuable business insights, according to Teradata.

The newest enhancements to Teradata Warehouse Miner are supported by the recently announced SAS Scoring Accelerator for Teradata. Keith Collins, CTO at SAS, said that Teradata’s new functionality, and the recently released SAS Scoring Accelerator, are integrated and complementary solutions.

Teradata’s officials pointed out that an initial benchmark of the SAS Scoring Accelerator for Teradata demonstrated the ability to process the number of records “45 times faster” than the traditional scoring method.

Moreover, the SAS Scoring Accelerator for Teradata also eliminates the need for manual translation of the SAS scoring code into SQL, or structured query language.

Written by Guru Kirthigavasan

June 14th, 2008 at 6:36 pm

From Text Analytics to Data Warehousing  

I liked Seth Grimes’s recent article on text analytics accuracy. His article today on Intelligent Enterprise pointed me to the IBM article on IBM® OmniFind™ Analytics Edition, which explains in detail how to extract unstructured data from e-mail, Web pages, news and blog articles and build a data warehouse out of it, unlocking huge potential that was previously untapped.

In recent weeks and months, the focus on unstructured data has been growing as businesses and vendors start to understand its power and how it can be text-mined for the benefit of enterprises. And that’s a good thing.

A must read. Highly Recommended.

IBM OmniFind Analytics Edition

Text analytics enables you to extract more business value from unstructured data such as emails, customer relationship management (CRM) records, office documents, or any text-based data. IBM® OmniFind™ Analytics Edition provides rich text analysis capabilities and interactive visualization to enable you to find patterns and trends hidden in large quantities of unstructured information. The text analysis results from OmniFind Analytics Edition are in XML-format and can also be stored, indexed, and queried in a DB2 database. This allows you to incorporate your text analysis results into existing business applications and reporting tools by using regular SQL or SQL/XML queries. This article provides an overview of text analytics with OmniFind Analytics Edition and describes several ways of bringing its analysis results into DB2, in relational or pureXML™ format.
..
..
OmniFind Analytics Edition provides the ability to interactively explore and mine the results of text analysis, as well as structured data that is typically associated with unstructured text. For those of you familiar with business intelligence applications, you can think of it as content-centric business intelligence, in that it aggregates the results of text analysis to detect frequencies, correlations, and trends. Typical use cases include:

* Analysis of customer contact information (e-mails, chats, problem tickets, contact center notes) for insight into quality or satisfaction issues
* Analysis of blogs and wikis for reputation monitoring
* Analysis of internal e-mail for compliance violations or for expertise location

Written by Guru Kirthigavasan

May 18th, 2008 at 6:29 pm

Sybase and Sun create the World’s Largest Data Warehouse  

One petabyte of mixed relational and unstructured data. That’s neat.

More from the Sybase Press Release -

Sybase, Inc. (NYSE: SY), the largest enterprise software and services company exclusively focused on managing and mobilizing information, today announced that the Sybase® IQ analytics server has set a new Guinness World Record™ by powering the world’s largest data warehouse on a Sun™ SPARC® Enterprise M9000 server. This accomplishment was achieved using Sybase IQ, BMMsoft ServerSM and the Sun Microsystems® Data Warehouse Reference Architecture. This winning combination enables more data to be stored in less space, searched and analyzed in less time, while consuming 91 percent less energy and generating less heat and carbon dioxide than conventional solutions.

Powered by the category-leading column-oriented database Sybase IQ, the data warehouse is certified to support a record-breaking one petabyte of mixed relational and unstructured data, more than 34 times larger than the largest industry standard benchmark and twice the size of the largest commercial data warehouse known to date. In total, the data warehouse contains six trillion rows of transactional data and more than 185 million content-searchable documents, such as emails, reports, spreadsheets and other multimedia objects.
..
..
Designed from the ground up as an analytics server, Sybase IQ produces its impressive results because of a unique architecture combining a column-oriented data structure with patented indexing and a scalable grid. Sybase IQ offers extraordinarily high performance at a lower cost than a traditional, row-oriented database. And, unlike traditional row-based data warehouses, the stored data in Sybase IQ is compressed by up to 70 percent of its input size, creating the most optimal and elegant analytics solutions.

“The results of this benchmark showcase Sybase IQ’s capabilities to handle real-world scenarios, querying vast amounts of data representing the transactions processed across the worldwide financial trading networks over multiple years.” said Francois Raab, president, InfoSizing, the consulting firm that oversaw the benchmarking of the record. “Sybase IQ has proven its production strength in handling the volume of multimedia documents representative of the electronic communication between half a million financial traders.”

Data Warehousing on a Shoestring Budget  

TDWI is running a three-part series on developing and deploying data warehousing frugally. Read Parts 1 and 2.

Although seemingly difficult, you can make choices, which allow for the beneficial realization of data warehousing while also minimizing costs. By balancing technology and carefully positioning your business, your organization can quickly create cost-effective solutions using data warehousing technologies.

There are a few simple rules to help you develop a data warehouse on a shoestring budget:

* Use what you have
* Use what you know
* Use what is free
* Buy only what you have to
* Think small and build in phases
* Use each phase to finance or justify the remainder of the projects

It’s also a must read for businesses which have enough business sponsorship and enormous resources. Tough times in the marketplace like these call for an economical way of staying ahead on the business curve. And that’s exactly the point of this series.

I like Nathan Rawling’s detailed approach to this topic.

Written by Guru Kirthigavasan

May 13th, 2008 at 9:24 pm

Protegrity & Teradata announce DW encryption performance  

Press Release - Protegrity and Teradata today announce unmatched data warehouse encryption performance

Protegrity Corporation, the leading provider of Data Security Management solutions, and Teradata, the global leader in Enterprise Data Warehousing today announced new cryptography performance of over 6 million decryptions and over 9 million encryptions per second, enabling customers to maximize data protection while minimizing impact on business operations.

“Our research shows that businesses worry that data protection may impact application performance. These test results show that the partnership between Teradata and Protegrity delivers proven enterprise solutions that help customers protect data and achieve regulatory compliance with industry-leading system performance,” said Gordon Rapkin, president and chief executive officer of Protegrity.

The Protegrity Defiance® DPS uses Teradata User Defined Functions (UDFs) to embed encryption/decryption functionality in the database. Teradata’s high performance UDF implementation and parallel architecture provides highly efficient execution, and then scales that performance linearly as the system grows in size.

“High performance data protection enables our customers to confidently meet their privacy and compliance obligations,” said Randy Lea, vice president, Teradata products and services. “The efficiency of this data protection solution frees our customers from the technical challenges to focus on the value of business analytics. This data, rich with insight, provides a better understanding of consumers.”

The test included the Defiance® Data Protection System utilizing industry standard strong Advanced Encryption Standard and Teradata 12 on a six-node (12 Intel® Xeon® processors) Teradata 5550 Platform.

Since 2004, Protegrity and Teradata have collaborated on enterprise-level data encryption and management. In June 2005, the companies announced a global partnership to deliver database security for Teradata customers.

ABOUT TERADATA
Teradata Corporation (NYSE: TDC) is the world’s largest company solely focused on raising intelligence through data warehousing and enterprise analytics. Teradata is in more than 60 countries and on the Web at www.teradata.com.

ABOUT PROTEGRITY
Protegrity delivers centralized data security management solutions that protect sensitive information from acquisition to deletion across the enterprise. Protegrity’s customers maintain complete protection over their data and business by employing software and solutions specifically designed to encrypt data, safeguard web applications, and manage and report on security policy.

The company’s singular focus is on developing solutions that protect data. Protegrity employees are security technology specialists with deep expertise in encryption, key management, web application firewalls and security policy in distributed environments. Maximize security with minimal business impact with Protegrity’s Defiance® Suite, the high performance, transparent solution optimized for the dynamic enterprise.

To learn more, visit www.protegrity.com or call 203.326.7200

Written by Guru Kirthigavasan

May 13th, 2008 at 6:43 am

Vertica Moves BI Database to Amazon’s Cloud  

Pay-as-you-go in the Data Warehousing and Business Intelligence space is gaining more traction as Vertica moves its technology to Amazon’s Elastic Compute Cloud infrastructure.

Vertica’s database organizes data by columns, as opposed to rows. The company and others that make columnar databases, such as Sybase and ParAccel, contend the approach is faster and better for BI-related queries because only the desired columns — such as a customer’s name or location — can be read without having to parse through an entire table, saving bandwidth.

The company also sells the database for on-premises use and in appliance form.

It sees a market for the on-demand offering due to a number of scenarios. For example, a company might want to conduct a BI project that will only last a fixed period of time, such as revising its pricing based on competitive and market data, said Andy Ellicott, senior director of marketing.

Hedge funds, which test their trading algorithms against large sets of historical stock market data, are another potential use case, because while such entities manage vast amounts of money, they seek to maintain the lowest possible overhead, which a cloud-based approach can provide, he said.
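The column-versus-row distinction described in the excerpt above can be illustrated with a small sketch. This is a toy model only, not Vertica's actual storage format: the sample records, the function names and the "fields touched" counter are all hypothetical, chosen to show why a query that needs a single column reads less data in a columnar layout.

```python
# Toy illustration of row-oriented vs column-oriented storage.
# All names and data here are hypothetical; this is not any vendor's format.

rows = [
    {"name": "Ann", "location": "Dublin", "balance": 120.0},
    {"name": "Bob", "location": "Boston", "balance": 75.5},
    {"name": "Cho", "location": "Dublin", "balance": 310.25},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


def count_by_location_row_store(records):
    """Row layout: each full record is read even though one field is needed."""
    counts = {}
    fields_touched = 0
    for record in records:
        fields_touched += len(record)  # the whole record is read
        counts[record["location"]] = counts.get(record["location"], 0) + 1
    return counts, fields_touched


def count_by_location_column_store(columns):
    """Column layout: only the 'location' column is read."""
    counts = {}
    fields_touched = 0
    for location in columns["location"]:
        fields_touched += 1  # one value per record
        counts[location] = counts.get(location, 0) + 1
    return counts, fields_touched


columns = to_columns(rows)
row_counts, row_cost = count_by_location_row_store(rows)
col_counts, col_cost = count_by_location_column_store(columns)

print("row store:   ", row_counts, "fields touched:", row_cost)
print("column store:", col_counts, "fields touched:", col_cost)
```

Both layouts return the same answer, but in this toy model the row layout touches every field of every record while the column layout touches only the values of the one column the query asks for. Real columnar engines add compression and indexing on top of this idea, which the sketch does not attempt to model.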

Written by Guru Kirthigavasan

May 11th, 2008 at 9:46 am

A Brief History of ETL - Bill Inmon  

It’s Bill Inmon, the father of data warehousing, talking about the history of ETL. A must read, with lots of vintage value.

On March 15, 1995, Prism Solutions went public on the NASDAQ exchange (PRSM). And then, there were even more competitors that entered the market space. One entrant was Ab Initio. Ab Initio specialized in the movement of very large amounts of data.

In a related space was enterprise application integration (EAI). EAI has many of the capabilities of a data warehouse extract, transform and load (ETL) tool when it comes to moving data about the corporation. However, EAI falls short when it comes to handling transformation and metadata management. Nevertheless, there is some degree of overlap between the worlds of ETL and EAI.

Written by Guru Kirthigavasan

May 8th, 2008 at 6:51 pm

Posted in ETL
