The Business Intelligence Blog

Slicing Business Dicing Intelligence

Archive for the ‘Data Warehousing’ Category

Sybase and Sun create the World’s Largest Data Warehouse  

One petabyte of mixed relational and unstructured data. That’s neat.

More from the Sybase Press Release -

Sybase, Inc. (NYSE: SY), the largest enterprise software and services company exclusively focused on managing and mobilizing information, today announced that the Sybase® IQ analytics server has set a new Guinness World Record™ by powering the world’s largest data warehouse on a Sun™ SPARC® Enterprise M9000 server. This accomplishment was achieved using Sybase IQ, BMMsoft ServerSM and the Sun Microsystems® Data Warehouse Reference Architecture. This winning combination enables more data to be stored in less space, searched and analyzed in less time, while consuming 91 percent less energy and generating less heat and carbon dioxide than conventional solutions.

Powered by the category-leading column-oriented database Sybase IQ, the data warehouse is certified to support a record-breaking one petabyte of mixed relational and unstructured data, more than 34 times larger than the largest industry standard benchmark and twice the size of the largest commercial data warehouse known to date. In total, the data warehouse contains six trillion rows of transactional data and more than 185 million content-searchable documents, such as emails, reports, spreadsheets and other multimedia objects.
..
..
Designed from the ground up as an analytics server, Sybase IQ produces its impressive results because of a unique architecture combining a column-oriented data structure with patented indexing and a scalable grid. Sybase IQ offers extraordinarily high performance at a lower cost than a traditional, row-oriented database. And, unlike traditional row-based data warehouses, the stored data in Sybase IQ is compressed by up to 70 percent of its input size, creating the most optimal and elegant analytics solutions.

“The results of this benchmark showcase Sybase IQ’s capabilities to handle real-world scenarios, querying vast amounts of data representing the transactions processed across the worldwide financial trading networks over multiple years,” said Francois Raab, president, InfoSizing, the consulting firm that oversaw the benchmarking of the record. “Sybase IQ has proven its production strength in handling the volume of multimedia documents representative of the electronic communication between half a million financial traders.”
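The release credits the column-oriented layout for much of the compression but does not describe Sybase IQ's actual encoding or its patented indexing. As a rough illustration of why storing values column by column tends to compress well, here is a minimal sketch using plain run-length encoding, which is a generic technique swapped in for illustration. The trade records and function names are hypothetical.

```python
from itertools import groupby

# Hypothetical trade records (exchange, ticker, quantity); not benchmark data.
rows = [
    ("NYSE", "SY", 100),
    ("NYSE", "SY", 250),
    ("NYSE", "TDC", 75),
    ("NYSE", "TDC", 75),
    ("NASDAQ", "JAVA", 300),
    ("NASDAQ", "JAVA", 300),
]


def to_columns(records):
    """Pivot row tuples into one list per column."""
    return [list(col) for col in zip(*records)]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]


columns = to_columns(rows)
encoded = [run_length_encode(col) for col in columns]

for name, col, enc in zip(("exchange", "ticker", "quantity"), columns, encoded):
    print(f"{name}: {len(col)} values stored as {len(enc)} runs")
```

In a row layout the same values are interleaved with the other fields of each record, so long runs like these rarely appear. Grouping a column's values together is what makes the repetition visible to the encoder.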

Teradata Announces New Family of Analytical Platforms  

Teradata Corporation has introduced a new family of platforms from entry-level to active enterprise data warehouses that addresses many customer needs (especially of Indian and other similar markets). Powered by the Teradata 12.0 database engine, the family includes Teradata 550 SMP (symmetric multiprocessing), Teradata 2500, and Teradata 5550.

Teradata 550 SMP is a departmental data warehouse, designed to meet customers’ need for a smaller, less expensive system. It is simple to set up and can use the Novell SUSE Linux 64-bit operating system or Windows. In addition, customers can license and run the Teradata 12 database on their choice of Intel-based platforms, starting at $40,000. Teradata Express Edition, a free developer version of Teradata 12, is available to work on Windows servers and laptops for development, testing and learning.

From EFY News article.

The article has

2 responses

Written by Guru Kirthigavasan

May 15th, 2008 at 6:23 am

Data Warehousing on a Shoestring Budget  

TDWI is running a three-part series on developing and deploying a data warehouse frugally. Read Parts 1 and 2.

Although seemingly difficult, you can make choices, which allow for the beneficial realization of data warehousing while also minimizing costs. By balancing technology and carefully positioning your business, your organization can quickly create cost-effective solutions using data warehousing technologies.

There are a few simple rules to help you develop a data warehouse on a shoestring budget:

* Use what you have
* Use what you know
* Use what is free
* Buy only what you have to
* Think small and build in phases
* Use each phase to finance or justify the remainder of the projects

It’s also a must-read for businesses that have ample business sponsorship and enormous resources. Tough times in the marketplace like these call for an economical way of staying ahead of the business curve. And that’s exactly the point of this series.

I like Nathan Rawling’s detailed approach to this topic.

The article has

no responses yet

Written by Guru Kirthigavasan

May 13th, 2008 at 9:24 pm

Protegrity & Teradata announce DW encryption performance  

Press Release – Protegrity and Teradata today announce unmatched data warehouse encryption performance

Protegrity Corporation, the leading provider of Data Security Management solutions, and Teradata, the global leader in Enterprise Data Warehousing today announced new cryptography performance of over 6 million decryptions and over 9 million encryptions per second, enabling customers to maximize data protection while minimizing impact on business operations.

“Our research shows that businesses worry that data protection may impact application performance. These test results show that the partnership between Teradata and Protegrity delivers proven enterprise solutions that help customers protect data and achieve regulatory compliance with industry-leading system performance,” said Gordon Rapkin, president and chief executive officer of Protegrity.

The Protegrity Defiance® DPS uses Teradata User Defined Functions (UDFs) to embed encryption/decryption functionality in the database. Teradata’s high performance UDF implementation and parallel architecture provides highly efficient execution, and then scales that performance linearly as the system grows in size.

“High performance data protection enables our customers to confidently meet their privacy and compliance obligations,” said Randy Lea, vice president, Teradata products and services. “The efficiency of this data protection solution frees our customers from the technical challenges to focus on the value of business analytics. This data, rich with insight, provides a better understanding of consumers.”

The test included the Defiance® Data Protection System utilizing industry standard strong Advanced Encryption Standard and Teradata 12 on a six-node (12 Intel® Xeon® processors) Teradata 5550 Platform.
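To put the headline numbers in perspective, here is a small back-of-the-envelope sketch. The totals and the node and processor counts come from the release. The even split across nodes and processors and the one-billion-value workload are assumptions made for illustration, and since the release says "over" 6 and 9 million, the results are lower bounds.

```python
# Figures quoted in the release ("over" these values, so treat them as lower bounds).
DECRYPTIONS_PER_SEC = 6_000_000
ENCRYPTIONS_PER_SEC = 9_000_000
NODES = 6
PROCESSORS = 12


def per_unit(total_per_sec, units):
    """Split a system-wide rate evenly across identical units (an assumption)."""
    return total_per_sec / units


def seconds_to_process(values, rate_per_sec):
    """Time to push a number of values through at a steady rate."""
    return values / rate_per_sec


decrypt_per_node = per_unit(DECRYPTIONS_PER_SEC, NODES)
decrypt_per_cpu = per_unit(DECRYPTIONS_PER_SEC, PROCESSORS)
encrypt_per_node = per_unit(ENCRYPTIONS_PER_SEC, NODES)
encrypt_per_cpu = per_unit(ENCRYPTIONS_PER_SEC, PROCESSORS)

# Hypothetical workload: decrypting one billion protected column values.
billion_value_scan = seconds_to_process(1_000_000_000, DECRYPTIONS_PER_SEC)

print(f"decrypt: {decrypt_per_node:,.0f}/s per node, {decrypt_per_cpu:,.0f}/s per processor")
print(f"encrypt: {encrypt_per_node:,.0f}/s per node, {encrypt_per_cpu:,.0f}/s per processor")
print(f"1 billion decryptions: about {billion_value_scan:.0f} seconds")
```

Under those assumptions each node handles at least a million decryptions per second, and a billion-value scan pays under three minutes of decryption cost on the tested configuration.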

Since 2004, Protegrity and Teradata have collaborated on enterprise-level data encryption and management. In June 2005, the companies announced a global partnership to deliver database security for Teradata customers.

ABOUT TERADATA
Teradata Corporation (NYSE: TDC) is the world’s largest company solely focused on raising intelligence through data warehousing and enterprise analytics. Teradata is in more than 60 countries and on the Web at www.teradata.com.

ABOUT PROTEGRITY
Protegrity delivers centralized data security management solutions that protect sensitive information from acquisition to deletion across the enterprise. Protegrity’s customers maintain complete protection over their data and business by employing software and solutions specifically designed to encrypt data, safeguard web applications, and manage and report on security policy.

The company’s singular focus is on developing solutions that protect data. Protegrity employees are security technology specialists with deep expertise in encryption, key management, web application firewalls and security policy in distributed environments. Maximize security with minimal business impact with Protegrity’s Defiance® Suite, the high performance, transparent solution optimized for the dynamic enterprise.

To learn more, visit www.protegrity.com or call 203.326.7200

The article has

one response

Written by Guru Kirthigavasan

May 13th, 2008 at 6:43 am

Are you the Best-in-Class company on “Time-to-Information”?  

That’s really great news for the BI world during these dull days of business. I’m sure other companies will start gaining confidence in the never-understood phenomenon of ROI. Makes my day!

According to research presented in a new report, “Data Management for Business Intelligence,” 77% of Best-in-Class companies are able to automate the integration of data from multiple sources, compared to 54% of Industry Average companies, and only 22% of Laggards. This capability was identified by respondents as being critical for solving the top business pressure — the need to reduce the time-to-information for non-technical end-users.

Best-in-Class companies are making investments in technology enablers to alleviate this pressure. 82% of Best-in-Class companies are currently utilizing data warehousing software solutions, versus 56% of all other respondents. 80% of Best-in-Class companies are currently deploying Business Intelligence query and reporting tools, versus 47% of all other respondents. Finally, 68% of Best-in-Class companies are currently implementing data warehouse appliance technology (packaged hardware and software solutions) versus only half that number (34%) of all other respondents.

The article has

one response

Written by Guru Kirthigavasan

April 1st, 2008 at 9:51 am

TDWI World Conference – May 11-16, 2008  

The Data Warehousing Institute’s World Conference is happening in Chicago from May 11-16, 2008. For more information on registration, see here. Here’s a PDF brochure.

The article has

no responses yet

Written by Guru Kirthigavasan

March 24th, 2008 at 6:58 pm

@ctive Data Warehousing aka Closed Loop Processing  

Active Data Warehousing isn’t really a new buzzword. It’s been in the industry for a while, thanks to Teradata, who made it popular. They called it @ctive Data Warehousing and branded the spelling.

The reason for bringing this term back is this news -

Teradata Corporation (NYSE: TDC), the global leader in enterprise data warehousing, announced today that Highmark Inc., the largest health insurer in Pennsylvania and one of the largest in the U.S., continues to expand its Teradata Warehouse and its multiple-terabyte information assets. The large-scale analytics expansion increases the company’s production environment and supports the shift to active data warehousing (ADW).

For newbies, here’s more about Active Data Warehousing from DM Review of April 2004 (yep, it’s four years old) -

Active data warehousing is a process, not a specific technology. Teradata has popularized the term “active data warehousing,” tried to brand the spelling “@ctive data warehousing” and deserves credit for providing examples of some big, successful active data warehouses. However, if a more generic term is preferred, then “closed-loop processing” is a useful synonym, though it only partially captures the concept. Your data warehouse (DW) is active if

It represents a single, canonical state of the business (version of the truth). Too often, companies put data into a data warehouse and also store it in a plethora of other data stores. If a data warehouse must be match-merged with dependent data marts to provide needed information, then it is a potentially useful data store, but it is not active.

It supports a mixed workload. The workload of an active data warehouse will typically consist of tactical inquiries executing concurrently with complex business intelligence (BI) queries and trickle updates. If the DW is used only for operational queries such as customer transactions or product inventory, it is not active.

Operational processing is driven by the DW. Active data warehouses do not exist in a vacuum. They exist in a processing loop. The “outbound” activity goes from the data warehouse to the operational system by means of automated system mechanisms including triggers, special purpose programming interfaces, a message broker and an extract, transform and load (ETL) tool – though the ETL tool is not often used for outbound activities. If the data warehouse doesn’t deliver information automatically to operational systems, then it is not active. Manual intervention gets the job done, but the DW is not active.

It represents a closed-loop process. In particular, the data warehouse is used to optimize processing in the upstream operational or transactional system. The operational systems feed the data warehouse which, in turn, feeds back to the operational system to optimize the relevant transactional processing. The interfaces go in both directions. The data warehouse provides operational intelligence and, as active, can properly be described as driving operational processing.
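The four criteria above read like a checklist, so here is a minimal sketch that treats them as one. The class and field names are hypothetical labels for the DM Review criteria, not part of any Teradata product or API, and the example profile is made up.

```python
from dataclasses import dataclass


@dataclass
class WarehouseProfile:
    """Yes/no answers to the four 'is it active?' questions (hypothetical names)."""
    single_version_of_truth: bool   # no match-merging with dependent data marts
    mixed_workload: bool            # tactical queries, BI queries and trickle updates together
    automated_outbound_feed: bool   # delivers to operational systems without manual steps
    closed_loop: bool               # feeds back to optimize upstream transactional processing


def failed_criteria(profile):
    """Return the names of the criteria the warehouse does not meet."""
    return [name for name, met in vars(profile).items() if not met]


def is_active(profile):
    """A warehouse counts as active only if every criterion is met."""
    return not failed_criteria(profile)


# Hypothetical example: reports are pushed to operations by hand, so the loop is not closed.
reporting_dw = WarehouseProfile(
    single_version_of_truth=True,
    mixed_workload=True,
    automated_outbound_feed=False,
    closed_loop=False,
)

print(is_active(reporting_dw))
print(failed_criteria(reporting_dw))
```

The all-or-nothing check mirrors the article's wording, where failing any single condition means the warehouse "is not active".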

The article has

2 responses

Written by Guru Kirthigavasan

February 21st, 2008 at 7:00 am

Data Modelling – Pros and Cons  

How to insulate an organization against change – Data Modelling. Read more on its Pros and Cons.

Industry experts concur — to a degree. For one thing, says veteran data warehouse architect Mark Madsen, a principal with consultancy Third Nature and author of Clickstream Data Warehousing, what proponents such as Kalido and Sybase mean by “judicious” use of a data modeling tool takes an awful lot for granted.

“It presupposes that if you have all of your systems’ data models in a tool, then changes that are imposed will be easy,” he says, citing system changes, upgrades, and merger/acquisition activity that brings new systems into the fold as three among many common disruptions. “That’s like saying that because you have a map of outer Mongolia, a trip from one side to the other will be a simple matter of driving.”

Madsen isn’t entirely dismissive, just skeptical. “I accept that having the data models together and linked will help things like compliance efforts. For example, Visa [requires] that you control access to all databases that contain credit card numbers,” he acknowledges, “but that also presupposes that the models are somehow kept up to date, something that is generally pretty unlikely and usually a manual process.”

The article has

no responses yet

Written by Guru Kirthigavasan

February 15th, 2008 at 6:39 am

Extract, Transform and Load for Data Warehousing  

How else can you title the post when the father of data warehousing himself, Bill Inmon, writes a nostalgic note on ETL?

When data warehousing and ETL first appeared on the scene, the coders of the world felt threatened. In case after case, the coders of the world went and found their most complicated, most arcane, most convoluted program that was needed for transformation and threw that program at the ETL vendor. When the ETL technology threw up, the programmers said – “See, we can’t use ETL here – it can’t handle the XYZ program.”

Fast forward to today, and there is ETL everywhere. It is found in almost every shop that has a data warehouse. No one thinks twice about bringing in ETL technology. The drudgery of writing and maintaining transformation code has mercifully been shifted to automation.

What happened when programmers were protecting their own turf? The programmers were selecting the most impossible example of code and transformation to use as a basis for selecting or not selecting ETL processing. Behind the one difficult program are oodles of easy and much more normal transformations. Selecting the most difficult program in the company as a basis for transformations is like selecting Yao Ming as a representative of Chinese people. Yao Ming plays in the NBA and is 7’5″ tall. To draw the conclusion that most Chinese are 7’5″ tall is to make a serious error in judgment because even though Yao Ming is Chinese and is tall, not all Chinese are nearly that tall. For that matter, neither are the Americans, the Russians, the English or anyone else. But that is exactly what the early programmers did in order to keep ETL out of the shop.

The article has

one response

Written by Guru Kirthigavasan

February 5th, 2008 at 8:02 am

IBM Data Warehouse Suite Expands with Optim Technology  

From Marketwire -

IBM (NYSE: IBM) today introduced new software that allows businesses to better manage the growth of information by integrating data warehousing and archiving capabilities. The new InfoSphere Warehouse with Optim Data Retention will help companies better manage enterprise data throughout its lifecycle and deliver optimal performance and compliance with changing business policies.

InfoSphere Warehouse combines the strength of IBM data warehouse software and recently acquired Princeton Softech’s Optim Data Retention solution to more efficiently manage large amounts of data to enhance users’ competitive advantage.

Here’s more on IBM InfoSphere™ Warehouse with Optim Data Retention.

The article has

no responses yet

Written by Guru Kirthigavasan

February 5th, 2008 at 6:35 am