Category Archives: Data Quality

So Much Data, So Little Encryption

If you go solely by top-level stats on encryption use, you’ll come away feeling pretty secure: 86% of the 499 business technology professionals responding to our InformationWeek Analytics State of Encryption Survey employ encryption of some type. But that finding doesn’t begin to tell the real story. Only 14% of respondents say encryption is pervasive in their organizations. Database table-level encryption is in use by just 26%, while just 38% encrypt data on mobile devices. And 31%, more than any other response, characterize the extent of their use as just enough to meet regulatory requirements.

The reasons for this dismal state of affairs range from cost and integration challenges to entrenched organizational resistance exacerbated by a lack of leadership. The compliance focus is particularly galling. Encrypting a subset of data amounts to a “get-out-of-jail-free card” because it may relieve companies from having to notify customers of a breach. But knowingly doing the bare minimum to check a compliance box isn’t security; it’s a cop-out.

From an interesting post.

The Flaws of the Classic Data Warehouse Architecture

The classic data warehouse architecture (CDWA) has served us well for the last twenty years. In fact, up to five years ago we had good reasons to use this architecture. The state of database, ETL, and reporting technology did not really allow us to develop something else. All the tools were aimed at supporting the CDWA. But the question right now is: twenty years later, is this still the right architecture? Is this the best possible architecture we can come up with, especially if we consider the new demands and requirements, and if we look at new technologies available in the market? My answer would be no! To me, we are slowly reaching the end of an era. An era where the CDWA was king. It is time for change. This article is the first in a series on the flaws of the CDWA and on an alternative architecture, one that fits the needs and wishes of most organizations for (hopefully) the next twenty years. Let’s start by describing some of the CDWA flaws.

The first flaw is related to the concept of operational business intelligence. More and more, organizations show interest in supporting operational business intelligence. What this means is that the reports that the decision makers use have to include more up-to-date data. Refreshing the source data once a day is not enough for those users. Decision makers who are quite close to the business processes especially need 100% up-to-date data. But how do you do this? You don’t have to be a technological wizard to understand that, if data has to be copied four or five times from one data storage layer to another, to get from the production databases to the reports, doing this in just a few seconds will become close to impossible. We have to simplify the architecture to be able to support operational business intelligence. Bottom line, what it means is that we have to remove data storage layers and minimize the number of copy steps.
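The copy-step argument can be made concrete with a minimal Python sketch. The layer names and per-step delays below are hypothetical, chosen only for illustration; none of them come from the article, and the model simply assumes the copy steps run one after another.

```python
# Hypothetical copy steps between storage layers and the delay (in seconds)
# each one adds before data becomes visible downstream. All names and
# numbers are illustrative assumptions, not measurements.
CLASSIC_STEPS = [
    ("production -> staging area", 3600),
    ("staging area -> operational data store", 1800),
    ("operational data store -> data warehouse", 3600),
    ("data warehouse -> data mart", 1800),
    ("data mart -> report cache", 600),
]

SIMPLIFIED_STEPS = [
    ("production -> data warehouse", 5),
]


def end_to_end_delay(steps):
    """Return the total delay, assuming the copy steps run sequentially."""
    return sum(delay for _name, delay in steps)


def meets_freshness_target(steps, target_seconds):
    """Check whether data reaches the report within the target delay."""
    return end_to_end_delay(steps) <= target_seconds


if __name__ == "__main__":
    for label, steps in (("classic", CLASSIC_STEPS), ("simplified", SIMPLIFIED_STEPS)):
        delay = end_to_end_delay(steps)
        ok = meets_freshness_target(steps, target_seconds=10)
        print(f"{label}: {len(steps)} copy step(s), {delay} s total, within 10 s: {ok}")
```

Under these assumed numbers, every extra layer adds its own delay to the total, so the only way to approach a freshness target of a few seconds is to cut the number of copy steps, which is the point the excerpt makes.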

Great read from BEye Network. Part 1 and Part 2.

Informatica Positioned In Leaders Quadrant In Data Quality Tools

From an Informatica press release. This is an important win for Informatica.

Informatica Corporation (NASDAQ: INFA), the leading independent provider of data integration software and services, today announced that it has been positioned by Gartner, Inc. in the leaders’ quadrant in the 2008 Magic Quadrant for Data Quality Tools report.

Ted Friedman and Andreas Bitterer, authors of the report, state, “leaders in the market demonstrate strength across a complete range of data quality functionality, including profiling, parsing, standardization, matching, validation and enrichment. They exhibit a clear understanding and vision of where the market is headed, including recognition of noncustomer data quality issues and the delivery of enterprise-level data quality implementations. Leaders have an established market presence, significant size and a multinational presence.”

According to the report, “growth, innovation and volatility (via mergers and acquisitions) continue to shape the market for data quality tools. Investment on the part of buyers and vendors is increasing as organizations recognize the value of these tools in master data management and information governance initiatives.” The complete report, including the quadrant graphic, is available on the Informatica web site at

Getting Started with Unstructured Data

From a TDWI article:

To start down this path, you will obviously need to take a more holistic view of your organization’s information and technology architecture to learn what data is available to your end users. You also need to spend time learning what is missing today from the BI environment. Don’t be surprised if people at first cannot articulate their needs in this arena — most people do not believe current tools can support this analysis!

In conjunction with this internal fact-finding, stay abreast of the evolution of “unstructured” content software and service solutions. Although these concepts have been around for some time, some technological developments have emerged only recently to allow some of the more interesting analysis and integration opportunities in this area.

Finally, keep experimenting! The BI market has grown and matured substantially in the last several years, and this is an exciting new area where we can all stretch and investigate. As famous engineer Richard Buckminster Fuller once quipped, “There is no such thing as a failed experiment — only experiments with unexpected outcomes.”

Datactics Delivers Data Quality Management

From the Press Release

With Unicode, data-agnostic, and grid-enabled computing functionality embedded in its technology, Datactics can resolve complex data quality issues emerging as a result of the global nature of businesses. “Being an innovator in the data quality management arena, Datactics’ applications have evolved to incorporate sophisticated functionality that exceeds the requirements of traditional data cleansing,” says Sarah Bearder, Cofounder of Datactics. “Datactics can effectively manage product data in multiple languages and metrics distributed across various computing resources to increase global purchasing leverage and optimize supply chain operations. Foremost, the deployment of grid-enabled computing means organizations can now manage highly intensive data quality processes by simply adding in-house commodity computers to increase processing speed, which provides a cost-effective means of gaining supercomputing power.”
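As a rough illustration of why Unicode handling matters when cleansing multilingual product data, here is a minimal Python sketch using only the standard library. It is not Datactics' method; the function name `normalize_name` and the sample product names are assumptions made up for this example.

```python
import unicodedata


def normalize_name(raw):
    """Return a comparison key for a product name (hypothetical helper).

    NFKC normalization folds equivalent encodings of the same character
    (for example, a precomposed accent versus a combining accent, or
    full-width versus ASCII letters) into one form. casefold() removes
    case differences, and the split/join collapses runs of whitespace.
    """
    text = unicodedata.normalize("NFKC", raw)
    text = text.casefold()
    return " ".join(text.split())


def same_product(a, b):
    """Treat two names as matching when their normalized keys are equal."""
    return normalize_name(a) == normalize_name(b)


if __name__ == "__main__":
    # The second string spells the accented letter with a combining mark.
    print(same_product("CAFÉ  Filter", "Cafe\u0301 filter"))
```

Exact matching on normalized keys is only a first step. Real matching tools typically add fuzzy comparison and language-specific rules on top of this kind of normalization.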