Monthly Archives: May 2008

Getting Started with Unstructured Data

From TDWI Article -

To start down this path, you will obviously need to take a more holistic view of your organization’s information and technology architecture to learn what data is available to your end users. You also need to spend time learning what is missing today from the BI environment. Don’t be surprised if people at first cannot articulate their needs in this arena — most people do not believe current tools can support this analysis!

In conjunction with this internal fact-finding, stay abreast of the evolution of “unstructured” content software and service solutions. Although these concepts have been around for some time, some technological developments have emerged only recently to allow some of the more interesting analysis and integration opportunities in this area.

Finally, keep experimenting! The BI market has grown and matured substantially in the last several years, and this is an exciting new area where we can all stretch and investigate. As famous engineer Richard Buckminster Fuller once quipped, “There is no such thing as a failed experiment — only experiments with unexpected outcomes.”

Improving data quality without breaking the bank

An interesting article on data quality from B To B Mag -

Today, most corporations are tightening the screws on their corporate cash drawers. In attempts to conserve funds yet still reach stated revenue goals, businesses are turning to their existing customer bases to generate incremental revenues. This is a logical step as the costs to cross-sell additional products and/or services to existing customers are far less than those required to attract new customers. Obviously, this puts tremendous pressure on the accuracy of databases.

As one-to-one relationship marketing becomes more the norm, databases are expanding at exponential rates. In fact, corporate databases double in size roughly every six to nine months. These two factors alone should be sufficient stimulus for businesses to get their databases in order in a cost-effective manner. The average business database contains a staggering amount—15% to 40%—of bad data. That means roughly one in four pieces of marketing material mailed is worthless.
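The figures quoted above are easy to sanity-check. Here is a minimal Python sketch, assuming the midpoint of the quoted 15% to 40% range is a fair summary and that growth is a smooth doubling; the function names and the 12-month horizon are my own illustrative choices, not from the article.

```python
def bad_record_share(low=0.15, high=0.40):
    """Midpoint of the quoted 15%-40% range of bad data."""
    return (low + high) / 2


def size_after(months, start_size, doubling_months):
    """Database size after `months`, assuming it doubles every `doubling_months`."""
    return start_size * 2 ** (months / doubling_months)


midpoint = bad_record_share()
print(f"Midpoint share of bad records: {midpoint:.1%}")

# Growth over one year for the two ends of the quoted doubling period.
for doubling in (6, 9):
    factor = size_after(12, 1.0, doubling)
    print(f"Doubling every {doubling} months: {factor:.2f}x after 12 months")
```

The midpoint lands a little above one in four, which matches the article's "roughly one in four" claim, and a six-month doubling period means a database four times larger after a year.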

Thoughts on Cognos 8.3

I had a chance to watch a Cognos 8.3 demo for evaluation purposes. The Cognos team had created a prototype cube for the demo. Although I worked with Cognos for nearly four years at my previous assignment, this version was completely new to me.

The version I worked with before was Cognos EP Series 7. The latest version is Cognos 8.3, and there have been fundamental changes to the UI as well as the architecture.

Here are some of the key improvements that I think are striking:

- Metadata management has improved significantly. Instead of being stored as a file (catalogs), as in the previous version, metadata is now stored in the database, just as Informatica stores it.

I wish Business Objects would make metadata management as seamless as Cognos does. File-based metadata will eventually hit file size limits, and then it cannot grow any further.

- The client application’s UI has become simpler, which helps speed up report authoring.

- Their web interface is very intuitive. With drill-down and drill-up capabilities, enterprises can now have business analysts run ad hoc queries on the fly.

- Good integration of all the different tools into one big tool. Cognos used to have Impromptu, PowerPlay and a host of other tools. Now they all work as one Cognos tool. This could push the price up, but it gives enterprises a single BI tool and could reduce the cost of licensing several separate tools.

- Though I didn’t dig too much into their scorecarding tools, scorecard portlets are a cool concept.

It looks like Cognos 8.4 is in beta and is expected to hit the market soon.

Breakthroughs in Analytics

CRM Buyer has a series on breakthroughs in the analytics world. A good read. Part 1 and Part 2.

While BI is about quickly obtaining enterprise information, Web analytics encompasses the collection, analysis and reporting of information about user activity on company Web sites. This process of analyzing the behavior of Internet visitors involves the study of the impact of a Web site on its users.

Distinctions between the two should be disappearing fairly quickly, according to Gary Angel, president and CTO of Semphonic, a tool-independent Web and search engine marketing analytics consultancy.

“I think it’s clear that there’s an increasing merger between traditional BI and Web analytics,” wrote Angel on his SemAngel blog. “This is true both in terms of data integration and tools. That’s certainly going to accelerate, and I see no reason why, in three years, the two disciplines will be separate in any meaningful sense. In addition, I think we’ll start to see much more ‘data-driven’ analysis within Web analytics.”

In 2006, IDC estimated the size of the 2005 Web analytics market at US$318 million and projected it to more than double in the ensuing five years. JupiterResearch put the size of the Web analytics market at $463 million in 2006. Today, the market is above $500 million.

This growth has occurred because the Web has become a part of the marketing mix model and is proving its value, according to Jim Sterne, president of the Web Analytics Association (WAA).

“Today, Web analytics tools can do their magic from afar,” Sterne told TechNewsWorld. “Ten years ago, we weren’t thinking about selling Software as a Service. Today’s tools are also much more capable of capturing the growing quantity of data and segmenting visitors to ensure the best possible response to a click. Further, 21st century tools are becoming more integrated with other marketing systems like e-mail, direct mail, telemarketing and in-store sales.”

E-commerce companies often use Web analytics software to measure such concrete details as how many people visited their site, how many of those visitors were unique visitors, how they came to the site (i.e., if they followed a link to get to the site or came there directly), what keywords they searched with on the site’s search engine, how long they stayed on a given page or on the entire site, what links they clicked on and when they left the site. Web analytic software can also be used to monitor whether or not a site’s pages are working properly.

With this information, Web site administrators can determine which areas of the site are popular and which areas of the site do not get traffic. They can then use this data to streamline a site to create a better user experience.
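To make those measurements concrete, here is a minimal Python sketch of how a few of them could be computed from raw page hits. The `Hit` record, its field names and the sample data are all hypothetical, and real Web analytics products work from much richer logs; this only illustrates the counting.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hit:
    """One page view. All fields are hypothetical, for illustration only."""
    visitor_id: str
    page: str
    referrer: Optional[str]  # None means the visitor came directly
    seconds_on_page: int


def summarize(hits):
    """Compute page views, unique visitors, traffic sources and time per page."""
    views_by_page = Counter(h.page for h in hits)
    # Sources are counted per hit here for simplicity, not per visit.
    sources = Counter(h.referrer or "direct" for h in hits)
    seconds_by_page = Counter()
    for h in hits:
        seconds_by_page[h.page] += h.seconds_on_page
    return {
        "page_views": len(hits),
        "unique_visitors": len({h.visitor_id for h in hits}),
        "views_by_page": dict(views_by_page),
        "sources": dict(sources),
        "avg_seconds_by_page": {
            page: seconds_by_page[page] / count
            for page, count in views_by_page.items()
        },
    }


hits = [
    Hit("a", "/home", "search.example", 30),
    Hit("a", "/products", None, 60),
    Hit("b", "/home", None, 10),
    Hit("c", "/home", "search.example", 20),
]
summary = summarize(hits)
for name, value in summary.items():
    print(name, value)
```

Sorting `views_by_page` by count is one simple way to see which areas of a site are popular and which get little traffic.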

Increased Online Advertising Spending


From eMarketer -

For the full year 2007, online ad revenues totaled $21.2 billion, according to the Interactive Advertising Bureau (IAB)/PricewaterhouseCoopers (PwC) “2007 Internet Advertising Revenue Report.” That was 26% higher than 2006, which was itself a record year.

The IAB and PwC said that Q4 2007 Internet advertising revenues reached $5.9 billion, the highest ever for a single quarter and 24% higher than the same period in 2006.

eMarketer predicts that despite continued strength relative to most other media, Internet ad spending growth will drop to about 16% in 2009. This slowdown reflects a combination of the maturing online ad market and overall economic weakness.

There will be a bounceback in 2010 due to a recovering economy and a much larger influx of branding-oriented ad dollars flowing online. One major source of those escalating spends will be video ads, which are relatively expensive and greatly desired.

What your cellphone knows about you – Reality Mining

Here’s a follow-up on Reality Mining and Surprise Modelling, which have been named among the 10 technologies most likely to change the way we live.

Read more from the interview with Sandy Pentland, director of MIT’s Human Dynamics Research program.

What is “reality mining”?

Sandy Pentland: Reality mining is about using sensors to understand human beings. The sensors could be security cameras, they could be devices that you wear on yourself, they could be cell phones. The point is it’s about people. Data mining is about finding patterns in digital stuff. I’m more interested specifically in finding patterns in humans. I’m taking data mining out into the real world.

What kind of reality-mining experiments have you actually performed?

We developed this thing called a sociometer, a little badge that you wear around your neck that records your body language, your motion and your tone of voice–the tone, not the words. It gives us a nice little package for reality mining.

We’ve done all sorts of interesting things with this. Just listening to people’s tones of voice and how they move, we can measure interest level and attention, factors that account for 40% of the variation in the outcomes of things like salary negotiation, dating scenarios, closing a sale, pitching a business plan.

Microsoft and Cloud Computing

Another reason why cloud computing is the in-thing these days. From WP -

Microsoft Corp sees tens of millions of corporate e-mail accounts moving to its data centers over the next five years, shifting to a business model that may thin profit margins but generate more revenue.

In an interview ahead of the Reuters Global Technology, Media and Telecoms Summit, Chris Capossela, who manages Microsoft’s Office products, said the company will see more and more companies abandon their own in-house computer systems and shift to “cloud computing,” a less expensive alternative.

Cloud computing is the trend by Internet powerhouses to array huge numbers of computers in centralized data centers to deliver Web-based applications to far-flung users.

Microsoft built its business selling software to run on local machines, both computer servers and personal computers, but, in recent years, it has invested billions of dollars in massive data centers, which are the basic infrastructure for a wide range of Web services.

The Future of Enterprise Search

An interesting analysis from PC World -

…the search market has fragmented into a few distinct size classes, analysts say: offerings from major vendors like IBM, Oracle and, with its recent acquisition of FAST Search & Transfer, Microsoft; larger independents such as Autonomy; and smaller, specialized vendors.

Arnold recently wrote a nearly 300-page study for Gilbane Group, “Beyond Search,” that takes a deep dive into the facets of the enterprise search market. While in terms of size, search-focused companies are spread among only a handful of categories, they vary widely in terms of their technological focus. These are among the sub-segments Arnold identified:

Database-centric systems, such as Teratext and Intelligenx. “Because of this, these systems are adept at handling data management, content repurposing, and generating reports from the content that reside in the system’s database,” he wrote.

Companies involved in “deep analysis” of content, which include Attensity and Siderean Software. “The use of multiple processes in iterative cascades point to the direction search and content processing is moving. Simple key word indexing is a Model-T Ford to these vendors’ finely tuned machines.”

A browser for analytics

Read more from the Computer World article -

Based on the open-source Gecko browser engine from Mozilla, Strata is designed for people who need to create ad hoc reports from myriad data sources. It has its own scripting language so you can write a script that, say, monitors a Web site for changes to your data. Kirix also produces packaged scripts, or extensions. For example, by September, it will offer one that encrypts data going in and out of Strata.

Currently available for Windows and Linux users (a Mac version is ready but not yet shipping), Strata sells for $249 per seat.
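The article does not show what such a monitoring script looks like, and I have not used Strata's own scripting language. As a rough idea of the general technique only, here is a Python sketch (not Strata code) that detects changes by comparing content fingerprints. The function names and the simulated downloads are assumptions for illustration; a real script would fetch the page on a schedule.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 digest that changes whenever the content changes."""
    return hashlib.sha256(content).hexdigest()


def detect_change(previous, content):
    """Compare new content with the last fingerprint.

    Returns (changed, new_fingerprint). The first check never reports a
    change because there is nothing to compare against yet.
    """
    current = fingerprint(content)
    changed = previous is not None and previous != current
    return changed, current


# Simulated successive downloads of the same page (hypothetical data).
snapshots = [b"sales=100", b"sales=100", b"sales=120"]

last = None
changes = []
for index, body in enumerate(snapshots):
    changed, last = detect_change(last, body)
    if changed:
        changes.append(index)

print("Snapshots where the data changed:", changes)
```

Storing only the digest keeps the check cheap, at the cost of not knowing what changed, only that something did.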

Advance 08 – Advertising Leadership Forum


Microsoft’s Advance08, the advertising conference featuring James Cameron and Bill Gates, starts today in Redmond.

The prime focus is on digital advertising and how the future of digital media will touch people’s lives. The event is also eagerly anticipated because it will be one of Bill Gates’s last public appearances as a full-time Microsoft employee.

The global advertising community’s most influential leaders have come together at advance08 to debate and discuss the factors affecting the future of advertising media:

How is new media reshaping the relationship between brands and consumers?
How do marketers work and communicate with their target audience?
Where is the future of media heading, and what does it look like?

Building on the successful format of SAS, advance08 focuses on the main themes that are reshaping the landscape of digital marketing. Learn more about these topics from prominent figures in the global advertising community, from Michael Eisner and James Cameron to visionary Bill Gates.