The Business Intelligence Blog

Slicing Business, Dicing Intelligence

Big Data – 5 low-profile startups  

Odiago is the brainchild of Hadoop and analytics experts Christophe Bisciglia and Aaron Kimball, and aims to improve the state of web analytics. Its first product, Wibidata, which is in private beta, lets websites better analyze their user data to build more-targeted features. It’s built atop Hadoop and HBase, but also plugs into companies’ existing data-management and BI tools. Current customers include Wikia, RichRelevance, FoneDoktor and Atlassian (with whom it shares office space).
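
Wibidata’s own API is in private beta and not public, but as a rough sketch of the pattern it sits on, here is how per-user event data might be written to and read back from HBase using the happybase Python client. The table and column names are invented for illustration.

```python
# Hypothetical sketch: per-user event storage on HBase, the kind of layout a
# user-centric analytics product might use. Requires a running HBase Thrift
# server; all table/column names are invented.
import happybase

connection = happybase.Connection('localhost')  # HBase Thrift endpoint
table = connection.table('user_events')

# One row per user; columns hold the latest events per feature.
table.put(b'user:42', {
    b'events:last_page': b'/pricing',
    b'events:last_search': b'hadoop hosting',
})

# Read the row back to drive a targeted feature (e.g., recommendations).
print(table.row(b'user:42'))
```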

- More on the 5 startups working on Big Data.

The article has no responses yet

Written by Guru Kirthigavasan

February 6th, 2012 at 11:48 am

Posted in Big Data

Privacy in the Age of Big Data  

In a well-researched article, Privacy in the Age of Big Data, Prof. Omer Tene and Jules Polonetsky raise some really interesting arguments about privacy in the big data age we now live in. A good read if you are dealing with big data.

The harvesting of large data sets and the use of analytics clearly implicate privacy concerns. The tasks of ensuring data security and protecting privacy become harder as information is multiplied and shared ever more widely around the world. Information regarding individuals’ health, location, electricity use, and online activity is exposed to scrutiny, raising concerns about profiling, discrimination, exclusion, and loss of control. Traditionally, organizations used various methods of de-identification (anonymization, pseudonymization, encryption, key-coding, data sharding) to distance data from real identities and allow analysis to proceed while at the same time containing privacy concerns. Over the past few years, however, computer scientists have repeatedly shown that even anonymized data can often be re-identified and attributed to specific individuals.[7] In an influential law review article, Paul Ohm observed that “[r]eidentification science disrupts the privacy policy landscape by undermining the faith that we have placed in anonymization.”[8] The implications for government and businesses can be stark, given that de-identification has become a key component of numerous business models, most notably in the contexts of health data (regarding clinical trials, for example), online behavioral advertising, and cloud computing.
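
To make the de-identification point concrete, here is a minimal Python sketch of one technique the authors list, key-coding (keyed pseudonymization); the key and identifier are illustrative. It also hints at why plain anonymization by hashing falls to re-identification: an unkeyed hash can be reversed by hashing every plausible identifier and matching.

```python
import hashlib
import hmac

# Illustrative only: the key must be stored separately from the data set.
SECRET_KEY = b'held-apart-from-the-data'

def pseudonymize(identifier: str) -> str:
    """Key-coded pseudonym: stable (so records can still be joined and
    analyzed) but not reversible without the key."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

# By contrast, a plain hash like hashlib.sha256(b'alice@example.com') can be
# matched by hashing candidate emails or phone numbers (a dictionary attack).
print(pseudonymize('alice@example.com'))
```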

The article has no responses yet

Written by Guru Kirthigavasan

February 4th, 2012 at 11:45 pm

Microsoft names SQL Server 2012 launch date  

Microsoft’s SQL Server 2012 will be officially launched on March 7. More from All about Microsoft.

Microsoft Server and Tools chief Satya Nadella revealed last fall that SQL Server 2012 (codenamed “Denali”) would launch in the early part of 2012. Microsoft delivered the final public test build of SQL Server 2012 in November 2011.

The March 7 launch event topic list includes everything from big data, to StreamInsight complex event processing, to the new data-visualization and analysis tools that are part of the SQL Server 2012 release.

Disclosure: I work at Microsoft.

The article has one response

Written by Guru Kirthigavasan

January 24th, 2012 at 7:41 am

101 – Data Mining and Predictive Analytics  

In today’s world of mining text, Web, and media content (unstructured data) alongside structured data, “information mining” is the more appropriate label. By mining a combination of these, companies can make the best use of structured data, unstructured text, and social media. The static, stagnant predictive models of the past don’t work well in the world we live in today. Predictive analytics should be agile enough to adapt to, and monetize, quickly changing customer behaviors, which are often identified online and through social networks.

Better integration of data mining software with the source data at one end and with the information-consumption software at the other has brought predictive analytics closer to day-to-day business. Even though there haven’t been significant advancements in predictive algorithms, the ability to apply large data sets to models and to enable better interaction with the business has improved the overall outcome of the exercise.
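
As a minimal sketch of the train-and-rescore loop described above, here is a scikit-learn example; the customer features and labels are invented for illustration.

```python
# Minimal predictive-analytics loop: fit a model on historical behavior,
# then re-score as fresh behavior arrives. Features are hypothetical.
from sklearn.linear_model import LogisticRegression

# [visits_last_week, social_mentions, purchases] per customer
X = [[3, 0, 1], [10, 4, 0], [1, 0, 0], [8, 2, 3]]
y = [0, 1, 0, 1]  # e.g., 1 = responded to an offer

model = LogisticRegression().fit(X, y)

# Agility, in practice: periodically refit on recent data and re-score.
print(model.predict_proba([[5, 1, 1]]))  # probability for a new customer
```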

There is a great introduction to the world of data mining and predictive analytics here.

The article has no responses yet

Written by Guru Kirthigavasan

January 24th, 2012 at 7:31 am

Is there sunshine ahead for cloud computing?  

The global cloud-computing market is expected to reach $241 billion in 2020, up from $41 billion in 2010, according to Forrester Research. That long-term potential is reflected in the highflying stocks of companies actively involved in the concept.
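
For context, those two figures imply roughly 19% compound annual growth; a quick check:

```python
# Implied compound annual growth rate behind Forrester's 2010 -> 2020 forecast.
start, end, years = 41e9, 241e9, 10
cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.1%}")  # -> 19.4%
```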

A stumbling block, however, is concern over the security of data when a client firm can no longer control it on its own premises. Hackers and crashed systems are, after all, among a company’s worst nightmares.

And while the cloud is a definite boon to smaller firms, more established companies have already made significant investments in equipment and staffing. There is also confusion over what cloud computing really is and who provides it.

The field’s successful pioneer is Salesforce.com Inc., a well-managed company that over the last decade effectively introduced this cost-saving business model. It offered a monthly subscription service that allowed firms to simply go to their Web browsers, point to salesforce.com and begin using it. That turned out to be a good financial deal for its clients as well as for its shareholders.

Good Read…

The article has 2 responses

Written by Guru Kirthigavasan

July 31st, 2011 at 6:20 am

The Jargon of the Novel, Computed  

Scholars in the growing field of digital humanities can tackle this question by analyzing enormous numbers of texts at once. When books and other written documents are gathered into an electronic corpus, one “subcorpus” can be compared with another: all the digitized fiction, for instance, can be stacked up against other genres of writing, like news reports, academic papers or blog posts.

One such research enterprise is the Corpus of Contemporary American English, or COCA, which brings together 425 million words of text from the past two decades, with equally large samples drawn from fiction, popular magazines, newspapers, academic texts and transcripts of spoken English. The fiction samples cover short stories and plays in literary magazines, along with the first chapters of hundreds of novels from major publishers. The compiler of COCA, Mark Davies at Brigham Young University, has designed a freely available online interface that can respond to queries about how contemporary language is used. Even grammatical questions are fair game, since every word in the corpus has been tagged with a part of speech.
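
As a toy version of the subcorpus comparison COCA supports, here is a words-per-thousand frequency check in Python; the two “corpora” below are invented samples, not COCA data.

```python
# Toy subcorpus comparison: how often does a word occur per 1,000 words in
# a "fiction" sample versus a "news" sample? Samples are invented.
from collections import Counter

fiction = "he mumbled and shuffled toward the door then he mumbled again".split()
news = "officials said the report said the findings were conclusive".split()

def per_thousand(word: str, corpus: list) -> float:
    return Counter(corpus)[word] / len(corpus) * 1000

for word in ("mumbled", "said"):
    print(word, per_thousand(word, fiction), per_thousand(word, news))
```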

More…

The article has one response

Written by Guru Kirthigavasan

July 31st, 2011 at 6:18 am

Microsoft Gives the Cloud to Scientists  

More in NYTimes

The software maker has started grafting popular scientific databases and analysis tools onto its Windows Azure cloud computing service. This basically means that researchers in various fields can get access to a fast supercomputer of their very own and pose queries to enormous data sets that Microsoft keeps up to date. For the time being, Microsoft will allow some research groups to perform their work free, while others will have to rent calculation time on Azure via a credit card.

These moves have turned Somsak Phattarasukol, a graduate student at the University of Washington in Seattle, into a big fan of Microsoft.

Mr. Phattarasukol, like many researchers, is accustomed to waiting in line for access to large, public computers and to twiddling his thumbs – sometimes for days – as the machines work on his requests. It’s a frustrating process only made worse as the databases the researchers deal with swell alongside the time it takes to perform the analysis.

Microsoft officially opened access to the scientific bits of Azure this week, but Mr. Phattarasukol got early access to the system. He’s part of a team that’s trying to create a biofuel from bacteria that produce hydrogen gas. The work has required the research team to compare the makeup of various bacterium strains against an extensive protein database, as they try to figure out which bits of genetic code can prompt higher hydrogen gas production.
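
The strain-versus-protein-database comparison described here is the kind of sequence search tools like BLAST perform. As a toy stand-in for the idea, the sketch below scores database proteins by k-mer overlap with a query; the sequences are invented fragments, not real proteins.

```python
# Toy stand-in for BLAST-style protein search: score each database protein
# by how many length-3 substrings (k-mers) it shares with a query sequence.
def kmers(seq: str, k: int = 3) -> set:
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

database = {  # invented protein fragments
    "hydrogenase_like_A": "MKVALIGAGHIGSEVA",
    "unrelated_kinase_B": "MSTNPKPQRKTKRNTN",
}
query = "VALIGAGHI"

for name, seq in database.items():
    print(name, len(kmers(query) & kmers(seq)))  # higher = more similar
```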

The article has 7 responses

Written by Guru Kirthigavasan

November 18th, 2010 at 5:32 am

6 Security ‘Must Haves’ For Cloud Computing  

According to Gartner, to achieve effective and safe private cloud computing deployments, security as it exists in virtualized data centers needs to evolve and become independent of the physical infrastructure: servers, Internet Protocol (IP) addresses, Media Access Control (MAC) addresses, and more.

However, it must not be bolted on as an afterthought as companies move from enterprise deployments to virtualized data centers to private/public clouds.

While the basic components of security in information management remain the same — ensuring the confidentiality, integrity, authenticity, access and audit of information and workloads — a new, integrated approach to security will be required.
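
One concrete reading of security that is independent of the physical infrastructure is to attach integrity and authenticity metadata to the workload itself rather than to a host. A hedged sketch, with key handling simplified and all names invented:

```python
# Sketch: a signed manifest that travels with a workload image, so integrity
# and authenticity can be verified on whichever host runs it. Illustrative
# only; real deployments would use proper key management and signatures.
import hashlib
import hmac
import json

SIGNING_KEY = b'held-by-the-security-team'  # hypothetical

def sign_manifest(image_bytes: bytes, owner: str) -> dict:
    manifest = {
        "sha256": hashlib.sha256(image_bytes).hexdigest(),  # integrity
        "owner": owner,  # supports access control and audit
    }
    payload = json.dumps(manifest, sort_keys=True).encode()
    manifest["signature"] = hmac.new(SIGNING_KEY, payload,
                                     hashlib.sha256).hexdigest()  # authenticity
    return manifest

print(sign_manifest(b"...workload image bytes...", "bi-team"))
```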

More from CMSWire

The article has one response

Written by Guru Kirthigavasan

November 11th, 2010 at 7:30 am

Microsoft Unveils Database Products at PASS Conference  

Microsoft released the first community technology preview (CTP) for the next-generation version of SQL Server, codenamed Denali, Nov. 9. But that is just one of several announcements to come out of the PASS Summit 2010 conference in Seattle this week. In addition to unveiling Denali, Microsoft also announced the release of SQL Server 2008 R2 Parallel Data Warehouse and the new Critical Advantage Program, which offers an end-to-end suite of pretested hardware and software configurations, services and support.

“SQL Server code-named Denali will help empower organizations to be more agile in today’s competitive market,” the SQL Server Team touted on its blog. “Customers will be able to efficiently deliver mission-critical solutions through a highly scalable and available platform. Industry-leading tools will help developers quickly build innovative applications while data integration and management tools help deliver credible data reliably to the right users and new user experiences expand the reach of BI to enable meaningful insights.”

More on eWEEK

In Interview – Consider Cloud Hosting Your Business Intelligence

“Jaspersoft’s experience with more than 100 successful cloud BI deployments has made us realize that a partnership-based, best-of-breed approach to cloud BI is the best way to go. BI as a service through on-demand SaaS deployments is generally a singular offering that is overstretched, offers limited flexibility, and typically needs to be built from the ground up, resulting in costly downtime and high implementation costs. One of the best practices we’ve established from our multiple launches is that customers need a cloud-hosting-enhanced BI solution with a lean framework. Jaspersoft’s lean architecture, based on web-based open standards and coupled with experts in cloud management and BI consulting, results in a proven solution that can meet a myriad of business needs.”

More from an interview with Karl Van den Bergh, vice president of product strategy at Jaspersoft.

The article has 4 responses

Written by Guru Kirthigavasan

November 3rd, 2010 at 11:40 pm