"Biz-Integrate" discusses the powers of business process integration and improvement, e-Commerce, and enterprise modernization and collaboration, for solving immediate business challenges and long term strategic goals.


Wednesday, August 11, 2010

The Information Explosion - Why We Hoard Data

The amount of information in the world is growing at an exponential rate, and there are a number of reasons for this. Advances in technology have led to a new generation of digital devices with increased capabilities, which are digitizing information that was previously unavailable. There are also now significantly more people who interact with information. Between 1990 and 2005 more than 1 billion people worldwide entered the middle class - as societies become richer they become more literate, which fuels information growth. In recent years, a number of governments and global enterprises have embarked upon large-scale information-gathering initiatives – for example, Google’s “Books Library” project (to digitize and make searchable every book in the world in all languages), the American Defense Department’s “Total Information Awareness” project (to gather and store personal information on U.S. residents), and the European Union’s “Data Retention Directive” (to collate personal information and all communications activities on EU residents).

  All this data is reshaping our world, economically as well as socially, and there are visible signs that it is already starting to transform commerce, science, government, and everyday life. It has the potential to be for the greater good—as long as governments, consumers, and businesses make educated choices about when to restrict the flow of data, and when to encourage it.
  
  In business, organizations today hoard data primarily for two reasons: compliance and auditing, and visibility and planning. Compliance and auditing ensure proper adherence to government and industry regulations regarding the integrity, accessibility, confidentiality and retention of important data. Depending on whether you are a public, private, or government entity, and upon your industry vertical, this can include Sarbanes-Oxley, HIPAA, FISMA, GLBA, Basel committee initiatives, and even e-Discovery. Compliance and auditing cross over into data management (an industry unto itself), and go beyond the issues of adequate backups, archiving, or disaster preparedness. Regulations often prescribe severe financial and criminal penalties for organizations that fail to meet established standards, forcing many organizations to re-evaluate the way their data is handled and secured. Consequently, data compliance and auditing is a centerpiece of modern data management practices.

  Visibility and planning means providing business leaders with timely, relevant, and quality information that gives them a better understanding of their commercial context, so that they can make intelligent decisions regarding the current and future state of their organization (e.g. predict and respond to opportunities and threats, optimize operations to capitalize on new sources of revenue, and proactively manage risk while ensuring efficiency).

  Common technology-related techniques include business intelligence and business analytics. Business intelligence (querying, reporting, and OLAP) provides insights and tools that address “what and where” - “what happened”, “where exactly is the problem”, and “what actions are needed”. Business analytics (statistical and quantitative analysis, predictive modeling and fact-based management) focuses on trends, predictions and optimizations – “why is this happening”, “what if these trends continue”, “what will happen next” and “what is the best that can happen”. Business intelligence (BI) and business analytics are continuing to evolve. The newest generation of tools and techniques focuses on the spread of predictive analytics, real-time performance monitoring and stream processing technologies, the use of “in-memory” products for faster analysis, the embrace of open source such as the “R” programming language, and the evolution of software-as-a-service for faster deployment.

  A few industry verticals have taken the lead in their ability to collate and exploit data.  For example:

  In 2004, Florida was hit with a series of hurricanes in short succession. After “Charley” (the first of these hurricanes) had passed, Wal-Mart management analyzed Floridian consumer spending patterns immediately prior to the hurricane warning. Although bottled water, batteries and canned goods sold well, the far-and-away number one selling items were found to be beer and strawberry pop-tarts. Wal-Mart used this information to ensure its Florida stores were sufficiently stocked with these items during hurricane season, leading to increased sales and revenue. This year, Wal-Mart and Yahoo are teaming up on a campaign called “365 Days of Mom”. The objective is to interrogate Yahoo’s databases (analyzing searches and clicks against demographics) to give Wal-Mart greater insight into the purchasing mindset of moms - when do they start helping their daughters shop for prom dresses, when do they start shopping for Valentine’s Day gifts, and so on. With the right information, Wal-Mart can operate more efficiently by tapping the right markets at the right time. Also, by analyzing “basket data”, supermarkets are now tailoring promotions to particular customers’ preferences.

  Amazon and Netflix use a statistical technique called collaborative filtering to make recommendations to users based on what other users like. 65% of their recommendations have resulted in sales, which has produced millions of dollars of additional revenue.
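
  To make the idea of collaborative filtering a little more concrete, here is a minimal Python sketch of one common variant (user-based, with cosine similarity). It is not Amazon's or Netflix's actual method; the users, items, ratings, and function names are all hypothetical and exist only for illustration.

```python
from math import sqrt

# Hypothetical ratings (user -> {item: rating}); all names and numbers are made up.
ratings = {
    "ann": {"book_a": 5, "book_b": 4, "book_c": 1},
    "bob": {"book_a": 4, "book_b": 5, "book_d": 5},
    "cat": {"book_c": 5, "book_d": 1, "book_e": 4},
}

def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated, weighting other users' ratings by similarity."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only recommend items the user has not rated yet
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

print(recommend("ann", ratings))
```

  In this toy data set "ann" rates books much as "bob" does, so the item bob rated highly that ann has not seen ranks first for her. Production systems work on vastly larger data and typically use more sophisticated models.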

  eBay monitors listing activity, bidding behavior, pricing trends, search terms and the length of time users look at a page. Lots of searches but few sales for an expensive item may signal unmet demand, so eBay will find a partner to offer sellers insurance to increase listings.
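
  The "lots of searches but few sales" signal can be sketched as a simple ratio check. This is only an illustration of the idea, not eBay's actual logic; the item names, figures, and both thresholds (`min_searches`, `max_conversion`) are assumptions chosen for the example.

```python
# Hypothetical listing statistics; item names and numbers are made up for illustration.
listing_stats = {
    "vintage_watch": {"searches": 12000, "sales": 30},
    "phone_case": {"searches": 9000, "sales": 2100},
    "antique_desk": {"searches": 150, "sales": 0},
}

def unmet_demand(stats, min_searches=1000, max_conversion=0.01):
    """Return items that are searched for often but rarely sold, sorted by name."""
    flagged = []
    for item, s in stats.items():
        if s["searches"] == 0 or s["searches"] < min_searches:
            continue  # too little search interest to draw a conclusion
        conversion = s["sales"] / s["searches"]  # sales per search
        if conversion <= max_conversion:
            flagged.append(item)
    return sorted(flagged)

print(unmet_demand(listing_stats))
```

  Here only the heavily searched, rarely sold item is flagged; the rarely searched item is skipped because there is too little data to say anything about demand.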

  Farecast, a part of Bing, interrogates over 225 billion flight and price records so it can advise customers whether to buy an airline ticket now or wait for the price to come down. The same idea is being extended to hotel rooms and cars. By providing this superior, “value-add” service, Farecast separates itself from the competition and attracts more potential customers to its site, which in turn can generate more revenue.

  By monitoring all purchases made, credit-card companies can now identify fraudulent transactions with a high degree of accuracy, using rules derived by crunching through billions of transactions. For example, stolen credit cards are more likely to be used to buy hard liquor than wine.

  Insurance firms have become better at spotting suspicious claims - fraudulent claims are more likely to be made on a Monday than a Tuesday, since policyholders who stage accidents tend to assemble friends as false witnesses over the weekend.

  Mobile-phone operators analyze subscriber calling patterns to determine whether most of a subscriber's frequent contacts are on a rival network. If that rival network is offering an attractive promotion, the likelihood of that subscriber defecting increases significantly - he or she is then red-flagged to be offered an incentive to stay (and his or her contacts on the rival network are offered incentives to defect).
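
  The churn check described above can be sketched in a few lines of Python. This is an illustrative guess at the logic, not any operator's real system; the function name, the `top_n` cut-off, the "more than half" rule, and the sample data are all assumptions.

```python
from collections import Counter

def should_flag(calls, contact_network, home_network, rivals_with_promo, top_n=5):
    """Flag a subscriber whose frequent contacts are mostly on a rival running a promotion.

    calls: list of contact ids, one entry per call made by the subscriber.
    contact_network: dict mapping contact id -> network name.
    """
    # The subscriber's most frequently called contacts.
    frequent = [contact for contact, _ in Counter(calls).most_common(top_n)]
    if not frequent:
        return False
    # Which network hosts the largest number of those contacts?
    by_network = Counter(contact_network[c] for c in frequent)
    network, count = by_network.most_common(1)[0]
    return (
        network != home_network
        and count > len(frequent) / 2       # a majority of frequent contacts
        and network in rivals_with_promo    # and that rival has an active promotion
    )

# Hypothetical data for one subscriber on "HomeTel".
calls = ["a", "a", "a", "b", "b", "c", "d"]
networks = {"a": "RivalCom", "b": "RivalCom", "c": "HomeTel", "d": "RivalCom"}
print(should_flag(calls, networks, "HomeTel", {"RivalCom"}, top_n=3))
```

  With this sample, two of the three most-called contacts are on the rival network, so the subscriber is flagged only while that rival is running a promotion.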

  In health care the trend is towards “evidence-based medicine”, where not only doctors but also computers get involved in diagnosis and treatment. Aggregated data is mined to spot unwanted drug interactions, identify the most effective treatments and predict the onset of disease before symptoms emerge. The ongoing digitization of records will make it easier to spot and monitor health trends and evaluate the effectiveness of different treatments.

  Whereas traditional “brick and mortar” businesses generally collect information about customers from their purchases or surveys, those doing business over the internet are able to collect the complete “click exhaust” – not only purchases, but what was searched, browsed, clicked, promotions and advertisements that generated interest, and so on. Companies that grasp these new opportunities, or provide the tools for others to do so, are already reaping the rewards of their endeavors. Knowledge is power, and the information management industry already generates $100 billion annually and has a healthy 10% growth rate.


  For all the successes to date related to the data explosion, there have also been many failures. For example, during the recent financial crisis it became clear that banks and rating agencies had been relying on models which, although they required a vast amount of information to be fed in, failed to reflect financial risk in the real world. Another example is the set of well-documented flaws in the systems used to identify potential terrorists.

  In January 2000 the tidal wave of data pouring into the National Security Agency (NSA) brought the system to its knees. The agency was “brain-dead” for over three days. As the NSA’s director stated at the time, “We were dark. Our ability to process information was gone.”

  Also of concern is energy consumption - processing enormous volumes of information takes significant power. In 2006, the NSA came close to exceeding its power supply, which would have blown out its electrical infrastructure. Microsoft, Google, and similar companies have built some of their largest data centers next to hydroelectric plants to ensure access to enough energy at a reasonable price.

  Information and technology are neither good nor bad - it depends on how they are used. The world today contains a torrent of digital information which empowers us to accomplish things that previously could not be done - prevent disease, combat crime, identify business trends, and so on. Managed well, the data explosion can be used to unlock new sources of economic value and provide fresh insights into science, and perhaps even hold governments to account.


