lucidworks: Dark Data

Operational data that is not being actively used is called dark data. The consulting and market research company Gartner Inc. describes it as “information assets that organizations collect, process and store in the course of their regular business activity, but generally fail to use for other purposes.”

Most organizations retain dark data for a variety of reasons, yet they analyze only 1% of their data. Often it is stored for regulatory compliance and record keeping. Some organizations believe that dark data could be useful to them in the future, once they have acquired better analytic and business intelligence technology to process the information.

With the growing accumulation of structured, unstructured and semi-structured data in organizations — increasingly through the adoption of big data applications — dark data has come especially to denote operational data that is left unanalyzed. Such data is seen as an economic opportunity for companies if they can take advantage of it to drive new revenues or reduce internal costs. Some examples of data that is often left dark include server log files that can give clues to website visitor behavior, customer call detail records that can indicate consumer sentiment and mobile geolocation data that can reveal traffic patterns to aid in business planning.

Many companies in the IT sector are looking at creating “cognitive computer systems” that are able to analyze unstructured dark data. IBM Watson is considered to be a future system of this kind: one that could analyze unstructured data and produce meaningful results from the large volume of dark data that is currently very difficult or practically impossible to process. In terms of current systems, IBM has advertised the IBM Spark as a system that “can extract insight from that information almost immediately. This enables businesses to build data rich products and services that use that information to transform the customer experience.”

Companies will begin to trawl the wealth of information contained in paper-based documents, photos, videos, and other corporate assets that are lying dormant in vaults and storage closets but that could be put to use in big data aggregation. These assets can give organizations a more comprehensive view of historical performance trends and product cycles, which can be useful for planning. The data can also provide supporting evidence for trademark infringement or intellectual property violation claims.

Read the full article by Mary Shacklett on why you should devote as much time to dark data as big data.
