The price of disks has been dropping for years, and according to Gartner, so has the cost of disk storage per terabyte. In addition, distributed computing, virtual machines and on-demand storage capacity that can be ramped up or down according to a business's needs have all combined to lower the total cost of ownership ("TCO") for storage. This has led many business executives to believe that the TCO for data storage will continue to decline ad infinitum, allowing them to collect all the data they would like to use to improve performance and drive top-line revenues.
All this would be true if not for several inconvenient truths.
Market research firm IDC estimates that the amount of all digital data created and consumed in 2012 was 2,837 exabytes. (One exabyte equals a million terabytes.) And that number is forecast to double every two years, reaching 40,000 exabytes by 2020.
Meanwhile, ICT Analytics reports that the amount of data being stored is increasing, on average, 45 percent annually. In fact, storage is the fastest growing cost within the enterprise data center.
But, one asks, what about the cloud? Doesn't cloud computing permit businesses to outsource storage to providers at a fraction of the cost of a proprietary data center?
Yes, it does, for some types of data. But it gets complicated for critical data. Data privacy laws vary by industry, by country and sometimes even from state to state. The cloud storage providers' business model typically assumes they can move data freely from jurisdiction to jurisdiction, optimizing server capacity and availability and thereby controlling their own costs. Adding jurisdiction-specific requirements to a hosting contract can often increase the cost significantly.
In practice, with the rapid acceleration of the volume of data generated (all those exabytes produced by the proliferation of sensors, tablets and smartphones) and the concomitant increase in the data that businesses are storing, the total cost of data storage is not (despite conventional wisdom) declining. How could it? Walmart, for example, handles more than a million customer transactions each hour and imports those transactions into a database estimated to contain more than 2.5 petabytes of data.
Do the math.
If a hypothetical company stores one petabyte of data this year, then at 45 percent annual growth it will store 1.45 petabytes next year.
If the cost to store each unit of data drops 15 percent a year while volume grows 45 percent, the total storage bill rises about 23 percent (1.45 × 0.85 ≈ 1.23). Even if unit cost drops 30 percent, at the high end, the bill still edges up about 1.5 percent (1.45 × 0.70 ≈ 1.015). The conventional wisdom that the total cost of storage is declining is wrong. And this simple calculation does not include ancillary storage costs such as staffing; data backup; and confirmation that the data collected are accurate, useful and clean.
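For readers who want to rerun the numbers with their own assumptions, the arithmetic can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the 45 percent growth rate is the ICT Analytics figure cited earlier, and the model deliberately ignores the ancillary costs mentioned above.

```python
def relative_total_cost(volume_growth, unit_cost_decline):
    """Next year's total storage cost as a multiple of this year's.

    volume_growth: fractional growth in stored data (0.45 means +45 percent).
    unit_cost_decline: fractional drop in cost per unit stored (0.15 means -15 percent).
    """
    return (1 + volume_growth) * (1 - unit_cost_decline)


def break_even_decline(volume_growth):
    """Unit-cost decline needed just to keep the total bill flat."""
    return 1 - 1 / (1 + volume_growth)


if __name__ == "__main__":
    growth = 0.45  # assumed annual growth in stored data
    for decline in (0.15, 0.30):
        change = relative_total_cost(growth, decline) - 1
        print(f"unit cost down {decline:.0%}: total cost changes by {change:+.2%}")
    print(f"break-even unit-cost decline: {break_even_decline(growth):.2%}")
```

Under these assumptions, unit costs would have to fall by roughly 31 percent a year just to keep the total bill flat, which is more than even the high-end estimate of 30 percent.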
This growth in storage and its management is placing a growing burden on all businesses — a hidden tax that is ever increasing. However, this is a tax that businesses can do something about. They can delete a significant percentage of their expensive-to-store data.
Unfortunately, while everybody is storing more data, very few are deleting any. Call it data hoarding.
Data Hoarding: Sense and Nonsense
Not all data that businesses collect are useful. Indeed, as the enterprise's haystack of data climbs ever higher, businesses often do not know what data they possess. Much of the information may be, and frequently is, junk. Data analysts waste time working with this junk and finding spurious patterns within it, which hinders the company's decision-making while incurring needless costs.
Why do businesses collect and store more data than they are able to process and use? One reason is Big Data hype and the vague belief that more is better — that somewhere in that ever-growing haystack is a golden needle that will produce new insight and generate additional revenues. This, however, is not a business strategy; it is a business wish.
Another reason businesses store data is fear of the possible legal consequences that may arise from deleting information. U.S. Securities and Exchange Commission regulations, for instance, demand that brokers and dealers retain all client account information for six years and copies of all reports requested or required by regulators for three years. Regulations such as these encourage data hoarding.