Cold Storage Heats Up

June 2nd, 2022 by Tim Sherbak, Enterprise Products and Solutions, Quantum Corp.

Businesses in all sectors – healthcare, transportation, manufacturing, media and more – are generating more unstructured data than ever before. In fact, unstructured data is expected to make up 80-90% of all global data by 2025. Digital transformation of entire industries is at the center of this trend as organizations seek novel ways to provide value and become more competitive through data-driven initiatives. Connected devices, sensors, and cameras, each of which can generate terabytes of data a day, are driving storage needs through the roof. Plus, with the adoption of AI and ML applications, businesses are finding real value in these data sets, which encourages them to collect and preserve even more of this data.

Classically, much of the data organizations retain has been for compliance reasons, particularly in highly regulated industries like financial services, healthcare, telecom, and government, where data must be preserved anywhere from years to decades. These requirements have not gone away but are now dwarfed by the growth of digital data sources. The volume and incessant growth of these new data sources now drive a new set of data management and archival needs. Traditional storage architectures are challenged to maintain performance at scale and simply can’t cope with this growth. And with the scale and active use of this data, cloud-based solutions become expensive quickly.

Evolving object storage architectures and as-a-service offerings
Object storage, the basis of both on-premises and cloud-based archives, has reached mainstream use. Data elements are stored in individual units called objects, each combined with a unique identifier and relevant metadata. Because objects are addressed by identifier rather than by their position in a directory tree, businesses can consolidate all their unstructured data into a single data lake. In contrast to legacy file storage systems and appliances – which rely on a hierarchical structure – object storage offers easily accessible, efficient unstructured data storage at any scale.

Object storage is also well-recognized for its ability to improve data durability, security, and availability. To protect data, object storage systems use erasure coding, a technique that overcomes the limitations of classic RAID-based data protection. In short, it works by splitting each object into multiple pieces, known as shards. Data shards and calculated parity shards are encrypted and distributed across the underlying hardware infrastructure, even across geographies – ensuring that data isn’t lost and remains accessible. Plus, every data access is authenticated, and all data access is logged.
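To make the shard idea concrete, here is a minimal Python sketch. It assumes the simplest possible scheme: a single XOR parity shard, which can rebuild any one lost shard. Production object stores typically use stronger codes (Reed-Solomon, for example) that tolerate several simultaneous losses, and they also encrypt the shards, which this sketch omits. The function names are hypothetical and chosen only for illustration.

```python
from functools import reduce
from typing import Optional


def split_into_shards(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-length data shards, zero-padding the tail."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\x00")
    return [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]


def xor_shards(shards: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length shards."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*shards))


def encode(data: bytes, k: int) -> list[bytes]:
    """Return k data shards followed by one calculated parity shard."""
    shards = split_into_shards(data, k)
    return shards + [xor_shards(shards)]


def recover(shards: list[Optional[bytes]]) -> list[bytes]:
    """Rebuild at most one missing shard (marked None) from the survivors."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can rebuild only one lost shard")
    repaired = list(shards)
    if missing:
        survivors = [s for s in shards if s is not None]
        # XOR of all surviving shards equals the one that was lost.
        repaired[missing[0]] = xor_shards(survivors)
    return repaired


if __name__ == "__main__":
    obj = b"cold storage heats up"
    stored = encode(obj, k=4)          # 4 data shards + 1 parity shard
    damaged = list(stored)
    damaged[2] = None                  # simulate losing one device or site
    restored = recover(damaged)
    print(b"".join(restored[:4])[:len(obj)] == obj)
```

In a real deployment each shard would be placed on a different drive, library, or site, so that a single failure removes at most one shard from the set.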

Scalable object-based storage archives were pioneered by some of the world’s largest cloud solution providers. With emerging architectures and services, a new generation of warm and cold object storage solutions is now deployable within an organization’s own data center or colocation facility. New object storage offerings, featuring RAIL (Redundant Array of Independent Libraries) tape architectures, multi-dimensional erasure coding, and clever disk-based data staging, address the need for both online data accessibility and low-cost long-term storage at petabyte to exabyte scale.

Performance, durability, and storage efficiency
For colder data storage tiers, tape continues to be the preferred medium due to its technology maturity, continued low-cost leadership, ultra-low power consumption, and long-term durability. Once the Holy Grail of large-scale tape deployments, RAIL architectures are now available with multi-dimensional erasure coding technologies. These new erasure coding algorithms are optimized for tape library characteristics (high streaming performance, slow random-access performance), simultaneously maximizing accessibility, performance, durability, and storage efficiency of tape-based data. With superior data protection and durability for long-term retention, these solutions also reduce storage costs versus multi-copy and cloud-based solutions.
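The storage-efficiency argument against multi-copy protection comes down to simple arithmetic, sketched below in Python. The figures are illustrative assumptions only: an 8+4 layout (eight data shards, four parity shards) and a three-copy policy are common textbook examples, not parameters taken from any particular product, and the function names are hypothetical.

```python
def erasure_coded_raw_pb(usable_pb: float, data_shards: int, parity_shards: int) -> float:
    """Raw capacity needed to hold usable_pb when each object is stored
    as data_shards data shards plus parity_shards parity shards."""
    return usable_pb * (data_shards + parity_shards) / data_shards


def replicated_raw_pb(usable_pb: float, copies: int) -> float:
    """Raw capacity needed when every object is stored as full copies."""
    return usable_pb * copies


if __name__ == "__main__":
    usable = 100.0  # petabytes of archive data (assumed)
    ec = erasure_coded_raw_pb(usable, data_shards=8, parity_shards=4)
    rep = replicated_raw_pb(usable, copies=3)
    print(f"8+4 erasure coding: {ec:.0f} PB raw")
    print(f"3 full copies:      {rep:.0f} PB raw")
```

Under these assumptions the erasure-coded layout survives the loss of any four shards while consuming half the raw capacity of three full copies, which would survive the loss of two.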

High-performance, object-based access to tape-based data allows organizations to affordably expand the use of AI and deep learning with simple staging of tape-based data sets for model recalibration, additional analysis, and data re-use. And with in-house access to scalable cloud computing infrastructures such as AWS Outposts, organizations can perform rich analysis on-site, burst to the cloud as necessary, maintain their data within in-house security perimeters, and meet data residency requirements.

Blurring the distinction between warm and cold data
Rather than serving exclusively as an offline resource in offsite facilities, tape is now a fully online, accessible tier in modern object storage platforms. Multi-tiered object storage solutions meet the needs of large enterprises, cloud-scale solution providers, research facilities, and government agencies to retrieve data easily and affordably within minutes, not hours or days, and without data access fees.

Over time, organizations that figure out how to optimally process, analyze, store, and preserve the massive amounts of data they generate will no doubt gain an advantage over their competitors. By ensuring data is available, discoverable, and safe, businesses can move more quickly, extract more value, and identify untapped opportunities.
