2021 Data Storage and Active Archive Predictions and Trends

January 14th, 2021 by Meredith Bagnulo

Members of the Active Archive Alliance recently shared their 2021 predictions for data storage and active archives. Here are some of the top trends to watch:

Active Archive Requirements to Scale from Terabytes to Petabytes
Enterprises continue to create data at increasingly rapid rates due, in part, to the use of higher-resolution video and graphics, along with significant growth in applications using the “Internet of Things” and Artificial Intelligence. Much of this data is used extensively for relatively short periods of time, for example in repeat runs when new methods become available and the data analysis needs to be updated. These data sets can each hold multiple Petabytes of content. Scale-up, scale-out primary storage is now the norm for many, and QStar sees the same requirements in Active Archive, where massive single name spaces are available through SMB, NFS and/or S3 interfaces, leveraging libraries containing tens or hundreds of tape or optical drives, with optional replication to on-prem or remote Cloud infrastructures. Node-based architecture will allow performance and capacity to be scaled independently and, at the same time, increase Disaster Recovery options. QStar predicts a significant change in Active Archive requirements, surging from hundreds of Terabytes on average just five years ago to tens or even hundreds of Petabytes in the very near future. – Dave Thomson, SVP of Sales and Marketing, QStar

Data Management and Object Storage Software Increasingly Important for Active Archives
The continued exponential growth of data volumes, especially of unstructured data, will remain the dominant challenge for data storage. Intelligent data and storage management software will meet this challenge by integrating different storage technologies (flash, disk, tape) and architectures (file systems, object storage) so that data is stored in the most appropriate storage class according to its use and purpose. As data becomes inactive at an ever-increasing rate, an economical, technology-independent Active Archive Tier will be of central importance. In this context, software-based object storage that supports HSM/ILM capabilities and mass storage technologies such as tape will become increasingly important. – Thomas Thalmann, CEO, PoINT Software & Systems GmbH

Active Archive Growth Will Be IT Lever to Invest in DevOps
2020 changed the focus of IT shops across the spectrum, but digital data never stopped growing. Only 33% of IT budgets shrank as a result of the 2020 challenges; most budgets aligned spending to mobile access and security enhancements. While some infrastructure changes are on hold, storage requirements continue to grow. Active Archives are the most efficient method of reducing cost of infrastructure without recognizing penalties. The growth created in cloud storage during the pandemic will lead to an increased spend on active archives by Hyperscale and Hyperscale-lite storage providers. At the same time, traditional data centers will continue to expand active archives as the only method to meet budgets while improving operational efficiency on production systems. – Shawn O. Brume Sc.D., Global Hypergrowth Storage OM, IBM

Ransomware Attacks Will Continue to Rise, Increasing Data Security Risks
According to the FBI, the severity of ransomware attacks increased 47% in 2020 and there has been a 100% spike in the number of attacks since 2019. With so many organizations adopting remote working as a response to the pandemic, cyber criminals are exploiting security vulnerabilities and this trend will likely continue throughout 2021. Organizations will benefit from taking another look at their data protection strategies to ensure all endpoint devices are secure, backed up and recoverable. Additionally, active and inactive data throughout an organization’s data ecosystem could be vulnerable to ransomware attacks, making it necessary to review how and where data is protected with an increased emphasis on storing an air-gapped, offline gold copy of data. Active archives will play an increasingly important role with the ability to easily export copies of data to a secure offsite location for safekeeping. – Tara Holt, Senior Product Marketing Manager, Iron Mountain

Increasing Importance of Storage Lifecycle Management and Hybrid Cloud for Cost Control and Ransomware Protection
Data is the lifeblood of organizations today, and knowing how to manage and store that data is proving to be one of the most challenging tasks for individuals managing data. The cost of secondary storage hardware has been declining year over year and is projected to continue to drop in price for the foreseeable future. Unfortunately, the cost of traditional data management software required to move data from the Primary Tier to lower-cost storage has been increasing. It’s not uncommon to see data management software cost more than the storage medium on a per-capacity basis, making current solutions cost multipliers rather than cost reducers. But how do you leverage the affordability of secondary storage? 2021 will see more focus around storage lifecycle management and active archive solutions that will reduce the total cost of storage while leaving all data accessible, all the time. No longer will organizations need to fully redesign their storage infrastructure to implement a storage lifecycle management software solution.

The evolution of hybrid cloud in an active archive will be even more critical in 2021 as COVID’s impact on remote work pushed many organizations, right or wrong, to the cloud. Yet, the importance of on-premises data protection cannot be overstated, as data repositories swell and businesses rely even more heavily on the collection, usage, distribution and monetization of their data. With cybercrime fast becoming a major threat to business operations, organizations are more vulnerable than ever and will do well to revisit and reinforce their digital preservation strategies. By leveraging intelligent data management software, organizations can achieve cost-effective data protection best practices by moving data off of expensive primary storage to a lower-cost tier of storage, which includes storing multiple copies of data, in multiple geographies, on multiple media, including cloud, disk and tape, with at least one copy stored offsite and out of the network stream. – Betsy Doughty, Vice President of Corporate Marketing, Spectra Logic
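
The copy-placement practice described above (multiple copies, in multiple geographies, on multiple media, with at least one copy offsite and out of the network stream) can be written down as a simple check. The following is a minimal sketch in Python, not any vendor's implementation; the `Copy` record, its field names and the rule wording are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One stored copy of a data set (hypothetical record for illustration)."""
    medium: str    # e.g. "cloud", "disk", "tape"
    region: str    # geography where the copy lives
    offsite: bool  # stored away from the primary site
    online: bool   # reachable over the network


def check_copy_plan(copies):
    """Return the list of unmet rules; an empty list means the plan passes."""
    problems = []
    if len(copies) < 2:
        problems.append("fewer than two copies")
    if len({c.region for c in copies}) < 2:
        problems.append("all copies in one geography")
    if len({c.medium for c in copies}) < 2:
        problems.append("all copies on one medium")
    # At least one copy must be both offsite and unreachable over the network.
    if not any(c.offsite and not c.online for c in copies):
        problems.append("no offsite copy outside the network stream")
    return problems


plan = [
    Copy(medium="disk", region="us-east", offsite=False, online=True),
    Copy(medium="cloud", region="eu-west", offsite=True, online=True),
    Copy(medium="tape", region="us-west", offsite=True, online=False),
]
print(check_copy_plan(plan))
```

Dropping the tape copy from `plan` would leave every copy reachable over the network, and the check would report that the offline, offsite requirement is no longer met.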

100-Year Archives Will Require Intelligent Active Archive Software
Businesses spanning a range of industries increasingly hold data that may be retained in some digital format for 100 years or more, making long-term retention necessary. For these “100-year archives,” the combined capabilities of intelligent data management software and high-availability, scale-out hardware will be required to cope with possibly exabytes of archive data. The most cost-effective solutions today for archive data use high-capacity tape libraries in local, cloud and remote locations. Hyperscale Data Centers will require advanced, easily scalable air-gapped tape architectures that can support erasure coding, geo-spreading with exascale capacities, extreme reliability, and ironclad cybersecurity protection. 100-year archives will require intelligent active archive software incorporating smart data movers, data classification and metadata capabilities, highly scalable tape libraries, Redundant Arrays of Independent Libraries (RAIL) architectures, Elasticsearch, erasure coding and geo-spreading of data across zones in different locations for higher fault-tolerance, redundancy and availability. – Eric Bassier, Senior Director, Product and Technical Marketing, Quantum

Active Archives Will Support Standards-Based Information Exchange in Healthcare
In 2021, availability of comprehensive telehealth services and access to medical records will be critical.  Technology will shape the virtual patient-doctor experience.  It will also enable patients to measure and monitor their own health through wearable devices, apps and patient portals that connect them to their medical records and care teams. U.S. governmental policy requiring the seamless flow of health information will call upon active archives storing historical patient records to readily support fluid, standards-based information exchange for the benefit of stakeholders across the healthcare ecosystem. – Shannon Larkin, Vice President Marketing & Business Development, Harmony Healthcare IT

2021 Will Be Focused on Cost Containment, Analytics and Security
As we head into 2021 and the post-COVID-19 digital economy, IT leaders will be focused on cost, data analytics and security. Cost containment of rapidly growing unstructured data will be critical. Active archiving will be mandatory as companies recognize the increasing value of data and the need to cost-effectively store ever-increasing volumes for longer periods of time. Ready access to that data via an active archive will support AI-based analytics to derive competitive advantage. Data security will be an increasing priority in an era of escalating cybercrime.

Intelligent active archive solutions will be needed to classify data and move it automatically, according to user-defined policy, from expensive tiers of storage to economy tiers of storage while maintaining ease of access and providing the benefit of air gap security. – Rich Gadomski, Head of Tape Evangelism, FUJIFILM Recording Media USA, Inc.
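
To make the idea of moving data "according to user defined policy" concrete, here is a minimal Python sketch of an age-based placement rule. It is an illustration under stated assumptions, not a description of any product: the `FileRecord` type, the tier names and the 90-day and 365-day thresholds are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    """Minimal file metadata (hypothetical record for illustration)."""
    path: str
    last_access: datetime


# Hypothetical user-defined policy: (minimum idle time, target tier),
# ordered from the coldest tier to the warmest.
POLICY = [
    (timedelta(days=365), "tape"),
    (timedelta(days=90), "object"),
    (timedelta(0), "primary"),
]


def choose_tier(record, now, policy=POLICY):
    """Return the first tier whose idle-time threshold the file meets."""
    idle = now - record.last_access
    for min_idle, tier in policy:
        if idle >= min_idle:
            return tier
    # Access time in the future (clock skew): keep the file on the warmest tier.
    return policy[-1][1]


now = datetime(2021, 1, 14)
for rec in [
    FileRecord("/proj/render_2019.mov", datetime(2019, 1, 1)),
    FileRecord("/proj/run_0912.dat", datetime(2020, 9, 1)),
    FileRecord("/proj/draft.doc", datetime(2021, 1, 10)),
]:
    print(rec.path, "->", choose_tier(rec, now))
```

A real policy engine would typically also weigh file type, owner or project metadata; the ordering of rules from coldest to warmest is a design choice that keeps the lookup to a single pass.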

Data-Centric File Management Will Help Contain Costs in 2021
IT organizations are increasingly moving to a data-centric model to manage files across multi-vendor on-premises and cloud storage platforms, a trend that will grow significantly in 2021.

The datacenter budget tightening of 2020, caused in part by the impact of COVID-19, has increased the need for IT organizations to use intelligent data management technologies that automate multi-tier storage architectures, defer expensive storage upgrades and get more life out of existing primary storage. This trend is driven by the need to reduce the load on expensive primary storage, and it is now possible thanks to the emergence of technologies that automate based on data intelligence and business processes and can seamlessly manage data across any storage type.

Data-centric approaches leverage metadata from files and other workflow-driven triggers to automate data movement, migration, data protection, active archiving and other use cases across any on-prem or cloud storage type. Unlike traditional storage-centric approaches, where data can often get stranded in a single-vendor silo, data-centric automation for policy-based data placement gives IT planners more flexibility to contain costs by offloading primary storage without interrupting user access or adding complexity. This trend also gives IT organizations greater flexibility to defer primary storage upgrades or replacement by transparently shifting data to lower-cost on-premises and cloud storage choices. – Floyd Christofferson, CEO, StrongBox Data Solutions

Healthcare Industry to Rely More on Active Archive Solutions
As regulatory oversight increases at the federal level, the archive space, and specifically active archiving, will get more notice across the healthcare industry for the solutions it provides to ensure compliance. With their higher visibility, archive products may become areas of increased interest for hackers. Ransomware plays against the legacy protected health information (PHI) stored within these systems could become an issue, highlighting the importance of air-gap protection as a component of an active archive solution.

With the growing focus on interoperability, the conversation will expand to include archived data among the requirements – especially as it relates to provider-to-provider data sharing and a patient’s access to and ability to share their own data.

Lastly, standardizing the use of Enterprise Master Patient Identifiers (eMPI) and/or unique identifiers as a way to meet regulatory requirements presents opportunity for vendors in that space. – Kel Pults, DHA, MSN, RN, Chief Clinical Officer, MediQuant

Active Archives Will Help Solve Difficulties in Long Term Data Management
The volume of unstructured data, or data disconnected from usable applications, will continue to grow, fueled by the steady increase in highly regulated industries, and will continue to make decision-making difficult on unidentified data of all categories. A potential increase in litigation resulting from the economic woes that the COVID-19 pandemic has caused companies may add to the preservation demands on archived data, compounding the difficulties in long-term data management. One driver for change, though, is the growing number and implementation of data privacy regulations such as the GDPR and CCPA, with others looming, which will serve as a catalyst for big companies to tackle the management of their unstructured or disconnected archive data. This trend will give rise to greater demand for intelligent active archive solutions. – Brendan Sullivan, CEO, SullivanStrickler

Increased Data Management Complexity and Threats Will Call for Best Practices Assessments
2021 will see accelerated data management trends, including the increasing use of active archiving in hybrid scenarios: a mix of on-premises and cloud infrastructure accelerated by the impacts of the COVID crisis. Many organizations are struggling to cope with new usages, trying to leverage their current archiving systems to embrace new workflows such as remote collaboration or content distribution.

In 2021, not only will the file-based unstructured data trend continue to fuel usages such as IoT, genomics, simulation and so on, it will become even more predominant. This will require active archiving and data management solutions not only to prove they can scale in volume but also to be flexible enough to embrace multiple complex scenarios in heterogeneous environments across multiple sites.

However, too many organizations, when faced with rapid change, are using rigid solutions or making choices based essentially on cost. Because of cost-consciousness, security becomes a secondary matter. The number of cyber-attacks targeting backup and archive data will only increase in 2021. Respecting best practices, including multiple copies on different air-gapped technologies and destinations, calls for a proper assessment of an organization’s vulnerabilities. Increasing cyber threats are having a critical impact on business continuity and long-term asset preservation, and this impact extends across organizations’ workflows, from preparation and planning to the execution of archive protection tasks. Industry experts are convinced this will be proven again in 2021. – Ferhat Kaddour, Vice President Sales & Alliances, Atempo

Time, Money and Risk Will Continue to Play Critical Role in Data Movement
In the wake of COVID-19, enabling remote work has required IT teams to rapidly lean into cloud technologies to keep their businesses operating smoothly. Migrations will continue to be complicated and since time, money and risk will continue to play an ever-important role in data movement, customers will be looking to providers to offer them intelligent services and active archive applications that leverage the cloud. With this change, the fear of the cloud will evaporate. In fact, according to Forbes, by 2025, 49% of the world’s stored data will reside in public cloud environments.

Along with this shift to the cloud, artificial intelligence (AI) will automatically unify data, and data-driven automation for AI operations will become central to IT strategies. Organizations will look to move data based on operational or situational intelligence. – Brian Morsch, Vice President, Product Management, Integrated Media Technologies, Inc.

Active Archive to Become Unified and Data-Centric
When it comes to an active archive, some people talk about on-prem, some people talk about cloud or even multi-cloud, and some people talk about hybrid, but what everyone needs is unified active archive platforms. The user doesn’t care where the data is so long as it is quickly accessible when they want it and where they want it. Unified deployment architectures are the antithesis of a piecemeal approach and will see far greater prominence once the pandemic “work from home” edict has died down and “work from anywhere (even from the office!)” takes off.

Analytics are also likely to become increasingly important for managing active archives. As it becomes increasingly difficult to manage disparate work forces and storage systems alike, managers will want to take back control in order to make informed decisions. Analytics can pull together clear overviews. In the case of an active archive, what is being stored where and for how long? How long is data taking to transfer and how much is that holding up production? Is data being kept on the wrong storage tier? Analytics have never been more important than during this time and will only become more of a focus during 2021. – Laura Light, Marketing Manager, Object Matrix

New Storage Technology Enhancements Will Emerge to Support Cost-Effective Active Archives
In 2021, we will see greater adoption of next-generation disk technologies and platforms that enable both better TCO and accessibility for active archive solutions. We estimate that data is growing at ~30% CAGR and is expected to reach 150ZB stored by 2025. This insatiable growth, driven by humans and machines, is creating an explosion in long-term data retention and archive challenges like never before. Where do we put all of this data? How do we cost-effectively store it and maintain long-term access with the lowest TCO? In an age where capturing, storing and extracting value from this influx of data is critical to success, new solutions that support active archive systems must emerge. With advancements in HDD technology, including new data placement technologies, higher areal densities, mechanical innovations, intelligent data storage, and new materials innovations, HDD-based solutions will emerge, enabling new capacity points and unprecedented economics and TCO at scale for active archive tiers. In 2021, we will see a new generation of storage devices based on host-managed shingled magnetic recording (SMR) HDD technology emerge, paving the way for platforms specifically designed for colder storage tiers and making long-term data storage more economical and accessible for data at scale, either on premises or in the cloud, for decades to come. – Scott Hamilton, Senior Director, Data Center Platforms, Western Digital
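
As a rough illustration of what a ~30% CAGR means in practice, the compounding can be worked backward from the 150ZB figure. This is a minimal Python sketch; taking 2021 as the starting year (four years of growth to 2025) is an assumption made here for illustration, the function names are hypothetical, and the results are arithmetic consequences of the two quoted figures rather than additional data.

```python
import math


def compound(value, rate, years):
    """Grow `value` at a constant annual `rate` for `years` years."""
    return value * (1 + rate) ** years


def implied_base(target, rate, years):
    """Starting value that reaches `target` after `years` of growth at `rate`."""
    return target / (1 + rate) ** years


def doubling_time(rate):
    """Years needed for a quantity to double at a constant annual `rate`."""
    return math.log(2) / math.log(1 + rate)


CAGR = 0.30          # "~30% CAGR" from the text
TARGET_ZB = 150      # "150ZB stored by 2025" from the text
YEARS = 2025 - 2021  # assumption: growth counted from 2021

base = implied_base(TARGET_ZB, CAGR, YEARS)
print(f"implied 2021 volume: {base:.1f} ZB")
print(f"doubling time: {doubling_time(CAGR):.2f} years")
```

Under these assumptions the implied starting point is about 52.5ZB, and stored data doubles roughly every 2.6 years, which is the pace that archive tiers would have to absorb.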

Increasing Adoption of Active Archives Based on Data Tape Libraries
It is hard to beat the cost per TB of data tape libraries for organizations that have large active archives, whether those organizations are public cloud providers or users with large volumes of data. This is due to the low cost per TB of the cartridges themselves and low system power requirements. It means that tape storage is often an important element, alongside disk and management software, in high-capacity active archive systems.

Many public cloud providers have introduced object storage with very low-cost archive tiers that take perhaps an hour or more to restore a file. Often these providers use data tape as part of the storage mix.  We anticipate rapid growth in the use of low-cost archive tier cloud storage. However, many potential users of cloud archive tiers are put off by the very high cost of egress fees, not just for routine restores, but the potentially massive cost if they ever want to move their content to another provider. On-premises tape-based active archives are typically not subject to egress fees and we will continue to see growing demand for this class of storage. On-premises archive solutions that offer an S3 interface, allowing the archive to be securely shared by remote users and other facilities, will be especially attractive. – Philip Storey, CEO, XenData

