To Active Archive or Not to Active Archive? – That is the Question
The actual quote from William Shakespeare’s Hamlet is “To be, or not to be, that is the question: Whether ’tis nobler in the mind to suffer the slings and arrows of outrageous fortune, or to take arms against a sea of troubles, and by opposing, end them.”
Is it “outrageous fortune” if data is lost (or at least not found) because there was a better strategy that you could have implemented but did not, or is it merely due to bad planning?
Today, many organizations still do not implement an active archive strategy, believing that primary storage protected by snapshots or Continuous Data Protection (CDP) is the optimal way to store and secure ALL their data. Some files continue to change frequently for a short period of time, while many more never change again. Snapshots and CDP are probably the best options for the critical, still-changing files. However, in most organizations 80% or more of the data is static. Rather than design a system for the majority of their data, these organizations use overly expensive and cumbersome solutions designed to protect the minority. And in every recovery scenario (single file, partial, or full restore), an administrator is needed to decide which recovery point to restore from.
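To see how much of your own data is static, a rough measurement is easy to take. The following is a minimal sketch, not a sizing tool: it assumes that "static" can be approximated by "not modified in the last 90 days", it counts files rather than bytes, and the names `static_share` and `file_mtimes` are made up for this illustration.

```python
import os

def static_share(mtimes, now, min_age_days=90):
    """Return the fraction of files last modified at least min_age_days ago.

    mtimes is an iterable of POSIX timestamps; now is the reference time.
    The 90-day default is an assumption, not an industry standard.
    """
    cutoff = now - min_age_days * 86400  # seconds per day
    mtimes = list(mtimes)
    if not mtimes:
        return 0.0
    static = sum(1 for m in mtimes if m <= cutoff)
    return static / len(mtimes)

def file_mtimes(root):
    """Yield the modification time of every readable file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                yield os.stat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
```

Calling `static_share(file_mtimes(root), time.time())` on a file share (with `import time` and `root` set to a directory of your choosing) gives a first impression of how much of it is a candidate for archiving. Weighting by file size instead of file count would be a natural refinement.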
The second option is to use a backup product as an archive. As all files are already being stored periodically to backup technology (tape, cloud, or disk), many vendors suggest using the same idea for archiving data. The reasoning goes: the backup product already holds the same file a dozen times (or more), so remove the file from the primary file system and perform a single-file restore if it is needed in the future. In most cases, administrators are again responsible for recovering files, which makes the process onerous and extremely slow. Over time, this method also leaves significant amounts of data on abandoned technology, with no simple mechanism to migrate it to new (or different) technology.
A third method is to use non-active archiving. By this I mean an archive solution that is designed for a single archive technology (only tape or only cloud). Archives are often multi-decade stores and periodically need to be migrated to new technology. The same happens for primary storage (but more frequently) as new platforms replace older ones. Data migration services should be built into ALL archive products, and vendors should show a pedigree of supporting the widest choice of archive technologies available today. You never know what the “next best thing” for archiving will be, so you need to know that your archive provider is likely to support it.
An active archive solution provides the following “active” elements:
- Users can “actively” find and access files themselves, without administrator intervention. Files can be retrieved in milliseconds (when using flash / disk), seconds (when using cloud), minutes (when using tape) or hours / days (when using deep archive cloud or offline tape). The performance tier is chosen according to organizational or data-set requirements and budget.
- Archives stay “active” by automated migration from old to new; no data is left behind. Periodic migration should be planned, and the archive software should be able to perform it in the background without affecting performance targets.
- Archives must be “actively” protected. As archived content is static and unchanging, the easiest methodology is to use WORM (write once read many) or Retention Management to store files in “Read Only” mode. This method prevents accidental or malicious deletions or overwrites. Encryption can, and should, also be used to protect data from unauthorized access.
- The number of copies of archived data and the technologies they are on must be “actively” managed. Active archives protect data by creating copies of the content. The number of copies and the technologies they are placed on can be designed to meet multiple performance and security requirements. If files are still actively being used, even though they are unchanging (for example, raw video footage during post-production), one copy should be on flash / fast disk, but other copies, for data security, can be on slower, lower-cost technology. After post-production is complete, the fast copy can be deleted. The same applies to many other workflows.
The above is summarized as “3-2-1 Archiving Best Practice”, where it is recommended that three copies are maintained (although a minimum of two is required). Copies should be stored on two different technologies; this prevents lock-in and potentially expensive migrations (although many organizations choose to use a single archive technology to reduce cost). One of the copies should be offsite, on offline media, or in the cloud; this is the disaster recovery copy.
Ideally, all copies should be archived at the same time, directly from the original file, to prevent “copy of a copy” issues, which can degrade data over time.
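The 3-2-1 rule is simple enough to check mechanically. Here is a minimal sketch of such a check; the `Copy` record and the `check_3_2_1` function are hypothetical names invented for this example and do not correspond to any particular archive product's API. The thresholds are parameters because, as noted above, some organizations settle for two copies or a single technology.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Copy:
    technology: str        # e.g. "flash", "disk", "tape", "cloud"
    offsite: bool = False  # True if offsite, on offline media, or in the cloud

def check_3_2_1(copies, min_copies=3, min_technologies=2):
    """Return a list of rule violations; an empty list means compliant."""
    problems = []
    if len(copies) < min_copies:
        problems.append(f"{len(copies)} copies, need at least {min_copies}")
    technologies = {c.technology for c in copies}
    if len(technologies) < min_technologies:
        problems.append(
            f"{len(technologies)} technologies, need at least {min_technologies}"
        )
    if not any(c.offsite for c in copies):
        problems.append("no offsite, offline, or cloud copy for disaster recovery")
    return problems

# Example: raw footage during post-production, with one fast working copy.
footage = [Copy("flash"), Copy("tape"), Copy("cloud", offsite=True)]
violations = check_3_2_1(footage)  # empty list: the policy is satisfied
```

Once post-production is finished and the flash copy is deleted, the same check with `min_copies=2` still passes, because the tape and cloud copies remain on two technologies with one of them offsite.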
Is it time for you to “take arms against a sea of troubles” and protect your organization from the “slings and arrows of outrageous fortune”?
For further information, download this Active Archive – Innovative Data Storage Solutions White Paper.