Open Source Active Archive Software – Enabling Large Scale Archiving Projects

One major challenge for active archiving solutions is providing a platform to manage the long-term data retention requirements of archiving projects. Unlike solutions built to address backup requirements, archiving requires a rational approach to managing the life cycle of data across evolving hardware and software platforms in a cost-effective and secure manner. Because the timeframe for maintaining data extends over multiple decades (and beyond), an approach is needed that manages the forward migration of data without affecting availability. Previous blogs have provided extensive examples of techniques for managing this, including cost-effective media copies using tape for long-term retention, automated media-to-media copies, and use of LTO’s backward read compatibility to move data to new generations with very low impact on system cost.

The Right Storage for TV Media Archiving

One of the most interesting meetings I recently attended was hosted by the EBU (European Broadcasting Union). The topics revolved around how studios should store and archive their TV creations for decades, if not centuries. The meeting was fascinating because three things set these media folks apart from most archiving customers.

Active Archiving Connects the Storage Community to Drive Innovation


Big Data. What’s implied by that term can depend on whom you ask. From the proliferation of rich detail captured by businesses to the volume of metadata that supports the information gathered, Big Data always means Bigger Storage. IT managers may find this data explosion a challenge for data storage, with capacity needs expanding far faster than budgetary allotments. Especially with longer retention requirements and more stringent regulatory compliance standards, the volume of fixed content that must be securely maintained is skyrocketing. The widespread digitization of files in sectors such as media and entertainment, healthcare and legal also drives the need for scalable, cost-effective storage. Traditionally, tape has been employed for bulk storage, but an offline tape library doesn’t enable rapid data access for efficient workflows. More than ever, data tiering with an active archive is becoming standard for enterprise organizations with mountains of data to maintain.
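The tiering idea described above boils down to a simple policy: keep recently accessed data on fast disk and move cold data to tape while retaining online access. A minimal sketch of such a policy is below; the tier names and age thresholds are hypothetical examples, not values from any particular product.

```python
from datetime import datetime, timedelta

# Hypothetical tiering thresholds -- real policies vary by organization.
TIER_RULES = [
    (timedelta(days=30), "performance-disk"),   # accessed within 30 days
    (timedelta(days=365), "capacity-disk"),     # accessed within a year
]
ARCHIVE_TIER = "tape-library"                   # everything colder goes to tape

def choose_tier(last_access: datetime, now: datetime) -> str:
    """Pick a storage tier based on how long ago the file was accessed."""
    age = now - last_access
    for max_age, tier in TIER_RULES:
        if age <= max_age:
            return tier
    return ARCHIVE_TIER

now = datetime(2013, 6, 1)
print(choose_tier(datetime(2013, 5, 20), now))  # recently touched file
print(choose_tier(datetime(2011, 1, 15), now))  # cold data, candidate for tape
```

In an active archive, the point is that the file remains visible in the same namespace regardless of which tier the policy assigns it to.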

First Active Archive Certified Architect Training Complete

Last week the Alliance hosted its first Active Archive Certified Architect (AACA) training in Chicago, and it was a great success! Attendees included a great group of value-added resellers (DataSpan, Sanity Solutions, GPL Technologies, P1 Technologies, America Tecnologia, and DSN Group), plus an end user. The two-day training course focused on active archive solution differentiators, architectural specifics, design considerations, and product training. Attendees report they are already leveraging the momentum gained from the training. Here are some attendee comments:

Preserve Data for Decades with an Active Archive

QStar Technologies, along with many other Active Archive Alliance members, was at the NAB (National Association of Broadcasters) Show in Las Vegas a few weeks ago.

I had the pleasure of speaking with a number of users in the media and entertainment world, some small and some very large. Almost universally they expressed the need to preserve their digital content for years and decades, yet all are looking for a “migration-free” archive, which does not exist. All technology has a useful life, after which it becomes prohibitively expensive to continue operating. The objective is to migrate to a new technology before this occurs, so designing for migration is imperative when creating an archive.
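Designing for migration can be as simple as a rule that triggers a forward copy before media falls off the readability cliff. The sketch below assumes the historical LTO compatibility behavior (drives could read media two generations back and write one generation back); the function name and generation numbers are illustrative.

```python
# Historically, an LTO drive could read media up to two generations
# older than itself, so media two generations behind the current drive
# is at the edge of readability and should be copied forward.
READ_BACK_GENERATIONS = 2

def needs_migration(media_gen: int, current_drive_gen: int) -> bool:
    """True when tape media should be migrated to current-generation
    media before it can no longer be read by available drives."""
    return current_drive_gen - media_gen >= READ_BACK_GENERATIONS

print(needs_migration(media_gen=4, current_drive_gen=5))  # LTO-4 in an LTO-5 shop: still readable
print(needs_migration(media_gen=3, current_drive_gen=5))  # LTO-3: migrate now
```

Running a check like this on every generation refresh is what turns migration from an emergency project into a routine, low-impact background task.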

Active Archive vs. Archive

Active archive is not synonymous with archive. It is fair to say that all active archives are archives; however, it is not true that all archives are active archives.

Active Archive Alliance Celebrates Two Years of Raising Market Awareness

It's hard to believe that it has been two years since the Active Archive Alliance (Alliance) was formed. A lot has happened since that time, and the storage industry has begun to quickly recognize the benefits of combining the best of software, tape and disk for a more efficient and functional storage strategy.

Active Archive In the Cloud

It seems that the subject of archiving is getting a lot of press these days, as enterprises of all shapes and sizes face a deluge of data and content generated by new technologies and applications. This is not a bad thing per se, as there may be value to be had from this data and content in the form of business analytics or monetization of the content at some point in the future. The problem lies in managing the cost of this data and content effectively over time. With government regulations and legal compliance dictating long-term retention periods, and with the future value potential of certain data, much of it has to be archived indefinitely.

TCO Breakdown for Long-Term Archive

New technology, along with increasing data retention requirements, fuels the demand for new archive solutions that offer cost-effective capacity and uncompromised access to data. Finding the most cost-effective, reliable and scalable way to store long-term data is essential for today’s enterprise, and that is exactly what an active archive solution delivers. Mission-critical data may live on high-performance disk, with everything else residing on SATA disk and tape; maintaining accessibility to all of it makes an active archive the clear answer for long-term, scalable storage. As the infographic below shows, the TCO benefits make an active archive the obvious choice for cost-effective storage.
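The TCO argument is ultimately arithmetic: multiply each tier’s cost per terabyte by the capacity you place on it and compare the totals. The sketch below illustrates that calculation; all dollar figures and tier mixes are hypothetical placeholders, not the numbers from the infographic.

```python
# Illustrative cost per TB over the retention period (acquisition plus
# power/cooling). These figures are hypothetical, for demonstration only.
COST_PER_TB = {
    "high-performance-disk": 2000,
    "capacity-disk": 600,
    "tape": 100,
}

def blended_cost(mix_tb: dict) -> int:
    """Total cost for a capacity mix expressed as TB per tier."""
    return sum(COST_PER_TB[tier] * tb for tier, tb in mix_tb.items())

# All-disk archive vs. a tiered active archive holding the same 1,000 TB.
all_disk = blended_cost({"high-performance-disk": 100, "capacity-disk": 900})
tiered = blended_cost({"high-performance-disk": 100,
                       "capacity-disk": 200,
                       "tape": 700})
print(all_disk)  # 740000
print(tiered)    # 390000
```

Even with these made-up numbers, shifting the cold majority of the data to tape roughly halves the total, which is the intuition behind the infographic.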

Big Data Should Not Equal Big Insights

I was at a storage event a few weeks ago where every panel, keynote speaker, and workshop presenter mentioned big data throughout their presentations. The part I found most amusing was how they all seemed to have a different definition of what big data was. For some, it was big unstructured datasets of every kind, growing at an increasingly alarming rate. For others, it was data that was outgrowing their databases, and for others still, it was complex analytics of structured and unstructured data. It reminded me of how, early on (and in some cases still today), “the cloud” had a nebulous definition that required you to ask, “When you say cloud, what exactly is your definition?”

I decided then and there that whenever I talk about big data, I would start by giving my definition so people understand what part of big data I’m addressing.  So here we go.