What Is So Special About Metadata?

A cognitive strategy to manage the data explosion

By Floyd Christofferson

When researching ways to contain rising storage costs and reduce the complexity of heterogeneous storage environments, it is natural to look at storage solutions for ways to solve these problems. Data lives on storage, so it seems reasonable to assume that the answers to managing the explosion of data would be found in various storage options.

The storage industry is naturally focused on storage-centric answers to the problem of data management. To a hammer, everything looks like a nail. But even the best storage products cannot solve the problem alone. Storage-centric solutions have little intelligence about the data they store, and they were not designed to work across multi-vendor storage types, different file systems, or protocols.

The good news is that all digital files contain multiple types of metadata that can drive a data-centric solution to the problem, and in the process bridge incompatible storage types and use cases. Metadata is literally data about the data. Think of it as a roadmap that gives you a bird's-eye view of everything about your data and that can drive data management policies that transcend the storage layer.

As an example, data-aware management solutions can use the intelligence derived from multiple types of metadata to proactively plan for and implement storage optimization, data protection, workflow automation, business continuity, and other tasks.

There are many different types of metadata, starting with file system metadata such as file name, create time, access time, and modify time. Most file types also carry additional rich metadata in their headers, such as geospatial coordinates or other information that can drive workflow policies. Then there is external metadata that organizations may have accumulated in other databases or records, such as project-related tags or other information about their data that captures business value, retention policies, and more.
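To make the first and third of these metadata types concrete, here is a minimal Python sketch, not a description of any particular product. It reads file system metadata with the standard `os.stat` call and merges it with externally held tags. The `external_tags` lookup and the `combined_record` helper are hypothetical names used only for illustration. Creation time is left out because how it is exposed varies by platform.

```python
import os
from datetime import datetime, timezone


def file_system_metadata(path):
    """Return basic file system metadata for one file."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": st.st_size,
        "accessed": datetime.fromtimestamp(st.st_atime, tz=timezone.utc),
        "modified": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc),
    }


def combined_record(path, external_tags):
    """Merge file system metadata with tags held outside the file.

    `external_tags` is a hypothetical stand-in for a project database:
    a dict that maps a file name to a dict of business tags.
    """
    record = file_system_metadata(path)
    record["external"] = dict(external_tags.get(record["name"], {}))
    return record
```

A call such as `combined_record("/data/scan.tif", {"scan.tif": {"project": "survey"}})` would return one record that carries both kinds of metadata, which is the raw material a data-aware system works from.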

As all of these metadata types are brought together into a single aggregated view, a metadata-centric solution becomes data-aware: intelligent enough to automate data and storage management without needing to alter the underlying storage infrastructure.

This data awareness is also crucial for implementing an active archive strategy in a heterogeneous environment. Because metadata-driven policies can move data to any storage type or location, the active archive can be accessible to any authorized user.

Rather than trying to physically normalize all the data at the infrastructure level to overcome silos, a data-centric strategy normalizes it in metadata, enabling global management across all storage types, file systems, and locations. And as policies or use cases evolve over time, metadata-driven strategies ensure that data lifecycle requirements can be implemented regardless of which storage types are in place today.
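As a rough illustration of what a metadata-driven policy might look like, the Python sketch below picks a target tier from metadata alone, without inspecting the storage the file currently lives on. The tier names, the `legal_hold` tag, and the 90-day threshold are all assumptions made for the example, not settings from any real product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class FileRecord:
    name: str
    last_accessed: datetime
    tags: dict


def choose_tier(record, now, archive_after_days=90):
    """Pick a target tier from metadata alone (tier names are hypothetical)."""
    # Business metadata takes priority over access patterns.
    if record.tags.get("legal_hold"):
        return "protected"
    # Files untouched for longer than the threshold go to the archive tier.
    if now - record.last_accessed > timedelta(days=archive_after_days):
        return "active-archive"
    return "primary"


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
stale = FileRecord("old_scan.tif", now - timedelta(days=200), {})
print(choose_tier(stale, now))
```

Because the rule reads only the record, changing the policy later means changing this function, not the storage underneath it.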

In this way, metadata can drive data placement policies that virtualize any storage type, including archive. The active archive can then be online and accessible to all users, whether it is on disk, tape, cloud, or a remote site. Such metadata-enabled systems also let users search on any metadata fragment to find their data, whether it sits in the active archive, at a disaster recovery site, or on any other storage tier.
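Searching on a metadata fragment can be pictured with a small Python sketch. It assumes the aggregated metadata is already available as a list of dictionaries. The `catalog` contents and the `search` helper are invented for illustration, and a real system would use an indexed store rather than a linear scan.

```python
def search(catalog, fragment):
    """Return records in which any metadata value contains the fragment."""
    needle = fragment.lower()
    return [
        record
        for record in catalog
        if any(needle in str(value).lower() for value in record.values())
    ]


# Hypothetical catalog: the same query spans every tier and location.
catalog = [
    {"name": "survey_042.tif", "tier": "tape", "project": "coastal-mapping"},
    {"name": "budget.xlsx", "tier": "cloud", "project": "finance"},
    {"name": "survey_043.tif", "tier": "dr-site", "project": "coastal-mapping"},
]

for hit in search(catalog, "coastal"):
    print(hit["name"], "on", hit["tier"])
```

The point of the sketch is that the query never names a storage system: the two matching files are found whether they sit on tape or at a disaster recovery site.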

Data is the lifeblood of every organization, but protection and accessibility also need to be cost-effective and flexible. Data management solutions that harness the power of metadata open up a whole new horizon of possibilities, letting customers manage any data, on any storage, anywhere.