How AI Can Add Value to Active Archives
As data volumes expand, active archives will grow far beyond their current size. Their capabilities must advance in step if they are to keep serving market needs, particularly when it comes to the capabilities being unleashed by AI.
It is one thing to store a lot of data, another to be able to access and retrieve it rapidly, and something else entirely to be able to search a vast repository accurately and speedily. W. Curtis Preston, Technical Evangelist at S2|DATA, made this point at the recent Active Archive Alliance video conference, “AI Needs Active Archive.”
“We need to greatly improve the searchability of active archives,” he said.
Preston laid out some of the challenges active archives face as data volumes mushroom and AI use cases evolve:
*Maintaining full accessibility to archived, legally discoverable content
*Difficulties in searching through many years of documents, emails, and texts
*Balancing long-term storage with immediate retrieval needs
*Extracting value from historical communications
“Heightened search can transform static archives into searchable knowledge bases,” said Preston.
In the legal arena, for example, this would allow legal teams to easily access historical data stored in active archives. Further, it would help organizations to maintain compliance while enhancing accessibility. For some use cases, archived content could be organized by subject matter to speed up the retrieval process.
Achieving full-text search across all archived documents would require advanced filtering to pinpoint specific information, intelligent tagging to organize historical content, and intelligent use of metadata in the search process. All of this must be done while maintaining the security and immutability of archived materials.
“Such capabilities would make it possible to search all documents containing a specific phrase or all emails from a person of interest,” said Preston. “S2|DATA provides these capabilities today with Legacy Email Archive Solution (LEARai).”
LEARai applies AI to finding, retrieving, and retiring legacy emails easily, securely, and on demand. It is purpose-built to support legal, compliance, IT, security, records management, and risk teams by providing the transparency they need to efficiently review and analyze email, legacy data, and SMS communications.
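To make the two queries Preston mentions concrete (documents containing a specific phrase, and emails from a person of interest), here is a minimal Python sketch of phrase search combined with metadata and tag filters. It is an illustration only and does not describe how LEARai or any other product works. The `ArchivedEmail` record, the `ArchiveIndex` class, and the sample messages are all hypothetical. Records are frozen dataclasses as a stand-in for the immutability requirement.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ArchivedEmail:
    """Hypothetical archived record; frozen so it cannot be modified after ingest."""
    doc_id: int
    sender: str
    year: int
    body: str
    tags: frozenset = field(default_factory=frozenset)


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def contains_phrase(tokens, words):
    """Return True if `words` appears as a consecutive run inside `tokens`."""
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))


class ArchiveIndex:
    """Toy inverted index: maps each word to the IDs of the records containing it."""

    def __init__(self, records):
        self.records = {r.doc_id: r for r in records}
        self.postings = defaultdict(set)
        for r in records:
            for token in tokenize(r.body):
                self.postings[token].add(r.doc_id)

    def search(self, phrase=None, sender=None, tag=None):
        """Return sorted IDs of records matching every filter that was supplied."""
        ids = set(self.records)
        words = tokenize(phrase) if phrase else []
        if words:
            # Narrow the candidates with the index, then confirm the word order.
            for word in words:
                ids &= self.postings.get(word, set())
            ids = {i for i in ids
                   if contains_phrase(tokenize(self.records[i].body), words)}
        if sender:
            ids = {i for i in ids if self.records[i].sender == sender}
        if tag:
            ids = {i for i in ids if tag in self.records[i].tags}
        return sorted(ids)


# Hypothetical sample data, for illustration only.
ARCHIVE = [
    ArchivedEmail(1, "alice@example.com", 2012,
                  "Please review the merger agreement before Friday.",
                  frozenset({"legal"})),
    ArchivedEmail(2, "bob@example.com", 2015,
                  "The agreement on the merger is still pending.",
                  frozenset({"legal"})),
    ArchivedEmail(3, "alice@example.com", 2019,
                  "Lunch on Friday?",
                  frozenset({"personal"})),
]

index = ArchiveIndex(ARCHIVE)
print(index.search(phrase="merger agreement"))          # [1]
print(index.search(sender="alice@example.com"))         # [1, 3]
print(index.search(phrase="friday", sender="alice@example.com", tag="legal"))  # [1]
```

Record 2 contains both words but not the exact phrase, so it is excluded from the first query. A production archive would rely on a dedicated search engine and access controls rather than an in-memory dictionary; the sketch only shows how full-text matching, metadata, and tags can narrow a result set together.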
Further, incorporating AI into search compounds its value, opening the door to features such as:
*Automated categorization of historical communications.
*AI-driven pattern recognition across archived documents.
*Intelligent search that understands context, not just keywords.
*Predictive analytics to surface relevant archived content.
*Machine learning tools that improve search results as an archive grows.
*Natural language processing for conversational analysis of all the data in an archive.
Beyond Search: AI and Active Archives
Beyond search capabilities, AI offers further benefits to active archives. To achieve these benefits, David Boland, Vice President of Cloud Strategy at Wasabi, stressed that active archives have to be fully integrated into the overall AI data pipeline. This requires a concerted effort among the various stakeholders throughout the active archive and storage vendor world to execute the vision of AI-enabled storage.
Boland offered an example to highlight the potential. One customer took advantage of a Wasabi HDD and SSD solution with AI capabilities. This organization had 6 PB of data consisting of 30 billion images. Every 18 months, it created a new model and fine-tuned the overall results.
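For a sense of scale, the figures in this example imply an average of roughly 200 KB per image. The short calculation below shows the arithmetic. It assumes decimal petabytes (10^15 bytes), since the article does not say whether PB or PiB is meant, and the variable names are illustrative.

```python
# Assumption: 1 PB = 10**15 bytes (decimal); the source does not specify PB vs PiB.
PETABYTE = 10**15

total_bytes = 6 * PETABYTE    # 6 PB of stored data
image_count = 30 * 10**9      # 30 billion images

avg_bytes = total_bytes / image_count
print(f"{avg_bytes / 1000:.0f} KB per image")  # 200 KB per image
```

In other words, the workload consists of billions of small objects rather than a modest number of large files.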
“The goal was to make each succeeding model more effective and less expensive while providing better inference performance,” said Boland. “We also had to store all raw data and archive each retired model for compliance. Additionally, the customer demanded a scalable S3 architecture and high availability.”
That these demands could be satisfied by an active archive design highlights how storage and archiving must evolve to keep pace with AI. Modern active archive architectures provide scalability, availability, security, high performance, resilience, and simplicity. These features must be maintained and further enhanced to fully serve the requirements of AI storage.