One of the fundamental challenges in web archival is the complexity of the web. Paper archival, and even email archival, deal with formats that are basically the same from one item to the next. Web archival, including the archival of social media platforms, has to cover multiple, sometimes dozens of, technologies and modes of operation.
When the Lab approached this problem, the solution wasn’t to try to build a technology that auto-detects and solves every variation before it appears; that’s not a sustainable, viable, or necessary approach.
SmartCrawl™, released into Cloud Preservation today, is an architecture that provides a framework for custom-configuring how an individual data source (a “feed” in CP) is captured (a “crawl”).
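To make the idea of per-feed crawl configuration concrete, here is a minimal sketch in Python. It is purely illustrative: `CrawlOptions`, `Feed`, `configure`, and every option name below are hypothetical and are not taken from the SmartCrawl or Cloud Preservation product. The sketch only shows the general pattern of starting from shared defaults and overriding a few options for each feed.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class CrawlOptions:
    """Hypothetical crawl settings; these are NOT the product's real option names."""
    max_depth: int = 2               # how many links deep to follow
    render_javascript: bool = False  # whether pages need a scripted browser
    requires_login: bool = False     # whether the source sits behind authentication
    user_agent: str = "desktop"      # e.g. "desktop" or "mobile"


@dataclass(frozen=True)
class Feed:
    """A single data source, with its own crawl options (illustrative only)."""
    name: str
    url: str
    options: CrawlOptions = field(default_factory=CrawlOptions)


def configure(feed: Feed, **overrides) -> Feed:
    """Return a copy of the feed with the selected crawl options overridden."""
    return replace(feed, options=replace(feed.options, **overrides))


# A simple site keeps the defaults; a social profile overrides a few of them.
blog = Feed("company-blog", "https://example.com/blog")
social = configure(
    Feed("social-profile", "https://example.com/social"),
    render_javascript=True,
    requires_login=True,
    max_depth=1,
)
```

In this sketch each feed carries only the differences from the defaults, so handling a new kind of source means adding an option rather than a new crawler.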
This major release adds functionality to deliver highly customized, forensically sound, automated data capture from cloud-based content of all kinds, including the public web, private firewalled data, the social web, the mobile web, and B2B SaaS.
Combined, this evolution is already delivering a tsunami of new data stores, which are necessary for unlocking value in the corporate environment and, ultimately, for deepening and enriching the relationships brands have with their customers.
By improving the quality of the preservation, we’re letting our customers do what they do best – focus on their customers – by answering the business challenges around managing this data.