According to the recent article, “‘Snapshots’ cannot accurately archive gov 2.0 content, says Navy official,” taking a snapshot or screen capture is an incomplete method of archiving Internet data.
We could not agree more. Organizations need a complete, indexed, search-enabled, forensically sound archive of their social media and web content. That is exactly what we designed Cloud Preservation to do.
Very simply, here’s what Cloud Preservation does and why we developed it this way:
1. Capture the first page of off-site links
- Capturing the first page of off-site links ensures a complete archive. The most important piece of evidence may be on a third-party site. For example, the YouTube video that a Facebook post links to may be your “smoking gun,” but there is no way to screen-capture a video and print it to PDF to produce as evidence.
2. Capture all metadata
- Preserving metadata, including time, author and outbound links, is the only way to guarantee a complete and relevant archive. The time a tweet was sent might be the most relevant piece of data.
3. Preserve embedded native files (PDFs, Word documents, Excel spreadsheets, PowerPoint presentations)
- Leaving native files out of preservation is a huge mistake. As our CEO Rakesh put it in his recent article in ARMA Information Management Magazine, “it’s akin to archiving e-mails without saving attachments.”
4. Real-time capture
- Capturing in real time is especially important for social media. Real-time capture lets the user preserve information that is later deleted from social media sites, such as incriminating or damaging tweets that were sent out and then taken down minutes later once the consequences were realized.
5. Pull data from the social media site’s or website’s API
- Making use of the available API (application programming interface) from social media properties like Facebook and Twitter is the only way to accurately preserve data. If you are a relatively active Facebook user, you may have noticed that refreshing a page can display new information. This is driven by an algorithm that shows users what it believes they “want” to see. Using the API allows for a complete capture rather than only what Facebook “thinks” is relevant at the time.
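To make points 2 and 5 a little more concrete, here is a minimal Python sketch of what a metadata-preserving archive record could look like. It is an illustration only, not Cloud Preservation’s actual implementation: the `post` dictionary and its keys (`id`, `author`, `created_at`, `text`), the function names, and the use of a SHA-256 digest as a tamper check are all assumptions made for the example, and no real Facebook or Twitter API call is shown.

```python
import hashlib
import json
import re
from datetime import datetime, timezone
from typing import Optional

# Simple URL matcher for illustration; it may include trailing punctuation.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def _digest(fields: dict) -> str:
    """Return a SHA-256 hex digest over a canonical JSON form of the fields."""
    canonical = json.dumps(fields, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_record(source: str, post: dict,
                 captured_at: Optional[datetime] = None) -> dict:
    """Build an archive record from a post as returned by some API.

    `post` is a hypothetical dict with keys: id, author, created_at, text.
    """
    captured = captured_at or datetime.now(timezone.utc)
    record = {
        "source": source,                    # e.g. "twitter" or "facebook"
        "post_id": post["id"],
        "author": post["author"],
        "posted_at": post["created_at"],     # when the post was published
        "text": post["text"],
        "outbound_links": URL_PATTERN.findall(post["text"]),
        "captured_at": captured.isoformat(), # when we archived it
    }
    # The digest covers every field above, so later edits are detectable.
    record["sha256"] = _digest(record)
    return record


def verify_record(record: dict) -> bool:
    """Recompute the digest and compare it with the stored one."""
    fields = {k: v for k, v in record.items() if k != "sha256"}
    return record.get("sha256") == _digest(fields)
```

In this sketch, the outbound links collected in `outbound_links` are what a crawler would follow to capture the first page of off-site content (point 1), and `verify_record` returns `False` if any preserved field is altered after capture.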
For a more detailed look at how to accurately preserve social media, see “10 Things to Know about Preserving Social Media” by our CEO Rakesh Madhava, published in the Sept/Oct issue of ARMA Information Management Magazine.