Answering the question, “What do you do?”

Invariably after I meet someone, one of the first questions after “what’s your name,” and perhaps questions about my family, is “what do you do?” In my quasi-rebellious youth, I might have answered “I do lots of things,” and regaled the new acquaintance with tales of independent film making, travel, surfing or other adventures. This was primarily because I was usually jobless, and saying, “uh, I’m unemployed” made for a conversation killer.

Those were the days…

Now I find myself struggling to answer that question, not for lack of a career, but because my career is in a dynamic, esoteric field that few people outside of it understand. The quick answer to “what do you do?” would be that I’m a project manager at Nextpoint. But given that Nextpoint is a player in a quickly evolving industry, even that is a moving target.

What is the “Hotseat?”

In 2006 when I started at Nextpoint, the company specialized in providing support for trials, and also had a little web-based application that could be used to designate deposition testimony. We provided top-notch service on many high-stakes trials, creating demonstrative graphics that could be shown to a jury to help the attorneys get their points across. We would take the information, develop visual concepts, and create graphics to explain complex material in a simple visual way. I would often help out on the creative side of things, but my primary duty was that of a “hotseater.”

The hotseater sits in court with the trial team with all of the graphics, trial exhibits, and other multi-media on a computer (and a back-up computer – we hotseaters are a paranoid bunch). When attorneys ask questions of a witness they often need to show documents or graphics. That’s where we come in. The hotseater is asked to pull up documents or files to be displayed on monitors and projector screens throughout the courtroom. As they go through the documents, the hotseater will zoom in on important areas, highlight key text, and create other annotations to help pull the jury’s focus to the appropriate areas. It seems very simple, and yet it can be a terribly stressful endeavor. With the sheer volume of documents and the endless ways they could be asked for, a hotseater’s brain becomes saturated by information that must be recalled instantly, and often without warning. Hotseating is a live performance in which everyone in the courtroom is watching what you are doing, and if a document takes 10 seconds to appear on the screen, it seems like an eternity.

Another source of stress for the hotseater is the anxiety and uncertainty of making sure every necessary document is accounted for. Despite the best efforts to gather every document that could conceivably be needed, invariably one creeps in that had never been mentioned before, and the hotseater is left scrambling to find it as the jury sits waiting. In the past I would get new items delivered to me in court via a USB drive or CD-R. However, if I thought I might need something, it was always difficult to get word to the appropriate person to deliver it. Now, more and more, courts are either allowing or providing Internet access in the courtroom. The ability to email a paralegal or fellow Nextpointer to ask for a document relieves a great deal of the stress. But what happens when the file is too big to email? Enter Nextpoint’s Trial Cloud.

Trial Cloud, All Growed Up

Now remember earlier when I mentioned that back in 2006 we had a little web-based application that could be used to designate deposition testimony? Well, that little application has grown into a ground-breaking, robust SaaS platform that has become an indispensable tool in our trial support arsenal. With Trial Cloud, we have the ability to host all the documents that might be used in a trial in a fully indexed and searchable database. Since it is web-based, any changes or additions made by one of my colleagues back in the warroom (that’s what we trial support folks call our trial offices) are instantly accessible to me in court. Got a new 50 MB PowerPoint file that I need ASAP? Just upload a new version and click the email notification, and I’ll be able to download it in court in moments. Is an attorney struggling to recall a document? Just pop in a few search terms and I can find it. Along with Trial Cloud, Nextpoint now offers a couple of other products: Discovery Cloud (our native file processing and review tool) and Cloud Preservation (our social media and website archiving platform).

Now that I am also involved in training and support for our products, I have fundamentally changed my answer to “what do you do?” But even more so, it seems the question has changed to “how do you do it?”

Organizations have an increasing need to preserve data from websites and social media platforms due to a growing body of regulatory requirements, legislation such as Dodd-Frank and the Freedom of Information Act, and general e-discovery readiness.

While it may be desirable to treat websites like other file types for the purposes of archiving, there are some critical differences inherent in the dynamic nature and unique architecture of the Internet that necessitate additional steps in order to ensure a complete and accurate archive.

Nextpoint believes there are three core requirements in building a website archive for legal and compliance purposes. First and foremost, the archive should include all original unaltered source files including HTML, images, video, CSS (style sheets), JavaScript, linked files such as PDFs, and any other data referenced or linked to on the site.

The second requirement is having the ability to search, review, produce, and generally utilize the content of an archived website. These two requirements are common to any archived data repository, but the unique architecture of the web introduces a third requirement.

To archive a web page in its entirety on a given day means including content from related pages as well as third-party servers, such as video providers like YouTube or social media streams from Twitter. The Cloud Preservation platform offers a thorough approach that results in a viable strategy for tackling all three core preservation requirements.

Original, unaltered source files

First, source files must be preserved in their original, unaltered format. This means that it’s necessary to save all original files that were used to create the site or that were included on the site.  For example, Cloud Preservation captures the native HTML file (including source code, developer comments, etc.) representing the code that was running the website on any particular day as well as any native file types such as a video file or perhaps a PDF document or Excel worksheet if these were available for download from the site.
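To make "original, unaltered" a little more concrete, here is a minimal sketch of one common way to store a captured file byte-for-byte and later confirm it has not changed. It is an illustration only, not a description of how Cloud Preservation works internally; the function names, the record layout, and the example URL are all assumptions made for this example.

```python
import hashlib
from datetime import datetime, timezone


def make_archive_record(url: str, raw_bytes: bytes) -> dict:
    """Build a record that keeps a captured file exactly as received.

    The record layout is hypothetical; the point is that the bytes are
    stored untouched alongside a digest computed at capture time.
    """
    return {
        "url": url,
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(raw_bytes).hexdigest(),
        "content": raw_bytes,  # stored as-is: no re-encoding, no cleanup
    }


def is_unaltered(record: dict) -> bool:
    """Re-hash the stored bytes and compare with the capture-time digest."""
    return hashlib.sha256(record["content"]).hexdigest() == record["sha256"]


# Developer comments and all other source details survive because the
# bytes are never parsed or rewritten before storage.
page = b"<html><!-- developer note --><body>Rates updated daily</body></html>"
record = make_archive_record("https://example.com/rates", page)
print(is_unaltered(record))
```

If even a single byte of the stored content were changed after capture, the recomputed digest would no longer match and the check would fail.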

Intuitively it would seem that these source files make it possible to “go back in time” and browse a website in the exact format in which it was originally published, but in reality this is not the case. There are numerous technical hurdles that prevent recreating the website at a point in time, including real-time data from third-party servers and back-end data accessed via JavaScript or AJAX.

Additionally, hyperlinks, which are core to the architecture of the Internet, would need to be modified to link to their respective target pages in the archive.  While some website archival strategies attempt to build and store a browse-able version of a website, they in fact have to modify the core source files by changing links and attempting to make static versions of dynamic website components.  This does not result in an archive of the original, unaltered source files.
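The point about link rewriting can be shown with a small sketch. The rewriting rule, the archive path, and the example markup below are hypothetical and deliberately simplistic; the only thing the example is meant to show is that a page whose links have been redirected to an archive is no longer the same file as the one that was published.

```python
import hashlib
import re


def rewrite_links(html: str, archive_prefix: str) -> str:
    """Point absolute href targets at a (hypothetical) archive location."""
    return re.sub(
        r'href="(https?://[^"]+)"',
        lambda match: f'href="{archive_prefix}{match.group(1)}"',
        html,
    )


def digest(text: str) -> str:
    """SHA-256 digest of the text, used here as a fingerprint of the file."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original = '<a href="https://example.com/about">About</a>'
browsable = rewrite_links(original, "/archive/2011-06-01/")

# The browsable copy is convenient, but its fingerprint differs from the
# original, so it cannot stand in for the unaltered source file.
print(digest(original) == digest(browsable))  # prints False
```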

Review, export, and production

Organizations have numerous and diverse needs for historical copies of websites largely centered around legal and regulatory compliance.  For these purposes, website data often needs to be searched and produced, printed, or exported in a usable format.  While Cloud Preservation stores all original source files and makes them easily available for export, it also creates an image rendering (similar to a screen capture) with a forensically sound timestamp for each page of a website and inserts all text from the page into a powerful search engine.
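As a rough illustration of what "inserting all text from the page into a search engine" involves, the sketch below pulls the visible text out of each page and builds a tiny inverted index from words to page URLs. It is a toy built on Python's standard library, with made-up page content, and makes no claim about the search technology Cloud Preservation actually uses.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def build_index(pages: dict) -> dict:
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, html in pages.items():
        extractor = TextExtractor()
        extractor.feed(html)
        text = " ".join(extractor.chunks).lower()
        for word in re.findall(r"[a-z0-9]+", text):
            index[word].add(url)
    return index


# Hypothetical archived pages, keyed by URL.
pages = {
    "https://example.com/": "<h1>Mortgage rates</h1><script>var x = 'hidden';</script>",
    "https://example.com/about": "<p>About our lending team</p>",
}
index = build_index(pages)
print(sorted(index["rates"]))
```

A real search engine adds ranking, phrase matching, and much more, but the basic idea of turning rendered page text into something queryable is the same.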

When data needs to be reviewed or produced to another party, this image rendering is a much more practical way of providing the data than the original or modified HTML source files along with CSS, JavaScript, and multimedia files for each requested web page. Cloud Preservation offers the capability to easily export images or PDFs of the web pages, making for more realistic and accurate representations of the website in a format that is widely accepted and preferred for legal productions. The export also includes metadata files (or “load files”) that contain all related data and search text.

Third party data

Unlike most documents, websites are aggregations of content provided by multiple live sources on the web. For example, a web page might show a video that is hosted on YouTube or Vimeo, or perhaps it shows recent posts from a blog or recent updates from a Twitter feed. For a financial organization, a website might display real-time loan rates or stock quotes, all coming from a third-party server. Nearly all websites include hyperlinks to other sites, which must also be archived.

In these cases, the data is not actually included in the underlying HTML source files but is brought in, via a technology called AJAX, directly to the browser from the third party.  As a result, there is potentially critical data that an organization would be unable to reproduce or render at a later date if needed. From a compliance standpoint, this is akin to saving an email without the attachment.
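To see how much of a page lives somewhere else, the following sketch scans a page's HTML for embedded resources served from a different host than the page itself. The class name, the list of tags checked, and the example URLs are assumptions made for illustration. Note that this only finds references declared in the markup; data that a script fetches at run time never appears in the source at all, which is exactly why saving the HTML alone falls short.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ExternalResourceFinder(HTMLParser):
    """List embedded resources that come from a host other than the page's."""

    EMBED_TAGS = ("script", "iframe", "img", "video", "embed")

    def __init__(self, page_url: str):
        super().__init__()
        self.page_url = page_url
        self.page_host = urlparse(page_url).netloc
        self.external = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.EMBED_TAGS:
            return
        src = dict(attrs).get("src")
        if not src:
            return
        # Resolve relative paths against the page URL before comparing hosts.
        absolute = urljoin(self.page_url, src)
        if urlparse(absolute).netloc != self.page_host:
            self.external.append(absolute)


# Hypothetical page: one local image, one embedded video, one rates widget.
html = (
    '<html><body><img src="/logo.png">'
    '<iframe src="https://www.youtube.com/embed/VIDEO_ID"></iframe>'
    '<script src="https://rates.example.net/widget.js"></script>'
    "</body></html>"
)
finder = ExternalResourceFinder("https://www.example.com/")
finder.feed(html)
print(finder.external)
```

Here the local logo is ignored, while the embedded video and the rates widget are both reported as coming from third-party servers.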

Cloud Preservation resolves this problem by capturing an image of how the page rendered in a browser at the time of archival, along with the full content and text of the page after rendering. And for good measure, a forensically sound timestamp is included on that rendering. When you search for web pages in Cloud Preservation, the third-party content will be included. Additionally, all linked pages and documents, including external sites and files such as PDFs or Office documents, are captured, providing a comprehensive view that includes not only what was on the website on a given day, but also a complete picture of the third-party resources that were utilized. These related links can be accessed under a tab on the image preview of any given page. Clicking a link will take the user to the archived version of the external data as it appeared that day, allowing the user to essentially “recreate” the navigation experience.

A Complete Approach to Preservation

The unique architecture and connectedness of the web means that if you want to browse a website exactly as it was at some point in the past, you would need more than an archival tool; you’d need a time machine. That said, Cloud Preservation takes a complete approach that not only preserves original unaltered source files, but also preserves the entire visual experience along with all text and related content for each page.

