Document review is one of the most expensive, time-consuming and important tasks in law. When faced with large volumes of unstructured data to review on a tight timeframe, lawyers often lean heavily on outsourced eDiscovery partners for the knowledge and manpower to streamline the process. Typically, an outsourced eDiscovery partner will first pull relevant documents using keyword searches. Indeed, the vast majority of matters in the analysis phase of eDiscovery are easily handled through intelligent application of keyword searches. However, there are times when busy lawyers must urgently get to the core of a huge data set to understand what needs to be reviewed and start reviewing.
Relativity, a long-time leader in the eDiscovery field, developed Relativity Analytics, which combines visual data analysis and active learning technology to provide structured and conceptual search functionality that works from ideas and concepts. Rather than matching specific keywords or character strings as traditional searches do, Relativity Analytics identifies critical documents in a case by searching and organizing them using a predetermined index that identifies similar ideas or concepts within a document set. Results depend on how and where similar ideas and concepts intersect.
For reference, Relativity Analytics can be broken down into two subsets: structured analytics and conceptual analytics. Structured analytics operations analyze text to identify the similarities and differences between the documents in a set. After using structured analytics to group documents, Relativity can run conceptual analytics to identify conceptual relationships present within them. For instance, a project manager or review team can identify which topics contain certain issues of interest, which contain similar concepts, and/or which contain various permutations of a given term.
Using structured analytics, project managers can quickly assess and organize a large, unfamiliar set of documents to shorten a review team’s review time, improve coding consistency, optimize batch set creation, and improve Analytics indexes. Common structured analytics tasks include email threading, textual near duplicate identification and language identification:
Email threading - Email threading greatly reduces the time and complexity of reviewing emails by gathering all forwards, replies, and reply-all messages together. Email threading identifies email relationships, and then extracts and normalizes email metadata. Email relationships identified by email threading include:
Textual near duplicate identification - While textual near duplicate identification is simple to understand, the implementation is very complex and relies on several optimizations so that results can be delivered in a reasonable amount of time. The following is a simplified explanation of this process:
Language identification - Examines the extracted text of each document to determine the primary language and up to two secondary languages present. This allows you to see how many languages are present in the collection, and the percentages of each language by document. The project manager can then easily separate documents by language and batch out files to native speakers for review. The operation analyzes each document for the following qualities to determine whether it contains a known language:
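To make the textual near duplicate idea described above more concrete, here is a minimal sketch of one common textbook approach: break each document into overlapping word sequences ("shingles"), measure overlap with Jaccard similarity, and group documents around the largest one. This is an illustration only and is not Relativity's implementation; the function names, the shingle size `k`, the 0.8 threshold and the largest-first grouping order are all assumptions made for the example.

```python
def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Share of shingles two documents have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_near_duplicates(docs, threshold=0.8, k=3):
    """Group documents (id -> text) around 'principal' documents.

    Hypothetical greedy scheme: documents are visited longest first;
    each one joins the first group whose principal it resembles at or
    above the threshold, otherwise it starts a new group.
    """
    order = sorted(docs, key=lambda d: len(docs[d].split()), reverse=True)
    sets = {d: shingles(docs[d], k) for d in docs}
    groups = {}  # principal id -> list of member ids
    for d in order:
        for principal in groups:
            if jaccard(sets[d], sets[principal]) >= threshold:
                groups[principal].append(d)
                break
        else:
            groups[d] = [d]
    return groups


if __name__ == "__main__":
    sample = {
        "A": "the quick brown fox jumps over the lazy dog near the river bank today",
        "B": "the quick brown fox jumps over the lazy dog near the river bank",
        "C": "quarterly revenue figures are attached for your review",
    }
    print(group_near_duplicates(sample))
```

In this sample, documents A and B differ by a single trailing word and land in the same group, while C starts its own. Comparing every document against every principal does not scale to large collections, which is why production systems depend on the kind of optimizations mentioned above.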
Conceptual analytics helps organize and assess the semantic content of large, diverse and/or unknown sets of documents. Unlike structured analytics, which relies on the specific structure of the content, conceptual analytics focuses on related concepts within documents, even if they don’t share the same key terms and phrases. Common features of conceptual analytics are clustering and active learning, which can cut down on review time by letting the team assess a document set more quickly.
Clustering - Analytics uses clustering to create groups of conceptually similar documents. With clusters, project managers can identify conceptual groups in a workspace or subset of documents using an existing Analytics index. Unlike categorization, clustering doesn’t require much user input. Clusters can be created based on selected documents without requiring example documents or category definitions.
Active Learning - Active Learning is an application that runs continuously updated cycles of documents for review, based on review strategy. The advantages of Active Learning include real-time intelligence, efficiency, flexibility and integration with all the power of the Relativity platform.
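The "continuously updated cycles" behind active learning can be pictured with a small sketch: a model is retrained on every coding decision made so far, the unreviewed documents are re-ranked, and the highest-ranked ones are served to the reviewer next. The code below is a toy illustration under stated assumptions, not Relativity's Active Learning application; the word-count scoring model, the function names and the batch size are all hypothetical.

```python
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(coded):
    """Build word weights from (text, is_relevant) coding decisions.

    Toy model: +1 for each relevant document containing the word,
    -1 for each non-relevant document containing it.
    """
    weights = Counter()
    for text, relevant in coded:
        for word in set(tokenize(text)):
            weights[word] += 1 if relevant else -1
    return weights


def score(weights, text):
    """Sum the weights of the distinct words in the text."""
    return sum(weights[word] for word in set(tokenize(text)))


def review_loop(docs, reviewer, seed, batch_size=1):
    """Serve documents in cycles, retraining before each batch.

    docs: id -> text; reviewer: callable returning True if relevant;
    seed: initial (text, is_relevant) decisions.
    Returns the order in which document ids were served.
    """
    coded = list(seed)
    unreviewed = dict(docs)
    served = []
    while unreviewed:
        weights = train(coded)  # model updated with the latest coding
        ranked = sorted(unreviewed,
                        key=lambda d: score(weights, unreviewed[d]),
                        reverse=True)
        for doc_id in ranked[:batch_size]:
            text = unreviewed.pop(doc_id)
            coded.append((text, reviewer(text)))
            served.append(doc_id)
    return served


if __name__ == "__main__":
    documents = {
        "d1": "merger pricing agreement draft",
        "d2": "lunch menu friday",
        "d3": "pricing agreement signed",
        "d4": "parking notice",
    }
    seed = [("merger pricing memo", True), ("office lunch schedule", False)]
    print(review_loop(documents, lambda t: "pricing" in t, seed))
```

With this sample data the two pricing documents are served first, because each coding decision raises the weight of the words they share. A real system would use a far stronger classifier, but the cycle of code, retrain and re-rank is the same idea.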