The legal industry’s AI landscape, part 2: document work

Apr 26, 2017

It’s been a few weeks since part 1 of my series on artificial intelligence in the legal industry. If only AI could improve my writing habits. As promised, in this post I’ll share more detail on the first of the four AI categories I introduced: document work. This is the broadest, busiest, and best understood category in the legal industry today when it comes to application of artificial intelligence(ish) technologies, and I break it down further into three subcategories of usage:

Electronic discovery: this is where real use of AI technology began in the legal industry, growing out of document search software used to accelerate the review and classification of increasing volumes of digital information during the discovery process in litigation. Less than 10 years ago, terms like “predictive coding” and “technology assisted review” were little more than buzzwords. Today, they are the standard for e-discovery and early case assessment.

Contract review and analysis: similar in many ways to e-discovery tools, tools in this category are used to search and analyze large volumes of contracts (or other legal documents) for due diligence, compliance audits or to support new systems or processes. Some tools extract information such as terms, provisions, parties or specific clauses. They effectively transform volumes of unstructured textual content into structured information that can then be further analyzed, compared or used in any number of ways downstream.

Document drafting: Though not all drafting tools strictly use artificial intelligence, I include this subcategory to represent the new tools that do. By adding document analysis capabilities similar to those in the other subcategories, these tools are becoming smarter at guiding or assisting legal professionals and consumers while drafting new documents.

While these categories represent different areas of legal work performed by legal professionals with varying skill levels, they share one thing in common: the ability to read, understand and in some cases, to write documents. This is an area where AI has progressed significantly in the last several years, and where I believe we’ll see increasingly significant impact on the legal profession.

What’s so intelligent about these tools?

I was tempted to expand this series and go “under the covers”, explaining and perhaps demystifying the technologies that make them seemingly intelligent. But that is a deep, dark rabbit hole we’ll explore another day. Instead, I’ll summarize in general terms and spare you the techie details. If you are desperate for some tech talk, I’ll point to a few of my favorite resources with more detail.

Generally speaking, these document-related tools use technology from the broad field of natural language processing, or NLP. NLP can be described as a branch of AI that deals with analyzing, understanding or generating written or spoken languages that humans use naturally. When you ask Siri for directions or Alexa for the latest news, NLP technology is being used to understand your question and to respond appropriately; or when you use Google, behind the scenes NLP is grinding away to parse and understand your query to serve up the best answer. Try asking Google what the weather in Madrid will be this weekend and you’ll see that a lot more is going on than listing search results. All thanks to advancements in NLP.

NLP can also be broken down further into sub-fields such as named entity recognition, sentiment analysis, classification and topical categorization, and many more, each of which utilizes a variety of complex algorithms, mathematical models and, increasingly, machine learning (ML). But it all amounts to the same thing: programming computers to read, listen to, understand and respond to human language like humans do… hence, artificial intelligence. If you are brave enough to dig deeper, check out Stanford’s video series from their NLP course or Google’s video series on machine learning.

I’ll also mention that NLP and ML tools and resources are generally more accessible today than many people think. Big platform companies such as Google, IBM, Microsoft, Amazon and even Thomson Reuters each offer readily available AI tools; there is a robust open source community; and, of course, there are great resources from academia, even if a bit, well, academic. Follow the links above or see Stanford’s NLP group or Apache OpenNLP for more NLP geek-ness.

How are they actually used?

No offense intended, but I’m quite certain that many people who utter the phrase “artificial intelligence” have little idea what it means or what AI actually does. I’m not suggesting this is due to a lack of natural intelligence. Clearly AI is a broad, complex and often obscure field and it’s challenging to distinguish fake news from real these days.

So, in an effort to make things practical I’ll expand on how AI technology is applied in the field of electronic discovery today.

As many of my readers know, electronic discovery refers to the process in which electronically stored information, or ESI, is sought, located, secured, and searched with the intent of using it as evidence in a civil or criminal legal case. It can be a complex and costly process, involving many stages and often enormous volumes of information, electronic or otherwise.

There are e-discovery tools in use today that leverage AI to make the process much more efficient and, in many cases, more accurate. They are literally faster, better and cheaper than the human-only approach of slogging through huge volumes of agreements, memos, emails and other documents on screen or in boxes stacked in conference rooms, trying to determine whether or not a particular document is relevant to the case at hand.

Most of these tools today work by first being “taught” by human reviewers who provide sample documents, sometimes called “seed” or “training sets”, that they’ve already classified or labeled as relevant to a case. Usually these tools require hundreds of sample documents to effectively “learn” from, but this is typically only a small fraction of the total documents to be reviewed.

Then the AI magic happens. The software uses the sample documents to identify patterns, words, phrases and other indicators (some of which plain-old humans might not notice) to determine how subsequent documents should be classified. Then the balance of the documents is run through the software so that it can classify them with little to no human interaction, potentially eliminating hundreds or thousands of costly paralegal or associate hours. This ability is usually referred to as predictive coding, and the overall process of humans teaching and feeding the system while reviewing the results is referred to as technology assisted review, or TAR. Use of software that provides predictive coding and TAR is now considered the standard in e-discovery efforts.
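For the curious, here is a toy sketch of the teach-then-classify idea in Python. It is not how any particular e-discovery product works: I am assuming a simple naive Bayes word-count model, and the sample documents, the labels and the function names (tokenize, train, classify) are all made up for illustration. Real tools learn from hundreds of reviewer-labeled documents and use far richer signals.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(seed_set):
    """Count words per label from reviewer-labeled (text, label) pairs."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in seed_set:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocabulary

def classify(text, model):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        # Start from how common the label was in the seed set.
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen during training
            # Add-one smoothing so an unseen word/label pair is not fatal.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical seed set: documents a human reviewer has already labeled.
seed_set = [
    ("Please wire the payment before the audit begins", "relevant"),
    ("The invoice totals were adjusted after the audit", "relevant"),
    ("Lunch is at noon on Friday", "not_relevant"),
    ("The office party is moved to Friday", "not_relevant"),
]
model = train(seed_set)

# The remaining, unreviewed documents are then classified by the model.
print(classify("Was the invoice adjusted before the payment?", model))
print(classify("Is the party still on Friday?", model))
```

In a TAR workflow, a reviewer would spot-check predictions like these, correct the mistakes, and feed the corrections back in as additional training documents.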

Though this was only a brief overview of e-discovery and how new AI-powered tools have changed the process forever, I hope it helps you understand how advancements in AI have impacted, and will continue to impact, how we work. If you want to learn more about e-discovery and related technology, check out Exterro’s excellent overview or Everlaw’s description of their predictive coding capabilities.

Leveraging AI tools and technologies similar to those used in the discovery process, vendors are now tackling new challenges such as making due diligence more efficient or automating and accelerating review and analysis of large volumes of contracts and other documents. Like discovery, these activities have traditionally been time- and resource-intensive (and expensive), which makes them ripe for change. I suspect tools in this category and the various ways in which they can be used will grow at an accelerated pace over the next few years.
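To make “turning unstructured text into structured information” a little more concrete, here is a deliberately simple Python sketch. The contract excerpt, the field names and the patterns are all invented for illustration, and hand-written patterns like these are a stand-in: commercial contract analysis tools generally rely on trained NLP and machine learning models, which cope with wording the pattern author never anticipated.

```python
import re

# Hypothetical contract excerpt, invented for this example.
CONTRACT = (
    'This Loan Agreement is entered into as of March 1, 2017 between '
    'Acme Lending LLC ("Lender") and Widget Works Inc. ("Borrower"). '
    'The principal amount is $250,000 and this Agreement shall be governed '
    'by the laws of the State of New York.'
)

# One illustrative pattern per field; each captures the value in group 1.
PATTERNS = {
    "effective_date": r'as of ([A-Z][a-z]+ \d{1,2}, \d{4})',
    "lender": r'between (.+?) \("Lender"\)',
    "borrower": r'and (.+?) \("Borrower"\)',
    "principal": r'principal amount is (\$[\d,]+)',
    "governing_law": r'governed by the laws of (?:the State of )?([A-Z][A-Za-z ]+?)\.',
}

def extract_fields(text):
    """Return a dict of field name -> extracted value (None if not found)."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        fields[name] = match.group(1) if match else None
    return fields

fields = extract_fields(CONTRACT)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Once every contract in a portfolio has been reduced to a record like this, the records can be sorted, compared and audited like any other structured data.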

One sign of things to come may be JP Morgan’s Contract Intelligence program, or COIN. The company announced earlier this year that it has invested in software (that utilizes AI such as NLP and machine learning) that automatically interprets commercial-loan agreements, a task that, until the project went online in June, consumed 360,000 hours of work each year by lawyers and loan officers. The software reviews documents in seconds, is less error-prone and “never asks for vacation”. Nor does it ask for billing rate increases, I imagine.

The future role of AI in document work

I regularly speak with tech companies and new startups inside and outside the industry, and I often learn about new products and technologies that may have application in the legal industry. But I try to keep it real and focus on what is applicable today, given current business models and industry dynamics. This is not an easy industry to change, after all.

That said, there are a few tools that fit this category that I think may find traction in the near future. From a technology perspective, NLP includes generation of text and documents as much as understanding written or spoken language, and also includes tools like Apple’s Siri, Amazon’s Alexa or numerous “chat bots” that are appearing in apps and all over the web. Considering these other uses, here are a few areas to keep an eye on:

Document drafting and content generation. There are interesting companies focused on the natural language generation space, such as Narrative Science based in Chicago, that have made significant progress over the last few years in automatically generating content or summaries of volumes of data. We’ll see more activity in this area as much of the work we do in this industry is in the form of document drafting. Tools that streamline or outright automate the creation of legal documents or advice are already appearing. Take LexCheck for example, which claims to use AI to check contracts for consistency and compliance as you draft them.

And this isn’t limited to legal documentation. Imagine being able to rely on software to produce relevant, high-quality pitch materials that are hyper-targeted to specific segments or client opportunities, reducing the typical fire drill moments the evening before a meeting or pitch.

Speech recognition. I don’t know about you, but I find myself using Siri on my iPhone more and more every day. I recently purchased an Amazon Echo and suddenly the whole family is asking Alexa to play music, tell us about the weather and update us on current events. The technology is noticeably improving, and I think it’s inevitable that we’ll someday see a legal Alexa sitting on lawyers’ desks taking orders and answering questions. Makes me wonder what the Alexa-Lawyer ratio will be compared to the Admin-Lawyer ratio in 5–10 years.

Legal chat bots. We’re already seeing this with companies like Do Not Pay, the self-proclaimed “world’s first robot lawyer” that helps you with parking tickets, delayed flights or trains and other legal matters without having to deal with quirky human lawyers at all. As NLP technology improves and more vendors train and connect it to legal and other information, we’ll see more applications where the user interface is an online conversation instead of buttons and forms.

If I had offered these ideas even just a few years ago, I’d have been laughed at. Not so much today. It’s hard to predict where AI is headed, but it is clear the technology and computing power are progressing faster than ever before. In the end, though, understanding exactly what artificial intelligence is or how it works isn’t very relevant unless you are a data scientist or linguist. I’m sure most of us also don’t know exactly how advanced flight control systems operate in commercial airliners today, yet we board the plane nonetheless. But did you know that autopilot systems have been utilizing AI techniques for years?

Instead, ask yourself if artificial intelligence and new technologies will impact the legal industry. Clearly, they already are. And all parts of the legal ecosystem will be impacted at some point, in some way. We need to stay informed, but I encourage all of us to focus on problems to solve and not solutions looking for problems. And learning what problems artificially intelligent software is actually solving today is a good start.

The current landscape

“It’s difficult to create a map when the landscape around you is changing”, recently said a very smart person (OK, it was me). Still, I will share my current list of vendors in this category below, with a disclaimer: the list is based on my own knowledge and experience, and I am limiting it to vendors that are legal industry focused, have a demonstrable connection to the AI topic or that, in my opinion, are ones to watch. Some vendors may be listed twice as they offer multiple products that span categories. I encourage you to take a look at these vendors and their products, read their literature, and better yet, ask them to show you proof of value.

While my list may not be 100% complete or even accurate per others’ opinions, you gotta start somewhere. I’m happy to update as needed and welcome your feedback.

Contract Review and Analytics

Beagle
Diligen
Exari
eBrevia
Kira
KM Standards
LawGeex
Legal Robot
Legal Sifter
Leverton
Luminance
RAVN
Recommind (OpenText)
Seal
Thought River

E-Discovery

Catalyst
Disco
Everlaw
Exterro
kCura
Recommind (OpenText)

Document Drafting

ContractExpress (Thomson Reuters)
HotDocs
KM Standards
LexCheck

In part 3, I’ll take a closer look at artificial intelligence in legal research. As always, I’d love your feedback on how I can improve my content on these topics.