Last week we announced the unveiling of the Google Document Understanding AI (“DUAI”) platform.
But wait, there’s more!
Since last week we’ve discovered the Google Cloud Next ’19 conference video showcasing DUAI’s:
- high-level architecture;
- use cases, plus brief demos; and
- industry targeting.
It’s a 50-minute video so we’ve done our best to summarise what we saw, what we thought and our view of its likely impact and usage. Many questions remain, but it’s nonetheless a teaser of things to come!
Unsurprisingly, Google Document Understanding AI’s positioning is about turning unstructured data into structured data within the enterprise. This follows the well-trod narrative adopted by incumbents in this space.
By that we mean this narrative:
- the enterprise has huge volumes of unstructured data (see our piece explaining the difference between unstructured and structured data and why it matters);
- that unstructured data is growing exponentially;
- a lot of value is locked away inside that unstructured data;
- because it is unstructured, the enterprise cannot efficiently access or analyse that content; and
- as a result, huge amounts of time and money are spent manually converting unstructured data into structured data.
In practical terms Google says it’s about extracting and classifying key data from within unstructured content, using that to automate workflows and uncover insights via search and analytics. Or in simple terms:
We love the simplicity of the above! Google goes on to explain this is made possible via their flexible Knowledge Graph, allowing thematically similar use cases across different industry verticals:
As we noted in our previous article, this ad copy / positioning is very similar to the incumbents’ equivalent materials. This is unsurprising given the enormous and universal challenge that is managing and making sense of unstructured data. It also suggests Google will supply the bricks but not the builders / architects, i.e. leaving it up to users and their technical teams to build what they wish using the DUAI tooling.
Near the start, a high-level solution architecture diagram for Google Document Understanding AI is presented. As suspected in last week’s article, several components are borrowed from Google’s existing machine learning and search applications, components currently available via the Google Cloud Platform (“GCP”):
The presentation also suggests these tools can be combined to deliver many of the capabilities present in incumbent products:
- single / multi-label document classification
- entity extraction
- knowledge graphing
- semantic searching
- natural language queries
- consumption of common enterprise file formats, e.g. Word and PDF
The most interesting feature is the natural language Q&A function. This type of feature is rarely offered, let alone executed well, in incumbent products despite being the type of feature users often assume such products provide.
The brief demo of this feature (see below) looked impressive and has potentially great application for due diligence, search and knowledge management within legal contexts, as the Iron Mountain use cases later suggest.
So onto the use cases. Note, however, we’ve not summarised every use case in the video but only the ones most relevant to legal in terms of their comparison to incumbent legal AI extraction applications. Enjoy!
What we saw
A basic knowledge management database, including 48 Wikipedia documents spanning various topics. The thrust of the demo shows natural language search and responses much like those available via Google web search.
The categories of available response to a search query are threefold:
01. Keyword Match
Keyword matching. Rather basic, but has its purpose – e.g. searching for a particular phrase in a contract such as “Permitted Transaction”. Nothing startling, but necessary nonetheless.
02. Semantic Match
Answers match the meaning of the query, not just the keywords. In the demo, this is shown providing an answer to the query “very first steam engine” (the answer is 1712 – see the first screenshot below). In doing so it also includes a link to the sentence in the underlying document that provides the answer to this question.
Interestingly that type of feature (linking extracted answer to the source material) is common to all incumbent legal AI extraction tools.
When asked something semantically confusing – “Doctor Who is a person” – the semantic search remains able to return a meaningful answer, i.e. a brief description explaining who Doctor Who is. Pretty neat! (see the below screenshot)
03. Question Answering
Also shown is the ability to ask natural language questions such as “How many Doctor Who episodes have there been” and “When did Doctor Who first air”. The system answers these questions (863 and 1963 at 17:16 on the BBC respectively) with ease from data in the documents.
This type of searching could be very powerful in a legal context, e.g. to answer a question such as “When does the contract terminate”. However, the answering data in a contract will often be less clear than in a fact-by-fact Wikipedia article, e.g. a termination provision expressed as conditional on some other information located elsewhere in the document rather than as a specific termination date.
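To make the simplest of the three response types concrete, here is a minimal Python sketch of keyword matching that also links each hit back to its source sentence, as the demo does for its answers. It is purely illustrative: the two-document corpus, the function names and the naive sentence splitter are our own assumptions, and it says nothing about how DUAI implements search under the hood.

```python
import re

def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def keyword_match(documents, phrase):
    """Return (doc_id, sentence_index, sentence) for every sentence containing the phrase."""
    needle = phrase.lower()
    hits = []
    for doc_id, text in documents.items():
        for idx, sentence in enumerate(split_sentences(text)):
            if needle in sentence.lower():
                hits.append((doc_id, idx, sentence))
    return hits

# Hypothetical two-document corpus, invented for illustration.
documents = {
    "facility_agreement": (
        "The Borrower shall not dispose of any asset. "
        "This restriction does not apply to a Permitted Transaction."
    ),
    "side_letter": "The parties confirm the fee arrangements.",
}

hits = keyword_match(documents, "permitted transaction")
for doc_id, idx, sentence in hits:
    print(f"{doc_id} [sentence {idx}]: {sentence}")
```

Semantic match and question answering need far more than substring tests, which is exactly why a termination date buried in a conditional clause is harder to surface than a phrase like “Permitted Transaction”.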
What we liked
“Googley” type search in the context of enterprise legal knowledge management is always popular. Several of the incumbent search providers in legal already provide something similar to that shown in the demo, albeit not so much the Q&A and semantic match feature. Likewise, eDiscovery and legal research tools have these sorts of capabilities to different degrees.
However, if this type of application can be used to query contract data for due diligence, or for just-in-time knowledge finding (e.g. during a negotiation when specific language is needed but its location is unknown), then it could be super useful – especially the Q&A feature.
Imagine being able to ask a subset of documents a question such as “what is the mandatory prepayment amount” and have it return the values for each document!
Indeed as we shall explain below, Iron Mountain partnered with DUAI to do just that.
What we’d like to know
A pretty limited demo, so lots of unknowns. The main question is how easy and quick it would be to build and scale something like this across a legal organisation’s document management and other systems. Also, given this is all cloud-based and law firm resistance to cloud adoption remains (albeit it is easing), will that present a blocker to truly transformative use in the short to medium term?
More generally, without seeing more, we’d question how much edge this has over existing search providers in the legal space, whether they be enterprise search, eDiscovery or online research tools. In those domains, the incumbents probably retain their edge… for the time being.
What we saw
In the spirit of legaltech’s #bringbackboring movement, this is a boring but big time saver!
When using DocuSign to organise signing a contract the default steps include uploading the final version of the contract and then manually tagging the parts of the document into which DocuSign should insert:
- editable fields, e.g. the signing party’s name and address; and
- executable fields, e.g. the signature line into which DocuSign users can append their electronic signature.
Having done so the document becomes DocuSignable. Instead, we saw DUAI used to automate this tagging!
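As a rough illustration of what automated tagging involves, the Python sketch below scans the lines of a signature page for labelled blanks and suggests a field type for each. Everything in it is an assumption made for illustration: the label-to-type table, the underscore convention for blanks and the output format are hypothetical, and it uses neither the DocuSign API nor DUAI, which presumably relies on trained models rather than a regular expression.

```python
import re

# Hypothetical label-to-field-type mapping, assumed for illustration only.
FIELD_TYPES = {
    "name": "editable",
    "address": "editable",
    "date": "editable",
    "signature": "signature",
}

# A label, a colon, then a run of underscores marking a blank to fill in.
BLANK = re.compile(r"^\s*(?P<label>[A-Za-z ]+?)\s*:\s*_{3,}\s*$")

def suggest_tags(page_lines):
    """Return one suggested tag per blank whose label is a known field type."""
    tags = []
    for line_no, line in enumerate(page_lines, start=1):
        match = BLANK.match(line)
        if not match:
            continue
        label = match.group("label").strip().lower()
        if label in FIELD_TYPES:
            tags.append({"line": line_no, "label": label, "type": FIELD_TYPES[label]})
    return tags

# Hypothetical signature block from a simple NDA.
signature_page = [
    "SIGNED for and on behalf of the Recipient",
    "Name: ______________",
    "Address: ______________",
    "Signature: ______________",
]

tags = suggest_tags(signature_page)
for tag in tags:
    print(tag)
```

Each suggested tag stands in for one manual click-and-place action, which is where the time saving comes from.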
What we liked
DocuSign is combined with Google Document Understanding AI to automatically identify and tag these common fields, eliminating around 12 – 20 clicks from the user experience, i.e. clicks required to select the type and location of each field. For a simple document like the one shown in the demo, an NDA, it might seem deceptively trivial.
However, for a complex LMA style credit agreement with 10 to 20 or so signing capacities and 50+ individual signatories, such auto-tagging of the necessary DocuSign fields could be a massive time saver over and above what DocuSign currently provides.
That said, there remains some market reluctance to DocuSign significant agreements such as these. In part that is because:
- these types of transactions are multi-jurisdictional;
- the laws surrounding electronic signature in each jurisdiction vary in requirement and clarity;
- law firms are typically required to provide an opinion vouching that the document was signed in a legally binding manner; and
- therefore the combination of the first two points makes the third hard to do in practice for large law firms.
That said, for any other circumstance, this could be a boring but big time saver. As the market moves toward electronic signature more generally this time saving will only compound!
What we’d like to know
Nothing controversial or groundbreaking, so not much to ask other than will this be a new feature in DocuSign? If so, fantastic, and when can users get it?
What we saw
Automatic identification and extraction of data fields from invoices containing tabular data, which is then piped into downstream workflows.
What we liked
This use case is well-trod by the incumbent players. However, the wide variation in layout and presentation between invoices and similar documents like accounts makes this use case notoriously tricky. It’s hard to make a robust “one size fits all” solution that can not only extract but also normalise like data from different tabular presentations.
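To show what “normalising like data” means in practice, here is a small Python sketch that maps two differently laid out invoice tables onto one canonical set of columns. The alias table, column names and sample invoices are all hypothetical. A real system would need to cope with merged cells, multi-line rows and headers nobody has seen before, which is precisely the hard part.

```python
# Hypothetical synonym table mapping supplier-specific headers to canonical names.
HEADER_ALIASES = {
    "description": "description", "item": "description", "details": "description",
    "qty": "quantity", "quantity": "quantity", "units": "quantity",
    "unit price": "unit_price", "rate": "unit_price", "price": "unit_price",
    "amount": "total", "line total": "total", "total": "total",
}

def normalise_table(header, rows):
    """Map each row to a dict keyed by canonical column names; unknown columns are dropped."""
    canonical = [HEADER_ALIASES.get(h.strip().lower()) for h in header]
    records = []
    for row in rows:
        record = {key: value for key, value in zip(canonical, row) if key is not None}
        records.append(record)
    return records

# Two hypothetical suppliers presenting the same line item in different layouts.
invoice_a = (["Item", "Qty", "Rate", "Amount"],
             [["Legal research", "2", "150.00", "300.00"]])
invoice_b = (["Details", "Units", "Unit Price", "Line Total", "VAT code"],
             [["Legal research", "2", "150.00", "300.00", "S"]])

records_a = normalise_table(*invoice_a)
records_b = normalise_table(*invoice_b)
print(records_a == records_b)
```

A hand-maintained alias table like this breaks down quickly as the number of suppliers grows, which hints at why a learned approach is attractive here.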
What we’d like to know
The explanation here is a little scant. As such it’s hard to assess big picture and ultimate utility. Nevertheless, it suggests DUAI could automate the extraction of key data from invoices (including scanned copies) to populate downstream reporting or workflow automation tools. But as we noted above, this isn’t easy at scale due to variation in tabular layouts.
That said, growing attempts to couple accounting / reporting standards with digital reporting formats (e.g. XBRL) plus increased standardised regulatory reporting needs in financial services could make this use case increasingly doable at scale and therefore immensely valuable to Google and its clients in years to come.
First, who is Iron Mountain (“IM”)?
IM provides solutions for records management, data back-up and recovery, document management, secure shredding and data centres. Many law firms and law firm clients use IM for these needs given regulatory and contractual retention requirements for legal and financial work product and surrounding data.
Second, what have they done with DUAI?
IM used the GCP + DUAI (DUAI sits within the GCP business) to create IM’s Intelligent Content Services Platform (“ICSP”).
Unlike the previous use cases, there was a lot more information to take in. As a result, we will take a slightly different approach to the above.
Third, what is IM’s ICSP?
IM’s ICSP has three layers, essentially:
- an ingestion pipeline;
- a data enrichment layer; and
- a user facing set of viewers to analyse, search, automate information workflows and generally manipulate data.
In other words this seems to be a custom built unstructured data processing platform.
IM articulated several use cases. Most relevant to legaltech were the (a) Mortgage RPA, (b) GDPR and (c) Contract Intelligence use cases.
A. Mortgage RPA
Processing mortgage applications (i.e. does X get a mortgage loan or not based on their paperwork) remains a largely manual process. Despite increasing digitisation of data capture and processing in this space, much remains paper-based.
Unsurprisingly this is another use case popular with incumbent legal AI extraction vendors.
IM used Google Document Understanding AI to automate the answering of the two headline questions necessary to process a mortgage application:
- Is the application complete?
- Is the application accurate?
To answer the first question, DUAI classifies documents to identify their type and then confirms whether the number and type of documents match those required for a completed mortgage application.
To answer the second question, the system then looks:
- inside each document to confirm basic details concerning accuracy, e.g. names, social security numbers, presence or absence of signatures or stamps etc; and
- then across the documents comprising an application to cross-check that everything is consistent. Depending on the results, the application is triaged into the appropriate downstream RPA process.
Most striking is the ROI. IM claims use of DUAI in its ICSP reduced the original labour intensive process from 3-5 days to a mere 5-8 hours.
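The two checks and the triage step can be sketched in a few lines of Python. This assumes classification and entity extraction have already happened upstream and produced one record per document. The required document types, field names, queue names and sample data are all hypothetical rather than taken from IM’s implementation.

```python
# Hypothetical checklist of document types a complete application must contain.
REQUIRED_TYPES = {"application_form", "payslip", "bank_statement", "id_document"}

def missing_documents(classified_docs):
    """Completeness: which required document types are absent?"""
    present = {doc["type"] for doc in classified_docs}
    return sorted(REQUIRED_TYPES - present)

def inconsistent_fields(classified_docs, fields=("name", "ssn")):
    """Accuracy: which fields take more than one distinct value across documents?"""
    conflicts = {}
    for field in fields:
        values = {doc["entities"][field] for doc in classified_docs
                  if field in doc["entities"]}
        if len(values) > 1:
            conflicts[field] = sorted(values)
    return conflicts

def triage(classified_docs):
    """Route the application to a (hypothetical) downstream queue based on the two checks."""
    if missing_documents(classified_docs):
        return "request_missing_documents"
    if inconsistent_fields(classified_docs):
        return "manual_review"
    return "automated_processing"

# Hypothetical output of upstream classification and extraction for one application.
application = [
    {"type": "application_form", "entities": {"name": "Jane Doe", "ssn": "000-00-0000"}},
    {"type": "payslip", "entities": {"name": "Jane Doe"}},
    {"type": "bank_statement", "entities": {"name": "Jane M Doe"}},
    {"type": "id_document", "entities": {"name": "Jane Doe", "ssn": "000-00-0000"}},
]

print(missing_documents(application))
print(inconsistent_fields(application))
print(triage(application))
```

Here the application is complete but the name on the bank statement differs, so it is routed to a human rather than straight through. The exact-match comparison is deliberately crude, and a production system would need fuzzier matching for names.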
From the below diagram, this seems to be driven by eliminating several layers of human effort.
A side by side comparison of the before and after process is shown below:
B. GDPR Compliance
As with the mortgage RPA use case, this too has been an area of interest for vendors and buyers in the legal AI extraction space.
Homing in on GDPR-related information across the enterprise is prohibitively time consuming and costly without a technology solution. This is the case whether you simply need to proactively understand what you can and cannot do with the information given any GDPR concerns, or need to reactively respond to a data subject access request.
To address this, Google Document Understanding AI was used to OCR, classify and extract key content from within documents and map that content to the relevant policies to ensure the documents are compliant with GDPR.
Sadly, the presentation didn’t provide much detail beyond this broad statement. So all we’re left with is the fact this is further evidence DUAI appeals to the type of customer already drawn to the incumbent legal AI extraction players, and naturally also the GDPR specific solution providers.
C. Iron Mountain Insight for Contract Intelligence
IM has also used DUAI to process contracts and make them searchable with natural language queries and classifiable at the document and clause level. Again, nothing surprising re this choice of use case. What is surprising is the suggested level of granularity in search that IM was able to produce using DUAI.
Most legal users want, but don’t have, a system that can answer a question like “Show me all the contracts with payment terms greater than 30 days”.
Having just-in-time information would eliminate a lot of pain points in legal services and operations, including:
- Due diligence: finding all the contracts expiring within 1 year.
- Negotiation: finding the last X number of documents with clause Y where the other law firm was Z.
- Drafting: finding all examples of similar documents or clauses intended for a specific purpose and / or client.
IM’s presentation suggests this is what they’ve built using DUAI:
Going deeper, IM presented a DUAI-powered interface offering a one-click way to locate all contracts for customer X:
Deeper still, in 3 clicks IM is able to find all contracts with payment terms of 90 days, which can be sliced and diced by customer and document type.
This same information also appears to be extracted as document metadata on the left side of the below screenshot:
The commentary also highlights use of the above in M&A contexts, e.g. to query a data room with natural language and / or filters on key extracted entities to find all the contracts with a limitation of liability greater than X.
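Once values such as payment terms and liability caps have been extracted as structured metadata, queries like these become simple filters. The Python sketch below illustrates the idea. The record fields, sample contracts and helper names are our own invention, and the hard part (reliably extracting those values from the contracts in the first place) is assumed to have been done already.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ContractRecord:
    """Metadata assumed to have been extracted from one contract."""
    title: str
    customer: str
    doc_type: str
    payment_terms_days: int
    liability_cap: int  # in whole currency units

def where(records, predicate):
    """Keep only the records for which the predicate holds."""
    return [r for r in records if predicate(r)]

def group_by(records, field):
    """Slice a result set by any metadata field, returning titles per group."""
    groups = {}
    for r in records:
        groups.setdefault(getattr(r, field), []).append(r.title)
    return groups

# Hypothetical extracted metadata for three contracts.
contracts = [
    ContractRecord("MSA 2018", "Acme Ltd", "Master Services Agreement", 90, 1_000_000),
    ContractRecord("SOW 7", "Acme Ltd", "Statement of Work", 30, 250_000),
    ContractRecord("Supply 2017", "Globex plc", "Supply Agreement", 60, 5_000_000),
]

# "Show me all the contracts with payment terms greater than 30 days", by customer.
long_terms = where(contracts, lambda r: r.payment_terms_days > 30)
by_customer = group_by(long_terms, "customer")

# "Find all the contracts with a limitation of liability greater than X".
high_caps = where(contracts, lambda r: r.liability_cap > 2_000_000)

print(by_customer)
print([r.title for r in high_caps])
```

A natural language front end would then only need to translate the user’s question into a filter of this kind.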
The only downside is the limited detail on the how. In particular, how much of this comes ready to go out of the box vs. requiring lots of customisation and configuration to create custom entity extractions and similar? How much training, and on what data, is required to build something like the above, in particular the custom entity extraction? How likely is it that organisations possess those skillsets if they aren’t handheld through the build by Google solution engineers?
This was a Google-built application of Google Document Understanding AI. This solution OCRs patents, uses NLP to discern appropriate categories, extracts patent related entities and splits out diagrams, saving the enriched data to a searchable database:
The entities extracted include:
- Publication date
- Application number
- First line of patent title
With regard to the NLP categorisation of data, the presenter states this required around 500 patents to achieve high (99-100%) precision and recall.
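For readers less familiar with the two metrics: precision is the share of documents the system assigned to a category that truly belong there, while recall is the share of documents truly in that category that the system found. The Python sketch below computes both for a single category. The labels are made up for illustration and are not the presenter’s data.

```python
def precision_recall(true_labels, predicted_labels, positive):
    """Compute precision and recall for one category from paired label lists."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)  # correct hits
    fp = sum(1 for t, p in pairs if p == positive and t != positive)  # false alarms
    fn = sum(1 for t, p in pairs if p != positive and t == positive)  # misses
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical ground truth and predictions for five patents.
true_labels = ["chemistry", "chemistry", "software", "software", "chemistry"]
predicted_labels = ["chemistry", "software", "software", "software", "chemistry"]

precision, recall = precision_recall(true_labels, predicted_labels, "chemistry")
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy set every patent labelled “chemistry” really is chemistry (perfect precision), but one chemistry patent was missed (recall of two thirds). Scoring 99-100% on both means almost no false alarms and almost no misses.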
Not mind-blowing, but perhaps something the main patent registries could utilise to improve their services.
Likewise, is this something Companies House or the Land Registry could leverage to provide better services to users in terms of search and depth and breadth of querying capabilities?
Overall, some interesting insights but little meaty detail on the how. As unsurprising as it is interesting are the types of use cases and positioning of Google Document Understanding AI. In each case these closely mirror incumbent legal AI extraction providers.
The use cases focus on extracting and classifying inter- and intra-document data down to the clause and entity level to create searchable databases to expedite due diligence, analytics, search and general knowledge management. Likewise, the overall positioning is naturally about transforming unstructured data into structured data.
Most interesting were the IM use cases, mainly because of the limited screenshots giving a sneak peek at how Google Document Understanding AI might be made to look in terms of the end user experience.
In a different sense, it was curious that none of the use cases looked particularly similar in execution to the incumbent legal AI extraction vendors’ interfaces, i.e. the two window view with document on one side and extracted entities on the other. For example, the below screenshot from the iManage Extract website, which is fairly illustrative of the incumbent contract extraction tools’ UIs:
It will be interesting to see if anyone uses DUAI to clone incumbent products, either for their own needs and / or to white label the resulting application to their clients. Time will tell.
We hope you enjoyed the teardown. As before, a lot remains unknown and undoubtedly will become clearer as access to DUAI opens up (limited access beta for now) and organisations begin experimenting and publicising use cases.
Exciting times ahead, especially in terms of who will use DUAI for what and why!
If you want to view the video, please check it out below. It’s 50 mins so make yourself a cup of coffee, sit back, relax and enjoy!