A.I. / Deep Learning / Machine Learning

The evolution of Natural Language Processing and its impact on the legal sector.

Why learning by example is the only way to optimise language models

by info@lawtomated.com | July 19, 2021 | 12 min read

Guest post by Amy Flippant from Della.

When I moved to Spain at the age of 21, I thought my GCSE-level knowledge of grammar rules and structures would help me get by. What I later learned was that hearing Spanish in real-life scenarios was the best way to expand my vocabulary and develop my linguistic skills. It turns out that when it comes to Natural Language Processing (NLP), the same logic applies.

Language is not just a set of independent, meaningful words (a vocabulary); words only take on meaning when they are understood in context. So, instead of trying to specify language with formal grammar rules and structures, you’re better off training the model (or your brain) on real-life examples. This is particularly true when NLP is used in a legal setting, for example for contract review and analysis.

Evolution of NLP

The goal of NLP is to build techniques and models that can understand human language as it is used naturally. Let’s take a look at the evolution of NLP: where it started, and where we are today in our quest to imitate and understand human language and apply it to different tasks.

GloVe and word2vec

Traditional methods of language processing, known as word embeddings, fell short of capturing the full representation of a text’s meaning. The most common approaches, namely GloVe and word2vec, mapped every word to a single, fixed vector. Because this approach looked at each word in isolation, it could not capture the full contextual meaning of a sentence.

Word2Vec Diagram

Let’s unpick the limitations of these traditional methods by looking at an example: “I miss you” is a simple sentence made up of three words. You cannot determine who misses whom by looking at any one word in isolation; you need all three words to know that I am the one missing you. A language model must understand the basic structure between words in order to understand the full meaning of a sentence, no matter how simple that sentence might be.
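To make the limitation concrete, here is a minimal sketch, assuming the gensim library and its downloadable “glove-wiki-gigaword-50” vectors (illustrative choices, not anything used by the tools discussed later). A static embedding is just a lookup table, so the word “you” receives exactly the same vector whether it is the subject or the object of the sentence:

    import gensim.downloader as api

    # Load a small set of pre-trained GloVe vectors.
    vectors = api.load("glove-wiki-gigaword-50")

    # A static embedding is a fixed lookup: one vector per word, no context.
    for sentence in (["i", "miss", "you"], ["you", "miss", "me"]):
        print(sentence, "->", [vectors[word][:3] for word in sentence])

    # "you" maps to the identical vector in both sentences, so the model
    # cannot tell the subject from the object.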

RNN & LSTMs

The next wave of NLP techniques looked at the relationships between the words in a sentence to determine its meaning. At the time, the most commonly used approaches to capturing dependencies between words were deep learning methods, namely Recurrent Neural Networks (RNNs) and, more specifically, LSTMs (Long Short-Term Memory networks).

This state-of-the-art algorithm is used to process sequential data, most notably in the creation of Siri and Google’s voice search. It was the first algorithm built with short-term memory, meaning it remembers prior information it has been exposed to. While these advances were important, these techniques can only transfer information in a certain order, so the longer your sentence, the less likely it is that the information it contains will be fully retained.
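As a rough illustration of that sequential processing, here is a minimal sketch using PyTorch (an assumed choice; the article does not name a framework). The LSTM reads the sentence token by token, and the hidden state at each step only reflects the tokens seen so far:

    import torch
    import torch.nn as nn

    vocab = {"i": 0, "miss": 1, "you": 2, "me": 3}
    embed = nn.Embedding(len(vocab), 8)                          # toy 8-d word vectors
    lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

    tokens = torch.tensor([[vocab["i"], vocab["miss"], vocab["you"]]])  # "I miss you"
    outputs, (h_n, c_n) = lstm(embed(tokens))

    # outputs[:, t, :] is the hidden state after reading token t: each step
    # only "knows" about earlier tokens, and long sentences risk losing
    # information from the start by the time the end is reached.
    print(outputs.shape)  # torch.Size([1, 3, 16])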

Let’s go back to our previous example. Imagine you now have two sentences: ‘I miss you’ and ‘you miss me’. In these examples, the word ‘you’ takes on two different contextual meanings, being the object of the first sentence and the subject of the second. We need to train our LSTM model for both of these instances, because the meaning of ‘you’ cannot be learned independently using the same LSTM model.

RNN & LSTM Diagram

So what does this mean? It means that a significant amount of data is required for very simple tasks, such as sentiment analysis (determining if a sentence is positive, neutral or negative). Fortunately, the academic world was quick to acknowledge this inherent flaw, and a breakthrough came from teams working on language models. They saw how numerical outputs could represent meaning and maintain contextual dependencies between words. In 2018, a paper was published introducing the language model ‘ELMo’. This paper effectively established that it was possible to improve model efficiency by building in an overall understanding of language.

If we refer back to our previous examples, ‘I miss you’ and ‘you miss me’, ELMo could understand the word ‘you’ in its own context without having to retrain the model each time. This enabled better results for most NLP tasks.

The problem with the ELMo model was that it was built using LSTMs and, as previously mentioned, all the data had to be processed in a specific order, with each calculation relying on the previous result. This meant that ELMo could not take advantage of parallel processing (i.e. multiple computers training the model at the same time), which significantly slowed down training.

Who or what is BERT?

In 2017, Google introduced the Transformer, which was deemed a significant improvement on RNNs because it could process data in any order. This matters because the Transformer can process any word in relation to all the other words in a sentence, rather than processing them one at a time.

Then the breakthrough happened. Google took the academics’ language-model intuition and developed a language model based on the Transformer: BERT (Bidirectional Encoder Representations from Transformers). It is a machine learning technique that applies bidirectional training to the Transformer’s encoder.

By looking at all the surrounding words, the Transformer enables BERT to understand the full context of a word. This revolutionary approach takes any arbitrary text as input and produces a fixed-length vector (a numeric representation of the text) as output. Think of this as a translation step into a secret machine language.
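Here is a minimal sketch of that idea, assuming the Hugging Face transformers library and the publicly available bert-base-uncased checkpoint: the same word, “you”, receives different vectors in different sentences, because each vector is computed from the whole context.

    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    def embed_word(sentence, word):
        # Return the contextual vector BERT produces for `word` in `sentence`.
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state[0]        # (seq_len, 768)
        position = inputs["input_ids"][0].tolist().index(
            tokenizer.convert_tokens_to_ids(word))
        return hidden[position]

    object_you = embed_word("I miss you", "you")     # "you" as object
    subject_you = embed_word("you miss me", "you")   # "you" as subject

    # Related, but not identical: the vector depends on the surrounding words.
    print(torch.cosine_similarity(object_you, subject_you, dim=0))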

Using the Transformer’s bidirectional capability, BERT is pre-trained on two different NLP tasks:

Masked Language Modeling

The aim of Masked Language Modeling is to generate accurate numerical representations of words. It works by randomly hiding one or more words in a sentence and having the programme predict what the ‘masked’ word is based on the context of the sentence. BERT is forced to identify the masked word from context alone, rather than from any pre-fixed identity. For example: ‘I need to call my boss, my ___ was late this week’. Looking at the entire context of the sentence, you might argue that the most likely word to fit here is ‘paycheck’.
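A quick sketch of that task, again assuming Hugging Face transformers and bert-base-uncased (whose mask token is [MASK]): the model fills the gap using context alone.

    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model="bert-base-uncased")

    # BERT predicts the hidden word purely from the surrounding context.
    for prediction in fill_mask("I need to call my boss, my [MASK] was late this week."):
        print(prediction["token_str"], round(prediction["score"], 3))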

Next Sentence Prediction

Next sentence prediction forces the model to understand how combinations of sentences fit together. The programme predicts whether any two sentences have a logical, sequential connection or whether their relationship is random. Let’s take a look at two sentences:

  1. I’m really enjoying this film; and
  2. I’m worried about work.

Which of the two has a logical, sequential connection to ‘I need to call my boss, my paycheck was late this week’? BERT also works by clustering concepts based on sentiment. If you imagine ‘love’ and ‘hate’ plotted on a graph, you would expect these concepts to sit far apart. Sentences that express similar sentiments, however, like ‘I am so disappointed in this corporation’ and ‘I don’t like this company’, would be clustered together, despite using different words.

Measuring Sentiment Diagram 1
Measuring Sentiment Diagram 2
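Next sentence prediction can also be sketched with Hugging Face transformers, which exposes BERT’s pre-training head as BertForNextSentencePrediction; the set-up below is illustrative and assumes the public bert-base-uncased checkpoint.

    import torch
    from transformers import BertForNextSentencePrediction, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")

    prompt = "I need to call my boss, my paycheck was late this week."
    candidates = ["I'm worried about work.", "I'm really enjoying this film."]

    for candidate in candidates:
        encoding = tokenizer(prompt, candidate, return_tensors="pt")
        with torch.no_grad():
            logits = model(**encoding).logits
        # Index 0 of the logits corresponds to "candidate follows the prompt".
        prob_follows = torch.softmax(logits, dim=1)[0, 0].item()
        print(f"{candidate!r}: P(follows) = {prob_follows:.2f}")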

BERT is the rocket booster to kickstart your NLP projects

BERT addresses the flaws of previous NLP techniques: the need to spend hours training models on data to make them sufficiently smart and, in most cases, the problem of finding enough training data to give the models a well-rounded education.

BERT diagram

In contrast, the BERT framework was pre-trained on the entire Wikipedia corpus. This means that BERT has already been exposed to an enormous variety of language structures, nuances and ambiguous language. The best bit is that BERT can be used as a base layer of knowledge in your NLP projects: you can build upon it and fine-tune it for your specific needs. This process is called transfer learning, and it means that rather than starting from scratch, you can get your model up to speed on far less data. It is cheaper, faster and ultimately smarter, because you are building upon an existing layer of intelligence.
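A minimal transfer-learning sketch, assuming Hugging Face transformers and datasets; the two example clauses and their labels are invented for illustration. The heavy lifting was done during pre-training, so only the small fine-tuning step runs here.

    from datasets import Dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)   # e.g. is / is not a termination clause

    # A tiny illustrative dataset; in practice you would use labelled clauses.
    data = Dataset.from_dict({
        "text": ["Either party may terminate this agreement on 30 days' notice.",
                 "This agreement is governed by the laws of England and Wales."],
        "label": [1, 0],
    }).map(lambda batch: tokenizer(batch["text"], truncation=True,
                                   padding="max_length", max_length=64),
           batched=True)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="bert-clause-demo",
                               num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=data,
    )
    trainer.train()   # fine-tunes the pre-trained model on the new task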

Making language models legal-savvy

Despite the fact that using AI to analyse text has never been easier, the revolution has not yet arrived in the legal world. Legal technology has emerged in recent years to provide solutions that help lawyers read contracts. Using language models to analyse content-rich documentation, like contracts, is making waves in the most traditional of industries.

Surprisingly, many of the key players on the market still rely on training individual small models for specific tasks. This training process requires large amounts of niche legal data and human supervision to get the model up to speed; it is expensive and tedious. But the crucial point is that this type of contract analysis technology isn’t true AI. The lawyer still has to manually review the clause and spend time finding the answer to the specific question. To reach true AI, we want a tool that does more than just ‘highlight’ the clause you need to review: one that actually finds the answers to your questions for you, providing actionable information quickly.
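As a hedged illustration of the difference, here is a sketch of extractive question answering using Hugging Face transformers and a publicly available QA model (deepset/roberta-base-squad2, chosen purely for illustration; it is not Della’s model). Rather than highlighting a clause, it returns the answer itself.

    from transformers import pipeline

    qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

    contract = ("This Agreement shall commence on 1 January 2022 and shall "
                "continue for an initial term of 24 months, after which either "
                "party may terminate it on 60 days' written notice.")

    result = qa(question="How much notice is required to terminate?",
                context=contract)
    print(result["answer"], round(result["score"], 2))  # extracted answer span + confidence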

So why are legal technologies falling into the legacy trap when there is an easier, cheaper and smarter way to do it?

If you think about it, massive language models like BERT are pre-trained on such enormous amounts of data that their baseline grasp of natural language is already very strong. If you take such a model and fine-tune it with exposure to very specific legal terminology, it will incorporate that terminology quickly into its pre-trained ‘brain’.

By way of comparison, think back to my experience in Spain: if I had already had a C1-level knowledge of Spanish before moving to Madrid, I would have found it a lot easier to fine-tune my linguistic skills with examples of more colloquial ways of speaking, or even to pick up very specific vocabulary.

The point is that these massive language models have already learned from so many examples that they need far fewer example contracts to pick up ‘legal language’ than small models built from scratch.

At Della, we take full advantage of advances in NLP such as transfer learning, and we regularly analyse the latest research so we can apply it to our model. So the question is: why aren’t other providers taking advantage of these developments? There is a real possibility that the current generation of legal service providers will fall into the very trap they would have told their customers to avoid: attachment to legacy techniques.

Useful resources
  • https://searchenterpriseai.techtarget.com/definition/BERT-language-model
  • https://towardsdatascience.com/bert-explained-state-of-the-art-language-model-for-nlp-f8b21a9b6270
  • https://dataskeptic.com/blog/episodes/2019/bert
  • https://dellalegal.com/language-models-the-technology-behind-della/
  • https://www.capgemini.com/gb-en/2021/03/bias-in-nlp-models/

About our guest author, Della

Della is a new, simple-to-set-up and easy-to-use contract analysis and review tool that is supplanting established tools by offering lawyers a fast, intuitive way to review contracts. While traditional contract review focuses on clause detection and data extraction, Della takes away that complexity: its proprietary AI and NLP let users ask questions about their contracts in plain English (and multiple other languages) and get clear answers, quickly. It provides Google-like functionality that enables legal professionals to gain actionable insight into their contracts without much of the manual effort and training required to set up and use many existing tools.

Della learns every time it is used, allowing users to review their contracts faster; in fact, it takes just 10 minutes for a new user to get started. Powered by massive language models, Della reduces time-to-value while ensuring that a sophisticated contract analysis and review tool is achievable for firms of all sizes. Della’s AI is built on a massively pre-trained language model, which understands how real people speak and takes into account the syntax and grammar of a query. Instead of many small models trained to detect specific clauses or data points, Della uses a single model, trained to date on hundreds of thousands of questions and answers in over 100 languages, to perform in-depth analysis and review.

Della’s platform launched in January 2020, but it is already being used by small and large law firms across multiple countries, as well as several large multinational corporations. Those partners range in size from top UK and European law firms to smaller boutique providers and enterprise organisations. Della’s customers include Eversheds Sutherland, Fidal, BCLP, Wolters Kluwer and Content Square (USA). The smallest law firm currently using Della has 12 lawyers. Last year, Della launched a partnership with Wolters Kluwer to provide Della to their contract management platform customers.

Tags: BERT, Della AI, GloVe, LSTM, RNN, Word2Vec