The Della approach: taking academic successes to solve real business problems

In August 2021, Stanford’s Artificial Intelligence Lab claimed that AI was undergoing “a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks” [1]. Indeed, foundation models have enabled human-like performance (and sometimes better) on numerous tasks in Computer Vision, Speech Recognition and Natural Language Processing (NLP). Question Answering (QA), for example, is one of the most popular tasks in NLP, transforming the way humans interact with text thanks to frequent breakthroughs that push state-of-the-art performance ever higher. Google has included these innovations in its search engine [2] and is preparing for the next steps [3]. At Della, we believe this will change the way humans interact with computers and will be the driving force of the next industrial revolution.

A solution to address real clients’ needs

Research and development (R&D) initiatives are incredibly important in ensuring that deep-tech companies can remain relevant and stay afloat in an increasingly competitive pool. However, I would argue that the true added value goes beyond implementing academic breakthroughs in your own technology. The real value comes from adapting the breakthroughs to solve real business problems.

Let’s take Tesla, for example. At Tesla AI Day back in August, the company presented the full architecture of their self-driving system, laying out the deep learning technologies they plan to use to build their self-driving cars. This included many of the most recent advances from the academic world: RegNet, SparseRNN and Transformers. The point is that you can plug all these bricks together, but that alone is not enough to build a self-driving car. Tesla’s actual added value comes from their understanding of the real business problems that their engineering effort tackles by combining the strengths of these technologies.

It’s a comforting thought that even the massive players in the AI industry like Tesla are struggling with limitations that we are all too familiar with here at Della. Tesla are trying to tackle three key limitations:

  1. Out-of-domain: It’s difficult to train an AI to handle unforeseen circumstances, and this is especially true in the context of driving. To address this challenge, Tesla have created a simulator for their cars which continually feeds their models unlikely scenarios, resulting in an AI with a broader understanding and the ability to react appropriately. [4]
  2. Academic problems vs real-life problems: We can try our very best to imagine and solve a whole array of problems at a theoretical level. These theoretical solutions are all well and good until we try to use them to solve real business problems. This was an unfortunate realisation for the Tesla team, who came to learn that state-of-the-art object detection was not good enough to map out real road junctions. [5]
  3. Foundation models in production: a car is not a cluster of GPUs. [6] Tesla does not lack resources, but their cars carry only a limited amount of hardware. We do not work with embedded systems, but we still have to be efficient and not waste resources: we need fast inference so that users get their information as quickly as possible.

Similarly, our contract review technology is built upon the latest NLP models available to us. However, our added value does not come from these models, e.g. BERT, XLM-RoBERTa or RemBERT, which are publicly available. It comes mainly from understanding the real-world problems that need to be solved, via a combined engineering effort involving data collection, customer feedback, training and a good user interface. “Question Answering” is not a goal in itself for us; it is only a step along the way to automating contract review. Many solutions currently available are designed to address specific use cases, an approach that lacks the flexibility most lawyers require to get the job done. At Della, on the contrary, we are always working to find a good balance between the power offered by large foundation models and the specificities of the legal world; however, that is much easier said than done. We are not trying to build a driverless car, we are trying to find complex answers in documents, yet we encounter the same kinds of issues as Tesla.

Finding answers in complex documents is easier said than done

Regarding out-of-domain adaptation, we face a similar problem with QA datasets. Unfortunately, they are rather general (SQuAD, for instance, is based on Wikipedia) and they are more often than not in English. Della operates in a more niche legal context using legalese, the formal and technical language of contracts. Our value comes from our ability to leverage transfer learning [7] to achieve good performance in the legal domain and in languages other than English.

Similarly, we face real business challenges that require more sophisticated solutions than the theoretical ones developed by researchers. For instance, in research, the QA field is typically divided into two subtopics: “closed-domain QA” and “open-domain QA”. While the former aims to find the answer to a question in a small chunk of text (usually fewer than 500 words), the latter seeks the answer in a large corpus of documents (usually relying on a first-stage retrieval). Neither option corresponds well to our use case. Our goal is to ask questions of documents of various lengths. These documents have an internal layout with headings, so treating them as a corpus of small documents and applying open-domain QA would be too generic. Another example is answering yes/no questions. In the academic world, this is often framed as a binary classification problem, but in practice lawyers often ask questions whose answer is not necessarily “yes” or “no” but could also be a list of conditions.
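To make the layout point concrete, here is a minimal sketch in Python of splitting a contract into heading-delimited sections so that an extractive QA model can be run on each section rather than on the document as one undifferentiated corpus. The heading pattern, helper name and sample contract are all illustrative assumptions, not Della’s actual pipeline:

```python
import re

# Illustrative sketch: split a contract into heading-delimited sections
# so an extractive QA model can score each section separately.
# The heading pattern (numbered lines like "1. Definitions") is an
# assumption; real contract layouts need more robust detection.
HEADING_RE = re.compile(r"^\d+\.\s+.+$")

def split_into_sections(document: str) -> list:
    """Return [{'heading': ..., 'body': ...}] chunks, one per heading."""
    sections, current = [], None
    for line in document.splitlines():
        if HEADING_RE.match(line.strip()):
            if current is not None:
                sections.append(current)
            current = {"heading": line.strip(), "body": ""}
        elif current is not None:
            current["body"] += line + "\n"
    if current is not None:
        sections.append(current)
    return sections

contract = """1. Definitions
"Agreement" means this contract between the parties.
2. Termination
Either party may terminate with 30 days' written notice.
"""

sections = split_into_sections(contract)
print([s["heading"] for s in sections])
# → ['1. Definitions', '2. Termination']
```

Each resulting section stays within the closed-domain size regime while the set of sections still covers the whole document.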

Finally, most widely praised foundation models contain a tremendous number of parameters (i.e. weights to tune): GPT-3, for example, is made up of 175 billion parameters. The sheer volume of parameters means that training such models is extremely expensive and immensely time-consuming. [8] A single inference with GPT-3 requires several GPUs, but our clients need an answer within seconds. Hence, if we can reach as good a performance with smaller models, we will. Moreover, architectural and computational optimisations are paramount: Della’s infrastructure is scalable at its core.
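As a rough illustration of why model size matters at inference time, here is a back-of-envelope sketch using the common approximation that a transformer forward pass costs about 2 FLOPs per parameter per token; the smaller model size and token count are illustrative assumptions, not Della benchmarks:

```python
# Back-of-envelope sketch, not a benchmark: estimate per-request inference
# cost from parameter count, using the common approximation that a
# transformer forward pass costs ~2 FLOPs per parameter per token.
def inference_flops(num_params: float, num_tokens: int) -> float:
    return 2.0 * num_params * num_tokens

GPT3_PARAMS = 175e9   # GPT-3's reported parameter count
SMALL_PARAMS = 110e6  # a BERT-base-sized model (illustrative)

gpt3 = inference_flops(GPT3_PARAMS, num_tokens=100)
small = inference_flops(SMALL_PARAMS, num_tokens=100)
print(f"GPT-3: ~{gpt3 / 1e12:.1f} TFLOPs per 100 tokens")
# → GPT-3: ~35.0 TFLOPs per 100 tokens
print(f"Ratio: ~{gpt3 / small:.0f}x a BERT-base-sized model")
# → Ratio: ~1591x a BERT-base-sized model
```

Even under this crude estimate, a BERT-base-sized model is three orders of magnitude cheaper per request, which is why we reach for smaller models whenever they match the required quality.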

I’m sure it’s obvious by now that R&D is absolutely essential at Della. Research into foundation models is fascinating and we cannot get enough of it; we are truly passionate about it, but the world needs actors like us to make these models work in the real world. There are still plenty of exciting challenges ahead and we do not tire of tackling them, one by one. As we continue to grow, we’re doing our bit to contribute to the ongoing exploration of AI and of how theoretical solutions can be adapted to solve real-world business problems. Watch this space!

[1] The process of adapting a foundation model to a downstream task, using relevant data (e.g. a company’s clients’ data), is called fine-tuning.
[4] Tesla AI Day (YouTube)
[5] Tesla AI Day (YouTube) 
[6] Graphics Processing Units: very powerful processors, essential for deep learning models.
[7] The phenomenon whereby large language models transfer knowledge acquired in one domain or language to another.
[8] GPT-3’s training is estimated to have cost $10m.

