Julien at EKAW 2024 for his work on extracting legal data with LLMs

Julien Breton at the EKAW 2024 conference

On December 3, 2024 in Amsterdam, Julien Breton, Research Engineer at Berger-Levrault (Team BL.Research), took part in the EKAW 2024 (European Knowledge Acquisition Workshop) conference. His paper shows how combining large language models (LLMs) with software architectures can optimize data annotation, yielding substantial time and cost savings and greater reliability for technical agents.

Knowledge management at the heart of technological challenges

EKAW (European Knowledge Acquisition Workshop) is the key annual event for the international knowledge engineering and knowledge acquisition community. It brings together experts in artificial intelligence, language models, data extraction, summarization and related areas. Its cross-cutting theme is to identify and promote advances in structuring and analyzing complex information to make it more accessible and usable.

Advances in legal data extraction

Julien presented the paper “Empower CamemBERT Legal Entity Extraction With LLM Bootstrapping”, which covers part of his research into the interconnection between information systems and legal documents, with a view to optimizing the performance of business software. This presentation is part of his thesis work, carried out in collaboration with the SIG (Generalized Information Systems) team at IRIT (Institut de Recherche en Informatique de Toulouse) and BL Research at Berger-Levrault. The paper was also selected as a Spotlight Paper, a distinction highlighting the potential of this work for knowledge exploitation in complex fields, notably legal and regulatory.

Q&A Session – Presentation by Julien Breton: “Empower CamemBERT Legal Entity Extraction With LLM Bootstrapping”

Artificial intelligence for automatic annotation

The core of this research is the integration of large language models (LLMs) with a traditional model such as BERT (Bidirectional Encoder Representations from Transformers). This hybrid approach limits dependence on human annotators, particularly in the legal field, where experts are both rare and costly. Manual annotation of data, essential for model training, is a slow and expensive process, especially when dealing with complex texts such as legal documents. Combining LLMs with BERT makes it possible to automate the extraction of legal entities from unstructured texts while considerably reducing the need for human intervention. This offers a concrete solution to the problems of processing and analyzing legal documents in industrial contexts.
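
As an illustration only, the bootstrapping idea can be sketched as follows: an LLM proposes entity spans, and those spans are converted into token-level BIO tags that a BERT-style token classifier could be trained on. This is a minimal sketch and not the method from the paper. The function llm_annotate is a hypothetical stand-in that uses a fixed lexicon instead of a real LLM call, and the labels ACTOR and ARTIFACT are assumed for the example.

```python
from dataclasses import dataclass


@dataclass
class Span:
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    label: str  # hypothetical entity type


def llm_annotate(text: str) -> list[Span]:
    """Hypothetical stand-in for an LLM call.

    A real pipeline would prompt an LLM and parse its answer; a fixed
    lexicon is used here so the sketch runs offline.
    """
    lexicon = {"the operator": "ACTOR", "pressure equipment": "ARTIFACT"}
    lowered = text.lower()
    spans = []
    for phrase, label in lexicon.items():
        pos = lowered.find(phrase)
        if pos != -1:
            spans.append(Span(pos, pos + len(phrase), label))
    return sorted(spans, key=lambda s: s.start)


def to_bio(text: str, spans: list[Span]) -> list[tuple[str, str]]:
    """Project character spans onto whitespace tokens as BIO tags."""
    tagged = []
    cursor = 0
    for token in text.split():
        start = text.index(token, cursor)
        end = start + len(token)
        cursor = end
        tag = "O"
        for span in spans:
            # A token is tagged only if it lies fully inside a span.
            if start >= span.start and end <= span.end:
                prefix = "B-" if start == span.start else "I-"
                tag = prefix + span.label
                break
        tagged.append((token, tag))
    return tagged


sentence = "The operator must inspect pressure equipment every year."
silver = to_bio(sentence, llm_annotate(sentence))
for token, tag in silver:
    print(f"{token}\t{tag}")
```

The resulting (token, tag) pairs play the role of automatically produced training labels. In a real setting they would be fed to a token-classification model such as CamemBERT, ideally after a quality check on a small human-annotated sample.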

Optimizing legal maintenance processes – The case of CARL Source

This progress has a direct application in the solutions offered by Berger-Levrault, for example CARL Source, the company’s computerized maintenance management system (CMMS). The legal entity extraction system developed by Julien will eventually allow CARL Source to be connected to the legal obligations that govern regulatory maintenance. In concrete terms, this will enable the automatic retrieval of legal obligations relating to company assets, and the updating of maintenance plans in line with legislative developments.
The aim is to automate the integration of legal constraints into business process management, thereby reducing the risk of human error while ensuring compliance with current regulations. This approach also saves valuable time by reducing the legal monitoring needed to keep maintenance plans up to date.

Towards relationship extraction

The next steps in this research aim to implement the following technical building block in the process: relationship extraction. Coupled with entity extraction, it will allow the processing chain to extract regulatory obligations from legal documents. The system will then be able not only to isolate the relevant information, but also to build structured representations that follow the semantic model presented in a previous scientific contribution.
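
To make the idea of coupling the two steps concrete, here is a minimal sketch of how extracted entities and relations could be assembled into a structured obligation record. The entity types, relation names and record layout are assumptions made for the example; they do not reproduce the semantic model mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    id: str
    text: str
    label: str  # hypothetical entity type


@dataclass(frozen=True)
class Relation:
    head: str   # id of the source entity
    tail: str   # id of the target entity
    label: str  # hypothetical relation type


def build_obligations(entities: list[Entity], relations: list[Relation]) -> list[dict]:
    """Group the relations attached to each ACTION entity into one record."""
    by_id = {e.id: e for e in entities}
    obligations: dict[str, dict] = {}
    for rel in relations:
        head, tail = by_id[rel.head], by_id[rel.tail]
        if head.label != "ACTION":
            continue  # only actions anchor an obligation in this sketch
        record = obligations.setdefault(head.id, {"action": head.text})
        record[rel.label] = tail.text
    return list(obligations.values())


# Output of a (hypothetical) entity extraction step.
entities = [
    Entity("e1", "The operator", "ACTOR"),
    Entity("e2", "inspect", "ACTION"),
    Entity("e3", "pressure equipment", "ARTIFACT"),
    Entity("e4", "every year", "TIME"),
]
# Output of a (hypothetical) relationship extraction step.
relations = [
    Relation("e2", "e1", "performed_by"),
    Relation("e2", "e3", "applies_to"),
    Relation("e2", "e4", "frequency"),
]

obligations = build_obligations(entities, relations)
print(obligations)
```

A record of this kind is the sort of structured output that business software could then match against assets and maintenance plans.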

An essential step forward for the future of knowledge management

Through his contribution to the EKAW 2024 conference, Julien has laid the foundations for the evolution of artificial intelligence technologies applied to legal data extraction. This innovative approach is in line with Berger-Levrault’s ambition to offer its customers solutions that will enable them to remain compliant with legal developments, while optimizing their management processes.

Congratulations to Julien and to his co-authors Mokhtar Boumedyen Billami, Max Chevalier and Cassia Trojahn for this fine recognition of their expertise and innovation in knowledge management!
