Camille Gosset Ph.D. thesis defense: “Methods and Models for the Automated Construction of Knowledge Graphs in the Legal Domain: Application to the Legal and Legal-Practical Resources of Local and Regional Authorities”


On Thursday, September 26, at 10 a.m. Paris time, Ph.D. candidate Camille Gosset defended her thesis, “Methods and Models for the Automated Construction of Knowledge Graphs in the Legal Domain: Application to the Legal and Legal-Practical Resources of Local and Regional Authorities”. The defense took place at the LIRMM (Laboratory of Computer Science, Robotics and Microelectronics of Montpellier) in Montpellier, France. The summary of the thesis follows.

Summary

This thesis examines the construction of knowledge graphs from texts, focusing primarily on information extraction from unstructured texts. The objective of this work is to explore various aspects of information extraction from specialized corpora in the legal domain. To this end, we divide this study into two subtasks: terminology extraction and relation extraction.

Terminology extraction aims to automatically identify the relevant terms in a given corpus of texts. From these terms, we then extract the relations that link them. Two approaches to relation extraction are conceivable: either determining in advance the types of relations that structure the terms, or using the context, notably the verbs or other action terms that link them (the field of OpenIE). We therefore address three main sub-problems: terminology extraction, typed relation extraction, and open relation extraction.

We introduce the terminology extraction system InfoGlean KeyTerms, composed of three modules: one for named entity recognition (NER), one for the extraction of relevant terms and text segments (KPE), and one for the extraction of legal entities. Expert annotations complemented the output of this system to build a terminological base. With this terminological base in place, we implemented two relation extraction systems: Relational Embeddings Model (REM) and GPT Open Relation EXtraction (GOREX). REM identifies typed relations between the extracted terms using the lexical network rezo-JDM: it represents each pair of related terms with a Word2Vec model, then classifies the type of relation. GOREX, on the other hand, follows the OpenIE principle by focusing on the verbs or action terms in the local context of the terms, and uses LLMs to perform this task.
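To make the idea behind typed relation extraction more concrete, here is a minimal sketch of classifying a pair of terms from their embeddings. It is not the REM implementation: the toy vectors stand in for Word2Vec embeddings, the relation labels `governs` and `is_a` are hypothetical, and a nearest-centroid classifier is used only because it is the simplest classifier to show. All names in the block are illustrative assumptions.

```python
from collections import defaultdict
from math import dist

# Toy 3-dimensional vectors standing in for Word2Vec embeddings.
# The values are invented for illustration only.
EMBEDDINGS = {
    "mayor": [0.9, 0.1, 0.0],
    "municipality": [0.8, 0.3, 0.1],
    "prefect": [0.85, 0.15, 0.05],
    "department": [0.75, 0.35, 0.1],
    "decree": [0.1, 0.9, 0.2],
    "regulation": [0.15, 0.8, 0.6],
    "bylaw": [0.12, 0.88, 0.22],
    "legal text": [0.2, 0.7, 0.65],
}

# Hypothetical labelled pairs: (head term, tail term, relation type).
TRAINING_PAIRS = [
    ("mayor", "municipality", "governs"),
    ("decree", "regulation", "is_a"),
]


def pair_vector(head, tail):
    """Represent a term pair as the concatenation of its two term vectors."""
    return EMBEDDINGS[head] + EMBEDDINGS[tail]


def train_centroids(examples):
    """Average the pair vectors of each relation type into one centroid."""
    grouped = defaultdict(list)
    for head, tail, label in examples:
        grouped[label].append(pair_vector(head, tail))
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify(head, tail, centroids):
    """Return the relation type whose centroid is closest to the pair vector."""
    vector = pair_vector(head, tail)
    return min(centroids, key=lambda label: dist(vector, centroids[label]))


if __name__ == "__main__":
    centroids = train_centroids(TRAINING_PAIRS)
    for head, tail in [("prefect", "department"), ("bylaw", "legal text")]:
        print(head, tail, classify(head, tail, centroids))
```

In a real setting, the vectors would come from a trained embedding model and the labelled pairs from a resource such as a lexical network; the classifier could be any supervised model.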

The analysis of the results revealed promising research avenues to be explored in future work for all systems. More specifically, the implementation of a hybrid relation extraction system could be an interesting path to explore.

The jury was composed of:

Examiners

Sandra BRINGAY, University Professor, Université Paul Valéry, LIRMM
Marianne HUCHARD, University Professor, University of Montpellier, LIRMM
Didier SCHWAB, University Professor, LIG, UMR 5217
Cassia TROJAHN, Senior Lecturer (HDR), UT2, IRIT – UMR 5505

Supervisors

Mokhtar Boumedyen BILLAMI, Berger-Levrault, co-supervisor
Mathieu LAFOURCADE, Senior Lecturer (HDR), University of Montpellier, Thesis Director

What's next?

Exploiting the knowledge present in raw text is difficult, which is why knowledge needs to be structured for rapid access. For Berger-Levrault, a publisher of business software for local authorities and industry alike, structuring knowledge has become a major challenge.

Berger-Levrault develops software solutions based on natural language processing (NLP), such as search engines, question-answering engines, automated legal intelligence, and many others. These solutions need to exploit the knowledge contained in Berger-Levrault’s textual data. Until now, however, they have only had access to this knowledge in its raw form. It is therefore necessary to structure this knowledge so that it can enrich and improve these applications. This thesis focuses on two aspects in particular: (1) the representation of all knowledge through a “knowledge structure”; and (2) the use of this knowledge to improve the performance of several Berger-Levrault products.
