SemEval is a series of international natural language processing (NLP) research workshops whose mission is to advance the current state of the art in semantic analysis and to help create high-quality annotated datasets in a range of increasingly challenging problems in natural language semantics. Each year’s workshop features a collection of shared tasks in which computational semantic analysis systems designed by different teams are presented and compared.
The 16th edition of SemEval features 12 tasks on a range of topics, including idiomaticity detection and embedding, sarcasm detection, multilingual news similarity, and linking mathematical symbols to their descriptions. Several tasks are multilingual, and others call for multimodal approaches.
Our team members chose two of the 12 tasks:
- Task 1: CODWOE – COmparing Dictionaries and WOrd Embeddings: The CODWOE shared task invites participants to compare two types of semantic description: dictionary glosses and word embedding representations. Are these two types of representation equivalent? Can we generate one from the other?
- Task 8: Multilingual news article similarity: This task differs from document similarity as it is usually conceived. Here, we are interested in the real-world happenings covered in the news articles, not their style of writing, political spin, tone, or any other more subjective “design decision” imposed by a medium or outlet. The main sub-dimensions of similarity are geolocation, time, shared entities, and shared narratives. This makes it possible to assess to what extent outlets write about “the same things”.
They chose these tasks because they are directly linked to several problems encountered at the DRIT concerning the semantic representation of textual elements.
The schedule is as follows:
- 10th of January 2022: Beginning of the evaluation
- 31st of January 2022: End of the evaluation
- 23rd of February 2022: Submission of scientific articles
- 31st of March 2022: Notification to authors
If the team is selected, its members will have the opportunity to take part in the NAACL (Annual Conference of the North American Chapter of the Association for Computational Linguistics) workshop planned for summer 2022 in Seattle, Washington. It would be a great opportunity to meet other scientists from all over the world and exchange knowledge.
The team members taking part in the challenge are Mokhtar Boumedyen BILLAMI, Christophe BORTOLASO, Sébastien DUFOUR, Camille GOSSET, Julien BRETON, Mehdi KANDI, Youssef MILOUDI, Lina NICOLAIEFF, Karim BOUTAMINE and Nihed BENDAHMAN.