3rd International Workshop on Computational Approaches to Historical Language Change 2022 (LChange'22)

The 3rd LChange workshop will be colocated with ACL 2022 in Dublin, Ireland, as a hybrid event. If you are having trouble connecting to GatherTown, you can try Discord: https://discord.gg/SvNjyftg . If you have questions for Nikolay and the authors of the best shared task papers, they will be on Discord. The workshop takes place on May 26-27: the first day is devoted to workshop papers, and the second day to the shared task on semantic change detection for Spanish. Our first confirmed keynote speaker is Professor Dirk Geeraerts; our second keynote speaker is Dominik Schlechtweg. The workshop builds upon its first iteration in 2019 and its second edition in 2021. We hope to make the most of the hybrid format for meeting both in person and online, and look forward to meeting you all!

The call for papers is similar to last time: all aspects of computational approaches to historical language change, with a focus on digital text corpora. LChange'19 resulted in a book on Computational approaches to semantic change, and this year we are considering a journal special issue for invited workshop papers.

We will offer mentoring for PhD students and young researchers in one-on-one meetings on the second day of the workshop. If you are interested, send us a short description of your work and we will set you up with one of the organizers of this workshop. If your paper is rejected from the workshop, we can also provide advice on improving it for future submission. Mentoring places are limited, and mentees will be chosen based on topical fit and availability of appropriate mentors. The deadline for applying for mentorship is March 30th, via email.

Important Dates

  • March 4, 2022: Paper submission
  • March 14, 2022: Task description papers
  • March 30, 2022: Notification of acceptance
  • March 30, 2022: Deadline for mentorship application
  • April 10, 2022: Camera-ready papers due
  • May 26-27, 2022: Workshop date

Programme

The workshop will take place as a hybrid event on May 26-27. All times in the programme are (GMT+1) -- Local Time in Dublin, Ireland.

Day 1 -- May 26th

Start–End Title Author(s) Link(s)
9:00–9:15 Introduction Workshop organisers
SESSION 1 Chair: Mario Giulianelli
9:15–9:40 Low Saxon dialect distances at the orthographic and syntactic level Janine Siewert, Yves Scherrer, Martijn Wieling
9:40–10:05 Phonological Reconstruction Using Trimmed Alignments and Sound Correspondence Patterns Johann-Mattis List, Robert Forkel, Nathan Hill
10:05–10:35 What is Done is Done: an Incremental Approach to Semantic Shift Detection Francesco Periti, Alfio Ferrara, Stefano Montanelli, Martin Ruskov
10:35–11:05 BREAK
SESSION 2 Chair: Nina Tahmasebi
11:05–11:30 Lexicon of Changes: Towards the Evaluation of Diachronic Semantic Shift in Chinese Jing CHEN, Emmanuele Chersoni, Chu-Ren Huang
11:30–12:00 Do Not Fire the Linguist: Grammatical Profiles Help Language Models Detect Semantic Change Mario Giulianelli, Andrey Kutuzov, Lidia Pivovarova
12:00–12:30 Using Cross-Lingual Part of Speech Tagging for Partially Reconstructing the Classic Language Family Tree Model Anat Samohi, Daniel Weisberg Mitelman, Kfir Bar
12:30–14:00 LUNCH / BREAK
KEYNOTE 1, Moderator: Andrey Kutuzov
14:00–15:00 Can historical semantics save lives? (And other questions for computational diachronic semantics) Prof. Dirk Geeraerts
15:00–15:30 Deconstructing destruction: A Cognitive Linguistics perspective on a computational analysis of diachronic change Karlien Franco, Mariana Montes, Kris Heylen
15:30–17:00 On-site poster session and coffee See list below
17:00 Closing and dinner/drinks for anyone who wants to join

Day 2 -- May 27th

Start–End Title Author(s) Link(s)
9:30–9:40 Introduction Workshop organisers
KEYNOTE 2, Moderator: Syrielle Montariol
09:40–10:40 Human and Computational Measurement of Lexical Semantic Change Dominik Schlechtweg
10:40–11:05 BREAK
11:05–11:25 Shared task description paper (LSCDiscovery: A shared task on semantic change discovery and detection in Spanish) Frank Zamora-Reina, Felipe Bravo-Marquez, Dominik Schlechtweg
11:25–11:45 Best shared task paper: DeepMistake at LSCDiscovery: Can a Multilingual Word-in-Context Model Replace Human Annotators? Daniil Homskiy and Nikolay Arefyev
11:45–12:05 Best shared task paper: GlossReader at LSCDiscovery: Train to Select a Proper Gloss in English -- Discover Lexical Semantic Change in Spanish Maxim Rachinskiy and Nikolay Arefyev
12:35–14:30 LUNCH/BREAK
Session 2
14:30–15:30 VIRTUAL POSTER SESSION + COFFEE See list below
15:00–16:00 Mentoring
16:00 Closing remarks


Poster presentations:

  • Language Acquisition, Neutral Change, and Diachronic Trends in Noun Classifiers -- Aniket Kali, Jordan Kodner
  • From qualifiers to quantifiers: semantic shift at the paradigm level -- Quentin Feltgen
  • Explainable Publication Year Prediction of Eighteenth Century Texts with the BERT Model -- Iiro Rastas, Yann Ciarán Ryan, Iiro Tiihonen, Mohammadreza Qaraei, Liina Repo, Rohit Babbar, Eetu Mäkelä, Mikko Tolonen, Filip Ginter
  • Caveats of Measuring Semantic Change of Cognates and Borrowings using Multilingual Word Embeddings -- Clémentine Fourrier, Syrielle Montariol
  • "Vaderland", "Volk" and "Natie": Semantic Change Related to Nationalism in Dutch Literature Between 1700 and 1880 Captured with Dynamic Bernoulli Word Embeddings -- Marije Timmermans, Eva Vanmassenhove, Dimitar Shterionov
  • Using neural topic models to track context shifts of words: a case study of COVID-related terms before and after the lockdown in April 2020 -- Olga Kellert, Md Mahmud uz Zaman
  • Roadblocks in Gender Bias Measurement for Diachronic Corpora -- Saied Alshahrani, Esma Wali, Abdullah R Alshamsan, Yan Chen, Jeanna Matthews
  • A Multilingual Benchmark to Capture Olfactory Situations over Time -- Stefano Menini, Teresa Paccosi, Sara Tonelli, Marieke van Erp, Inger Leemans, Pasquale Lisena, Raphael Troncy, William Tullett, Ali Hürriyetoğlu, Ger Dijkstra, Femke Gordijn, Elias Jürgens, Josephine Koopman, Aron Ouwerkerk, Sanne Steen, Inna Novalija, Janez Brank, Dunja Mladenic, Anja Zidar
  • Penn-Helsinki Parsed Corpus of Early Modern English: First Parsing Results and Analysis (NAACL2022 Findings) -- Seth Kulick, Neville Ryant, Beatrice Santorini
  • Slangvolution: A Causal Analysis of Semantic Change and Frequency Dynamics in Slang (ACL2022) -- Daphna Keidar, Andreas Opedal, Zhijing Jin, Mrinmaya Sachan

Shared task poster presentations:

  • HSE at LSCDiscovery in Spanish: Clustering and Profiling for Lexical Semantic Change Discovery -- Kseniia Kashleva, Alexander Shein, Elizaveta Tukhtina and Svetlana Vydrina
  • CoToHiLi at LSCDiscovery: the Role of Linguistic Features in Predicting Semantic Change -- Ana Sabina Uban, Alina Maria Cristea, Anca Daniela Dinu, Liviu P Dinu, Simona Georgescu and Laurentiu Zoicas
  • UAlberta at LSCDiscovery: Lexical Semantic Change Detection via Word Sense Disambiguation -- Daniela Teodorescu, Spencer von der Ohe and Grzegorz Kondrak
  • BOS at LSCDiscovery: Lexical Substitution for Interpretable Lexical Semantic Change Detection -- Artem Kudisov and Nikolay Arefyev

Sponsors

We gratefully acknowledge the contribution of iguanodon.ai as gold sponsor.

Student sponsorship:

Thanks to iguanodon.ai, we are sponsoring the registration fees for the ACL conference, including the yearly ACL membership fee, for four students.

Workshop Topics

This workshop explores state-of-the-art computational methodologies, theories and digital text resources for exploring the time-varying nature of human language.

The aim of this workshop is three-fold. First, we want to provide pioneering researchers who work on computational methods, evaluation, and large-scale modelling of language change an outlet for disseminating cutting-edge research on topics concerning language change. We want to utilize this proposed workshop as a platform for sharing state-of-the-art research progress in this fundamental domain of natural language research.

Second, in doing so we want to bring together domain experts across disciplines by connecting researchers in historical linguistics with those who develop and test computational methods for detecting semantic change and laws of semantic change, and with those who need knowledge of the occurrence and shape of language change, for example in digital humanities and computational social sciences, where text mining is applied to diachronic corpora subject to, e.g., lexical semantic change.

Third, the detection and modelling of language change using diachronic text and text mining raise fundamental theoretical and methodological challenges for future research.

Besides these goals, this workshop will also support discussion on the evaluation of computational methodologies for uncovering language change. SemEval2020 Task1 on unsupervised detection of lexical semantic change attracted three-figure submission numbers and a total of 21 submitted system papers. Since then, two more tasks have been completed, and we will organize the shared task on Spanish as a part of this workshop. The timeline for the shared task will be released shortly.

We invite original research papers from a wide range of topics, including but not limited to:

  • Novel methods for detecting diachronic semantic change and lexical replacement
  • Automatic discovery and quantitative evaluation of laws of language change
  • Computational theories and generative models of language change
  • Sense-aware (semantic) change analysis
  • Diachronic word sense disambiguation
  • Novel methods for diachronic analysis of low-resource languages
  • Novel methods for diachronic linguistic data visualization
  • Novel applications and implications of language change detection
  • Quantification of sociocultural influences on language change
  • Cross-linguistic, phylogenetic, and developmental approaches to language change
  • Novel datasets for cross-linguistic and diachronic analyses of language

Shared Task

The shared task on semantic change detection for Spanish will have two subtasks organized in two phases respectively:

  • discovery, and
  • binary change detection (detection of sense gain or loss vs. neither).

Note that discovery introduces additional difficulties for models compared to the simpler task of semantic change detection, e.g. because a large number of predictions is required and the target words are not preselected, balanced or cleaned. Yet discovery is an important task, with applications such as lexicography, where dictionary makers aim to cover the full vocabulary of a language.

The full task description is available on CodaLab.

Task organizers: Frank D. Zamora-Reina, Felipe Bravo-Marquez, Dominik Schlechtweg.

Keynote Talks

We have two confirmed keynote speakers: Prof. Dirk Geeraerts and Dominik Schlechtweg.

Dirk Geeraerts (KU Leuven)
Title of talk: Can historical semantics save lives? (And other questions for computational diachronic semantics)
Abstract: Drawing on a number of (methodologically non-computational) diachronic semantic studies that I have carried out at various points over the past decades, I would like to draw attention to three issues that have so far played only a secondary role in the booming field of computational diachronic semantics but that might provide some inspiration for a further expansion: first, the double-sided status of textual interpretation, which can feature both as a descriptive target and as a methodological source in historical semantics; second, the relevance of incorporating an onomasiological dimension in the definition of semantic change; and third, the distinction between generalizations about semantic change that are formulated in terms of structural and functional features (like isomorphism or frequency) and generalizations that correlate semantic changes with external phenomena (like societal changes).

Dominik Schlechtweg (University of Stuttgart/University of Texas, Austin)
Title of talk: Human and Computational Measurement of Lexical Semantic Change
Abstract: Human language changes over time. This change occurs on several linguistic levels such as grammar, sound or meaning. The study of meaning changes on the word level is often called Lexical Semantic Change (LSC) and is traditionally approached either from an onomasiological perspective, asking by which words a meaning can be expressed, or from a semasiological perspective, asking which meanings a word can express over time. In recent years, the task of automatic detection of semasiological LSC from textual data has been established as a proper field of computational linguistics under the name of Lexical Semantic Change Detection (LSCD). Two main factors have contributed to this development: (i) the *digital turn* in the humanities has made large amounts of historical texts available in digital form; (ii) new *computational models* have been introduced that efficiently learn semantic aspects of words solely from text. One of the main motivations behind the work on LSCD is its application in historical semantics and historical lexicography, where researchers are concerned with the classification of words into categories of semantic change. Automatic methods have the advantage of producing semantic change predictions for large amounts of data in small amounts of time and could thus considerably decrease human efforts in the mentioned fields, while being able to scan more data and thus to uncover more semantic changes which are at the same time less biased towards ad hoc sampling criteria used by researchers. On the other hand, automatic methods may also be harmful when their predictions are biased, i.e., they may miss numerous semantic changes or label words as changing which are not. Results produced in this way may then lead researchers to make empirically inadequate generalizations on semantic change.
Hence, automatic change detection methods should not be trusted until they have been evaluated thoroughly and their predictions have been shown to reach an acceptable level of correctness. Despite the rapid growth of LSCD as a field, a solid evaluation of the wealth of proposed models was still missing in 2017. The reasons were multiple, but most importantly there was no annotated benchmark test set available. In this talk I will describe the work done for my PhD over the last five years, aimed at standardizing the evaluation of LSCD models.

Submissions

We accept three types of submissions: long papers and short papers, both following the ACL2022 style and the ACL submission policy, and shared task papers.

Long papers may consist of up to eight (8) pages of content plus unlimited references; short papers may consist of up to four (4) pages of content. Final versions will be given one additional page of content so that reviewers' comments can be taken into account. Shared task papers may consist of up to four (4) pages plus unlimited references, but without an additional page upon acceptance. Overleaf templates are available.

Submissions should be made electronically, using ACL Rolling Review (ARR). The submission site is now open.

The workshop is planned to last two full days. Submissions are open to all and must be submitted anonymously. Workshop papers will be refereed through a double-blind peer review process by at least three reviewers, with final acceptance decisions made by the workshop organizers. Shared task participants can choose to submit their papers to the shared task, where papers will be reviewed by other task participants, or to the main workshop, where they will be reviewed according to the normal workshop procedure.

Contact

Contact us if you have any questions.

Organisers: Nina Tahmasebi, Lars Borin, Simon Hengchen, Syrielle Montariol, Haim Dubossarsky and Andrey Kutuzov.

Anti-Harassment Policy

Our workshop highly values the open exchange of ideas, the freedom of thought and expression, and respectful scientific debate. We support and uphold the ACL Anti-Harassment policy, and any workshop participant should feel free to contact any of the workshop organisers or Priscilla Rasmussen in case of any issues.
