1st International Workshop on Computational Approaches to Historical Language Change 2019

Nina Tahmasebi, Lars Borin, Adam Jatowt, Yang Xu

May 14, 2019

The workshop was co-located with ACL 2019 in Florence on August 2nd, 2019. We received over 50 submissions and had over 65 attendees. Thank you all for making the workshop such a success!

The full list of accepted papers, together with the posters and presentations, can be found here. The introduction and closing slides are here: LChange’19 Intro & Closing.

Schedule

August 2nd 2019

9:00–9:15 Introduction
9:15–10:30 Session 1
9:15–10:00 Haim Dubossarsky: Semantic Change in the Time of Machine Learning, Doing it Right! (Keynote)
10:00–10:30 From Insanely Jealous to Insanely Delicious: Computational Models for the Semantic Bleaching of English Intensifiers
10:30–10:45 Coffee Break
10:45–12:30 Session 2
10:45–11:15 Computational Analysis of the Historical Changes in Poetry and Prose
11:15–11:35 Studying Semantic Chain Shifts with Word2Vec: FOOD>MEAT>FLESH
11:35–11:55 Evaluation of Semantic Change of Harm-Related Concepts in Psychology
11:55–12:25 Contextualized Diachronic Word Representations
12:30–13:30 Lunch Break
13:30–14:30 Session 3
13:30–14:30 Claire Bowern: Semantic Change and Semantic Stability: Variation is Key (Keynote)
14:30–16:00 Session 4 (Poster Session with coffee)
16:00–16:40 Session 5
16:00–16:20 Modeling a Historical Variety of a Low-resource Language: Language Contact Effects in the Verbal Cluster of Early-Modern Frisian
16:20–16:40 Visualizing Linguistic Change as Dimension Interactions
16:40–17:15 Discussion and Closing

Important Dates

April 26, 2019: Paper submission
May 24, 2019: Notification of Acceptance
June 3, 2019: Camera-ready papers due
August 2, 2019: Workshop Date

Keynote Talk

Confirmed Speakers:
Claire Bowern (Professor of Linguistics at Yale University)
Title of talk: Semantic Change and Semantic Stability: Variation is Key (see also the corresponding paper)

Haim Dubossarsky (Research Fellow at University of Cambridge)
Title of talk: Semantic Change in the Time of Machine Learning, Doing it Right!

Workshop Topics

Human language changes over time, driven by the dual needs of adapting to ongoing sociocultural and technological development in the world and facilitating efficient communication. In particular, novel words are coined or borrowed from other languages, while obsolete words slide into obscurity. Similarly, words may acquire novel meanings or lose existing meanings. This workshop explores these phenomena by bringing to bear state-of-the-art computational methodologies, theories and digital text resources on exploring the time-varying nature of human language.

Although there exists rich empirical work on language change from historical linguistics, sociolinguistics and cognitive linguistics, computational approaches to the problem of language change, particularly how word forms and meanings evolve, have only begun to take shape over the past decade or so, with exemplary work on semantic change and lexical replacement. The motivation has long been related to search and understanding in diachronic archives. The emergence of long-term and large-scale digital corpora was the prerequisite, and it has resulted in a slightly different set of problems for this strand of study than those traditionally studied in historical linguistics. As an example, studies of lexical replacement have largely focused on named entity change (names of, e.g., countries and people that change over time) because of the large effect these name changes have on temporal information retrieval.

The aim of this workshop is three-fold. First, we want to provide pioneering researchers who work on computational methods, evaluation, and large-scale modelling of language change an outlet for disseminating cutting-edge research on topics concerning language change. Currently, researchers in this area publish in a wide range of venues, from computational linguistics to cognitive science and digital archiving. We want to use this workshop as a platform for sharing state-of-the-art research progress in this fundamental domain of natural language research.

Second, in doing so we want to bring together domain experts across disciplines. We want to connect those who have long worked on language change within historical linguistics and bring with them a deep understanding of general linguistic theories of language change; those who have studied change across languages and language families; those who develop and test computational methods for detecting semantic change and laws of semantic change; and those who need knowledge (of the occurrence and shape) of language change, for example, in digital humanities and computational social sciences where text mining is applied to diachronic corpora subject to lexical semantic change.

Third, the detection and modelling of language change using diachronic text and text mining raise fundamental theoretical and methodological challenges for future research in this area. The representativeness of text is a first critical issue; works using large diachronic corpora and computational methods for detecting change often claim to find changes that are universally true for a language as a whole. But the jury is still out on whether results derived from digital literature or newspapers accurately represent changes in language as a whole. We hope to engage corpus linguists, big-data scientists, and computational linguists to address these open issues. Besides these goals, this workshop will also support discussion on the evaluation of computational methodologies for uncovering language change. Verifying change using only positive examples of change often confirms a corpus bias rather than reflecting genuine language change. Larger quantities and higher quality of text over time result in the detection of more semantic change. In fact, several laws of semantic change have been proposed recently, and other authors have later shown that the detected effects are linked to frequency rather than to underlying semantic change. The methodological issue of evaluation, together with good evaluation test sets and standards, is of high importance to the research community. We aim to shed some light on these issues and encourage the community to collaborate to find solutions.

The work in semantic change detection¹ has, to a large extent, moved to (neural) embedding techniques in recent years. These methods have several drawbacks: they need very large datasets to produce stable embeddings, and all semantic information of a word is encoded in a single vector, which limits the possibility of studying word senses separately. A move towards multi-sense embeddings will most likely require even more text per time unit, which will limit the applicability of these methods to languages other than English and a few others. We want to bring about a discussion on the need for methods that can discriminate and disambiguate among a word's senses (meanings) and that can be used for resource-poor languages with little hope of acquiring the order of magnitude of words needed for creating stable embeddings, possibly using dynamic embeddings, which seem to require less text. Finally, knowledge of language change is useful not only on its own, but as a basis for other diachronic textual investigations and in search.
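
As a rough illustration of the single-vector embedding approach discussed above, the sketch below compares word vectors from two time periods. It is only a minimal example under stated assumptions, not a method prescribed by the workshop: the vectors are synthetic, the function names (procrustes_align, cosine_distance) are hypothetical, and orthogonal Procrustes alignment is just one commonly used way to make two separately trained embedding spaces comparable. Because each word has a single vector per period, the score says nothing about which sense of a word has changed.

```python
import numpy as np


def procrustes_align(source, target):
    """Map `source` onto `target` (rows = words, same order in both)."""
    # Orthogonal Procrustes: the orthogonal matrix R minimising
    # ||source @ R - target|| is U @ Vt, where U, S, Vt is the SVD
    # of source.T @ target.
    u, _, vt = np.linalg.svd(source.T @ target)
    return source @ (u @ vt)


def cosine_distance(a, b):
    """Row-wise cosine distance between two matrices of equal shape."""
    dots = np.sum(a * b, axis=1)
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return 1.0 - dots / norms


# Synthetic stand-ins for embeddings of two time periods (assumption:
# real vectors would come from a model trained separately on each period).
rng = np.random.default_rng(0)
vocab = [f"w{i}" for i in range(200)]
early = rng.normal(size=(len(vocab), 8))

# The later space is the earlier one under an arbitrary orthogonal map,
# mimicking the fact that independently trained spaces are not aligned.
orthogonal_map, _ = np.linalg.qr(rng.normal(size=(8, 8)))
late = early @ orthogonal_map

# Simulate one drastically changed word by flipping its vector (toy choice).
changed = vocab.index("w42")
late[changed] = -late[changed]

aligned = procrustes_align(early, late)
change = cosine_distance(aligned, late)

# Words ranked from most to least changed.
ranking = [vocab[i] for i in np.argsort(-change)]
print(ranking[:3])
```

In this toy setting the flipped word should rank first and all other words should score close to zero. With real corpora the scores are noisy and sensitive to corpus size and word frequency, which is exactly the evaluation problem raised in the preceding paragraphs.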

A digital humanities investigation into the living conditions of young women through history cannot rely on the word girl in English, as in the past the reference of girl also included young men. Automatic detection of language change is useful for many researchers outside of the communities that study the changes themselves and develop methods for their detection. By reaching out to these other communities, we can better understand how to utilize the results for further research and for presenting them to the interested public. In addition, we need good user interfaces and systems for exploring language changes in corpora, for example, to allow for serendipitous discovery of interesting phenomena. In addition to facilitating research on texts, information about language changes is used for measuring across-time document similarity, information retrieval from long-term document archives, the design of OCR algorithms and so on.

We invite original research papers from a wide range of topics, including but not limited to:

Automatic detection of semantic change and diachronic lexical replacement
Fundamental laws of language change
Computational theories and generative models of language change
Sense-aware (semantic) change analysis
Methodologies for resource-poor languages
Diachronic linguistic data visualization and online systems
Applications and implications of language change detection
Sociocultural influences on language change
Cross-linguistic and phylogenetic approaches to language change
Methodological aspects of, as well as datasets for, evaluation

Submissions

We accept three types of submissions: long papers, short papers and abstracts, all following the ACL 2019 style and the ACL submission policy.

Long papers may consist of up to eight (8) pages of content, plus unlimited references; short papers may consist of up to four (4) pages of content. Final versions will be given one additional page of content so that reviewers’ comments can be taken into account. Abstracts may consist of up to two (2) pages of content, plus unlimited references.

Submissions should be sent in electronic form, using the Softconf START conference management system. The submission site is now available.

The workshop is planned to last a full day. Submissions are open to all, and are to be submitted anonymously. All papers will be refereed through a double-blind peer review process by at least three reviewers with final acceptance decisions made by the workshop organizers. We plan to edit a book on the basis of extended workshop papers and are currently discussing the publication with a publisher.

Programme Committee

Nicholas A. Lester	Stian Rødven Eide	Bill Noble
Yvonne Adesam	Antske Fokkens	Kjetil Norvag
Rami Aly	Mats Fridlund	Ella Rabinovich
Avishek Anand	Michael Färber	Taraka Rama
Timothy Baldwin	Johannes Hellrich	Jacobo Rouces
Pierpaolo Basile	Simon Hengchen	Sylvie Saget
Barend Beekhuizen	Louise Holmer	Eyal Sagi
Meriem Beloucif	Mika Hämäläinen	Asad Sayeed
Klaus Berberich	Abhik Jana	Dominik Schlechtweg
Aleksandrs Berdicevskis	Péter Jeszenszky	Vidya Somashekarappa
Chris Biemann	Dirk Johannßen	Andreas Spitz
Damian Blasi	Richard Johansson	Ian Stewart
Ricardo Campos	Antti Kanner	Suzanne Stevenson
Annalina Caputo	Tom Kenter	Barbro Wallgren Hemlin
Brady Clark	Jey Han Lau	Susanne Vejdemo
Paul Cook	Liina Lindström	Mikael Vejdemo Johansson
Dana Dannells	Behrooz Mansouri	Melvin Wevers
Pavel Denisov	Animesh Mukherjee	Guanghao You
Haim Dubossarsky	Luis Nieto Piña	Yihong Zhang

Contact

Organizers: Nina Tahmasebi, Lars Borin, Adam Jatowt, and Yang Xu.

Anti-Harassment Policy

Our workshop highly values the open exchange of ideas, the freedom of thought and expression, and respectful scientific debate. We support and uphold the ACL Anti-Harassment policy, and any workshop participant should feel free to contact any of the workshop organisers or Priscilla Rasmussen in case of any issues.

¹ Often, work from the computational community takes a wider view of semantic change than traditional historical linguistics does, for example by including novel words and senses as well as changes to the senses themselves.