Workshop on Automatic Detection of Language Change 2018

The workshop will be co-located with SLTC 2018 in Stockholm, on the 7th of November 2018.

Keynote: Lena Rogström, “Two small, brown, elevated warts that undoubtedly are the breathing holes”: A historical study of entomologic vocabulary in the Transactions of the Royal Swedish Academy of Sciences

Our world changes, and with it we change our language. We learn new words, add new meanings to existing words and change meanings to better describe our time and culture. We forget quickly, and when we look back, for example at old newspaper material, these lexical and semantic changes make it difficult to understand what was said. They also hinder us when we want to apply text mining to historical corpora, for example to track sentiments over time.

Automatic detection of language change is a field that has received increasing attention over the past decade. As digital corpora have grown larger and come to span longer time periods, and as new and powerful embedding technologies have emerged, methods for detecting, e.g., word sense change have become very popular.

Despite these initial efforts, we lack computational tools for studying lexical and semantic changes at a large scale. Current methods are limited in what they can find, and the (neural) word embedding methods that are state of the art in, e.g., sense change detection require sufficiently large datasets. For (historical) Swedish and most other languages, such data is not available: the datasets are fairly small and have a high error rate.

This workshop aims to bring together a community of researchers in Sweden who focus on different aspects of language change detection, from both a qualitative, manual perspective and a quantitative, automatic one. We believe that such a workshop is needed for Sweden to be at the forefront of this research, in particular since few others will aim to find solutions specific to Swedish. The workshop will include a poster session on general tools for language technology by SweClarin.

We invite presentations but no papers in this first round, to encourage participants from a wide range of fields. We intend a low-key workshop that will start with a keynote and continue with each participant getting a chance to present themselves and their work, to find possible collaborations (preferably across topics and fields) and to make better use of existing efforts and datasets. We hope to bring together technology providers, data providers, and users, such as researchers with an interest in historical texts but with expertise outside of language technology (digital humanities, historical linguistics, history of ideas, sociology, history, etc.). After a coffee break and the presentations, we will have one more keynote and continue with some discussions and a planning session for collaboration and a further workshop.

There will be no published proceedings, and the workshop is planned for half a day. To propose a presentation, please contact nina . tahmasebi at gu . se or yvonne . adesam at gu . se with a preliminary title and a short description (at most one written page). We are planning for presentations of 15-20 minutes for finished work and 5-10 minutes for ongoing work. Examples of welcome talks include:

  • You are a linguist, anthropologist, historian etc. who is interested in how a certain word or a certain concept developed and changed in the Swedish language or culture over time. You have done corpus studies, but mainly manually, and wish to explore whether there are computational methods or collaborations that could add more value to your research.
  • You are a quantitative researcher with an interesting method that finds patterns in, e.g., time series data, and you wonder whether there are good datasets and research questions to apply it to.
  • You are a linguist with a theory about how semantic change proceeds or how new words are added to a language, looking for new methods to falsify your hypotheses.

Confirmed Speakers: Susanne Vejdemo, Nina Tahmasebi, Lena Rogström

The workshop will be held in Aula Magna at Stockholm University’s Frescati campus. The room in Aula Magna is called Bergsmannen, and there will be signs showing you the way inside the building.

The workshop is organized by Språkbanken, Centre for Digital Humanities and SweClarin. Organizers: Nina Tahmasebi, Yvonne Adesam, Susanne Vejdemo.