Leibniz-Zentrum Allgemeine Sprachwissenschaft Leibniz-Gemeinschaft



Minicourse: Introducing 'Translation Mining'

Organizer(s) Bert Le Bruyn & Henriëtte de Swart
Affiliation(s) Utrecht University
Start of event 28.02.2022, 10.00
End of event 02.03.2022, 12.30
Venue Hybrid

Introducing Translation Mining | Bert Le Bruyn & Henriëtte de Swart


In this mini-course, we discuss the methodologies that have been used in cross-linguistic semantics over the past two decades. The fundamental issue we are concerned with is the balance between data and theory: how much should we allow a theory based on one (set of) language(s) to guide our analysis of another (set of) language(s)? Until recently, the only way to build a theory of language was to proceed language by language, verifying or falsifying hypotheses based on previously studied languages. We argue that advances in parallel corpus research allow us to proceed in a more data-driven fashion and to question the status quo in how we make theoretical progress in cross-linguistic semantics.


This mini-course summarizes the methodological insights of Time in Translation, a Utrecht-based NWO-funded project (2017-2022) in which we’ve had the opportunity to explore the potential and the limitations of translation corpora for fine-grained cross-linguistic research. This has resulted in an approach to translation-corpus research that differs from previous ones in corpus architecture, analysis, and research cycle. We refer to this approach as Translation Mining.

The methodological insights will be illustrated on the basis of phenomena that we have dealt with in the project. These include reference (e.g., Bremmers et al. 2021) and tense/aspect (e.g., van der Klis et al. 2021). Language-wise, we’ll draw on Western European languages and add Mandarin as an example of a typologically more distant language.

Organization & Access

The mini-course consists of three sessions of about 2.5 hours each and adopts a hands-on approach. The first session introduces Translation Mining and situates it both within the broader paradigm of empirical methods and within the narrower paradigm of translation-corpus-based approaches. The second session presents two longer applications of Translation Mining, one on reference and one on tense/aspect. The third and final session reflects on what we can already achieve with translation corpora and looks ahead to what we hope to accomplish in the foreseeable future.

The class will be hybrid, which means participation is possible both online and in person. If you are interested in participating, please send an email with the subject line "ZAS Minicourse" to Bert Le Bruyn at B.S.W.LeBruyn@uu.nl. You will then be provided with access to the class meetings and materials.


Session 1: February 28 | 10.00-12.30

Session 2: March 1 | 10.00-12.30

Session 3: March 2 | 10.00-12.30

References in this announcement

Bremmers, D., Liu, J., van der Klis, M., & Le Bruyn, B. (2021). Translation Mining: Definiteness across Languages—A Reply to Jenks (2018). Linguistic Inquiry, 1-30. [available in OnlineEarly]

Van der Klis, M., Le Bruyn, B., & de Swart, H. (2021). A multilingual corpus study of the competition between past and perfect in narrative discourse. Journal of Linguistics, 1-35. [available in FirstView]