Ernestus Corpus of Spontaneous Dutch

The Ernestus Corpus of Spontaneous Dutch contains high-quality recordings of 10 conversations, each 90 minutes long, between friends or direct colleagues. The corpus was recorded between autumn 1995 and spring 1996 at the Institute of Phonetics of the University of Amsterdam.
Professional transcribers created the orthographic transcription by hand; the phonemic transcription was generated automatically. Both transcriptions are stored in Praat TextGrid format.
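
Because the transcriptions are TextGrid files, they can be read with a few lines of code. The following is a minimal sketch in Python, assuming the files use Praat's long ("full") text format and have already been decoded to a string. The tier name "words", the sample labels, and the function name read_intervals are hypothetical and chosen only for illustration; the corpus's actual tier names may differ.

```python
import re

# Matches one interval in Praat's long TextGrid text format.
# Praat escapes a double quote inside a label by doubling it ("").
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([\d.]+)\s*'
    r'xmax = ([\d.]+)\s*'
    r'text = "((?:[^"]|"")*)"'
)
TIER_NAME_RE = re.compile(r'name = "([^"]*)"')


def read_intervals(textgrid_text):
    """Return {tier_name: [(start, end, label), ...]} for interval tiers."""
    tiers = {}
    # Each tier starts with a header such as "item [1]:"; the chunk before
    # the first header is the file preamble and is skipped.
    for chunk in re.split(r'item \[\d+\]:', textgrid_text)[1:]:
        name_match = TIER_NAME_RE.search(chunk)
        if name_match is None:
            continue
        tiers[name_match.group(1)] = [
            (float(start), float(end), label.replace('""', '"'))
            for start, end, label in INTERVAL_RE.findall(chunk)
        ]
    return tiers


# Hypothetical sample, not taken from the corpus.
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 2.3
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "words"
        xmin = 0
        xmax = 2.3
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 1.0
            text = "hallo"
        intervals [2]:
            xmin = 1.0
            xmax = 2.3
            text = ""
'''

tiers = read_intervals(SAMPLE)
for start, end, label in tiers["words"]:
    print(start, end, repr(label))
```

This sketch handles interval tiers only and ignores point tiers and Praat's short text format. Praat may also save files as UTF-16, so the encoding should be checked when reading a file from disk.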

The corpus is available to academic researchers. To obtain a copy, contact Rian Zondervan by e-mail at Rian.Zondervan@mpi.nl.

A detailed description of the corpus is provided in:

  • M. Ernestus (2000). Voice assimilation and segment reduction in casual Dutch: A corpus-based study of the phonology-phonetics interface. Holland Institute of Generative Linguistics, Utrecht.

A description of the automatic generation of the phonemic transcription can be found in:

  • B. Schuppler, M. Ernestus, O. Scharenborg, & L. Boves (2011). Acoustic reduction in conversational Dutch: A quantitative analysis based on automatically generated segmental transcriptions. Journal of Phonetics 39, 96-109.