Nijmegen Corpus of Casual Czech

Corpus Collection

The creation of the corpus was initiated in November 2008. Twenty confederates were recruited at Charles University by e-mail or in person. Each confederate brought two friends to the recording session. The confederate and both friends all met the following requirements:

  • They knew the two other participants in the recording well.
  • They were of the same sex as the two other participants in the recording.
  • They had finished high school.
  • They had been raised in Prague or the central part of Bohemia.
  • They were between 19 and 26 years old.

The recordings took place in a sound-attenuated room at the Phonetic Institute of Charles University in Prague in sessions of around 90 minutes for each group of participants. Each of the two naive speakers participating in a conversation was recorded in a separate audio channel of a stereo signal, while the confederate was recorded separately in a mono audio stream. The participants were placed in such a way that only the two naive speakers were filmed. This is illustrated in the following image:

Casual speech was elicited in three parts. Before the start of Part 1, the confederate pretended to have received an important message and left the room. This created an unexpected situation for the naive speakers, who did not know with certainty whether the recording had begun. The conversation then held by the two naive speakers was recorded for 18 minutes on average (depending on the liveliness of their conversation). Part 2, which lasted about 45 minutes on average, consisted of a free conversation between the confederate and the two friends. Part 3 required participants to choose three questions from a list of general-interest questions and to negotiate a common position for their group. Part 3 had an average duration of 30 minutes.
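
Because each naive speaker was recorded on a separate channel of the stereo signal, a per-speaker track can be recovered by deinterleaving the samples. The following is a minimal sketch only. It assumes 16-bit PCM audio, and the function name `split_stereo` is hypothetical; the corpus documentation above does not state the sample format or which channel corresponds to which speaker.

```python
import array


def split_stereo(frames: bytes) -> tuple[array.array, array.array]:
    """Deinterleave 16-bit stereo PCM frames into two mono sample arrays.

    Assumption: samples are signed 16-bit and interleaved as
    first-channel, second-channel, first-channel, second-channel, ...
    """
    samples = array.array("h")      # "h" = signed 16-bit integer
    samples.frombytes(frames)
    first_channel = samples[0::2]   # every even-indexed sample
    second_channel = samples[1::2]  # every odd-indexed sample
    return first_channel, second_channel


# Small synthetic example: three stereo frames.
interleaved = array.array("h", [1, -1, 2, -2, 3, -3])
speaker_a, speaker_b = split_stereo(interleaved.tobytes())
```

For a real file, the raw frames could be read with the `readframes` method of Python's standard `wave` module before being passed to such a function.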