Nijmegen Corpus of Casual French

Corpus collection

Collection of the corpus began in November 2007. Twenty-three confederates were recruited at the University of Paris 3 Sorbonne Nouvelle, either by e-mail or in person. Each confederate brought two friends to the recording session. The confederate and the two friends all had to meet the following requirements:

  • They knew the two other participants in the recording well.
  • They were of the same sex as the two other participants in the recording.
  • They had completed the secondary education cycle in France.
  • They had been raised in Central/Northern France.

The recordings took place in a sound-attenuated room at the Institut de Linguistique et Phonétique Générales Appliquées (ILPGA) in Paris. Each group of participants took part in one session of around 90 minutes. Each of the two naïve speakers participating in a conversation was recorded on a separate audio channel of a stereo signal. Confederates were not recorded. The participants were seated so that only the two naïve speakers were filmed. This is illustrated in the following image:


Casual speech was elicited in three parts. In Part 1, we pretended that the confederate's microphone did not work properly and asked the confederate to leave the room. This created an unexpected situation in which the naïve speakers did not know for certain whether the recording had begun. The conversation that the two naïve speakers then held was recorded for 20 minutes. Part 2, which lasted around 35 minutes on average, consisted of a free conversation between the confederates and their friends. Part 3 required participants to choose three questions from a list of general-interest questions and to negotiate a common position for their group. Part 3 also lasted 35 minutes on average.
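
Because each naïve speaker was recorded on a separate channel of a stereo signal, the two speakers can be separated by de-interleaving the channels. The following is a minimal sketch using only the Python standard library. It assumes 16-bit PCM WAV files, which the description above does not confirm; the function name split_stereo is hypothetical, and which speaker sits on which channel is not specified here.

```python
import array
import sys
import wave


def split_stereo(path):
    """Return (channel_0, channel_1) sample arrays from a stereo WAV file.

    Assumption: the file is 16-bit PCM with exactly two channels.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2:
            raise ValueError("expected a stereo (two-channel) file")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        raw = wav.readframes(wav.getnframes())

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        # WAV sample data is little-endian; swap on big-endian hosts.
        samples.byteswap()

    # Stereo frames are interleaved: ch0, ch1, ch0, ch1, ...
    return samples[0::2], samples[1::2]
```

Each returned array then holds the speech of a single speaker, so it can be analysed or transcribed independently of the other speaker's channel.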