Coreference and thematic cores in the interviews of the corpus IS

Authors

  • Carolina Flinz, University of Milano
  • Josef Ruppenhofer, FernUniversität Hagen

DOI:

https://doi.org/10.6093/germanica.v0i33.10752

Keywords:

Israelkorpus, coreference, entities, thematic cores, cohesion

Abstract

This contribution aims to identify coreferences and thematic cores in the corpus Emigrantendeutsch in Israel (IS), using corpus analysis and manual annotation. We focus on two interviews: the one with Paul Avraham and Betti Alsberg (IS_00002) and the one with Clara Bartnitzki (IS_00008) (see 2.1). Our starting point is the hypothesis that in interviews some referents, such as the interviewees, appear independently of the thematic core under discussion, while other referents (and thus the members of their coreference chains) appear grouped in segments dedicated to specific thematic cores. Firstly, we believe that being able to create coreference chains automatically in narrative interviews is important because it makes it easier to identify and extract text segments in which certain entities or persons are mentioned. As already shown in Flinz/Ruppenhofer 2021, Named Entity Recognition (NER) systems can identify names of people or organizations. However, they usually do not link such mentions to coreferential mentions that take the form of ordinary nouns or pronouns (see Li et al. 2020; Nadeau/Sekine 2007). Only additional annotation can generate coreferential chains. Secondly, coreference can be linked to the segmentation of interviews by thematic cores. Since coreference contributes to the cohesion of a text (see Halliday/Hasan 1976), we hypothesize that the distribution of the elements of coreferential chains can provide clues to thematic segmentation. To this end, we also examine whether the coreference structures of the interviews reveal facets that are not otherwise easily discernible.

Published

2024-01-31

How to Cite

Flinz, C. and Ruppenhofer, J. (2024) “Coreference and thematic cores in the interviews of the corpus IS”, ANNALI. SEZIONE GERMANICA. Rivista del Dipartimento di Studi Letterari, Linguistici e Comparati dell’Università degli studi di Napoli L’Orientale, (33), pp. 383–414. doi: 10.6093/germanica.v0i33.10752.

Section

Special issue articles