User story: ExmaraldaR – Processing (Annotated) Transcripts in R

The R package ExmaraldaR is intended to make it easy to process (annotated) transcripts in R (R Core Team 2020). R is a free platform for statistical analysis, data preparation and Natural Language Processing, so it offers numerous options that are also of interest for the study of spoken language. With the package, one or more annotated transcriptions (*.exb) can be read in. Each annotated transcript is returned as a table object that can be used for further work (see Figure 1). The table contains a consecutive numbering of intonation phrases (IPs) based on the GAT2 conventions (Selting et al. 2009), the speaker abbreviation, the ID of the tier, the speaker name, the transcription text, the metadata from the speaker table (optional), the timestamp of the event and the annotations. The annotations are assigned directly to the transcribed text. Different annotation formats are possible (complex annotation tags


EADH Workshop: Annotation of digital oral data collections in the Humanities and Social Sciences

The workshop “Annotation of digital oral data collections in the Humanities and Social Sciences” is one of the nine workshops taking place during the conference “Data in Digital Humanities”, hosted by the European Association for Digital Humanities (EADH) at the National University of Ireland, Galway, on 9 December 2018. The workshop description reads as follows: In many scientific fields, ranging from phonetics, applied linguistics and discourse analysis to literary studies, sociology and history, among others, annotation is the common ground for systematic and empirical analysis of oral data. While the structure and theoretical basis of the annotation and the preferred methods of analysis may differ, the main aspects and the specific conditions pertaining to the modality of the data are shared across disciplines. In this workshop, we will first give an introduction to theoretical issues and frameworks relevant to annotation
