Due to changes in the personal and professional circumstances of the EXMARaLDA “team” (currently consisting of Thomas Schmidt and Kai Wörner), EXMARaLDA is going on indefinite vacation. From 01.07.2022 there will be no updates, maintenance or support for the software for the time being. We ask for your understanding and will post any news on this website.
The R package ExmaraldaR is intended to allow easy processing of (annotated) transcripts in R (R Core Team 2020). R is a free platform for statistical analysis, data preparation and Natural Language Processing. It thus offers numerous options that are also of interest for the study of spoken language. With the package, one or more annotated transcriptions (*.exb) can be read in. The annotated transcripts are converted into a table object that can be used for further work (see Figure 1).
Figure 1: Table object
Figure 2: Network graph from a fictitious file
The table contains a consecutive numbering of intonation phrases (IPs) based on the GAT2 conventions (Selting et al. 2009), the speaker abbreviation, the ID of the tier, the speaker name, the transcription text, the metadata from the speaker table (optional), the timestamps of the event and the annotations. The annotations are assigned directly to the transcribed text. Different annotation formats are possible (complex annotation tags that are separated or multiple annotation tiers). Descriptive tiers can also be integrated. In future, it should also be possible to transfer changes made in R or to the table (e.g. after an export to Excel) directly back into the underlying files to simplify post-processing. Templates for integration into R-Shiny applications can be requested if required (see Figure 2). If you have any questions or would like to test the package, please contact timo.schuermann@uni-meunster.de.
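To give a rough idea of what such a table looks like, here is a minimal sketch in Python. It is not the ExmaraldaR implementation and does not use the package's API. It assumes the usual layout of an EXMARaLDA basic transcription (a common timeline of `tli` elements, and `tier` elements with `type="t"` for transcription and `type="a"` for annotation). The function name `exb_to_rows`, the row keys and the sample data are made up for illustration, and annotations are only matched when they share the exact start and end points of a transcription event, which is a simplification.

```python
import xml.etree.ElementTree as ET

# Tiny fictitious transcription in the assumed .exb layout.
SAMPLE_EXB = """<basic-transcription>
  <basic-body>
    <common-timeline>
      <tli id="T0" time="0.0"/>
      <tli id="T1" time="1.5"/>
      <tli id="T2" time="3.0"/>
    </common-timeline>
    <tier id="TIE0" speaker="SPK0" category="v" type="t">
      <event start="T0" end="T1">hello there</event>
      <event start="T1" end="T2">how are you</event>
    </tier>
    <tier id="TIE1" speaker="SPK0" category="pos" type="a">
      <event start="T0" end="T1">greeting</event>
    </tier>
  </basic-body>
</basic-transcription>"""


def exb_to_rows(xml_text):
    """Return one dict per transcription event (hypothetical helper)."""
    root = ET.fromstring(xml_text)

    # Map timeline ids to absolute times; timeline items without a time are skipped.
    times = {
        tli.get("id"): float(tli.get("time"))
        for tli in root.iter("tli")
        if tli.get("time") is not None
    }
    tiers = list(root.iter("tier"))

    # Collect annotations, keyed by speaker and the event's start/end points.
    annotations = {}
    for tier in tiers:
        if tier.get("type") != "a":
            continue
        for event in tier.iter("event"):
            key = (tier.get("speaker"), event.get("start"), event.get("end"))
            annotations.setdefault(key, {})[tier.get("category")] = event.text or ""

    # Build one row per transcription event and attach matching annotations.
    rows = []
    for tier in tiers:
        if tier.get("type") != "t":
            continue
        for event in tier.iter("event"):
            start, end = event.get("start"), event.get("end")
            rows.append({
                "speaker": tier.get("speaker"),
                "tier_id": tier.get("id"),
                "text": event.text or "",
                "start_time": times.get(start),
                "end_time": times.get(end),
                "annotations": annotations.get((tier.get("speaker"), start, end), {}),
            })
    return rows


for row in exb_to_rows(SAMPLE_EXB):
    print(row)
```

Each row of the result corresponds to one transcribed event with its timestamps and any annotation that covers exactly the same interval. Real transcripts often contain annotations that span several events, which this sketch does not handle.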
A beta version, which is under constant development, can be found here:
There is a new version of the EXMARaLDA demo corpus: EXMARaLDA examples in thirteen languages, completely revised and with audio and video data according to the newest specifications. Victoria Beckham is still a heart-warming person, Rudi Völler still very angry and Ségolène Royal still somewhat upset.
EXMARaLDA is currently undergoing major surgery: the tools have to be adapted to work with 64-bit systems, OpenJDK and Java 11+. This is necessary, among other things, to make EXMARaLDA work with video support on macOS Catalina. After a lot of fiddling with the details, preview versions are now available on the Preview Download page.
As a positive side effect, video support on Windows is also improved: the new JavaFX player can now be used to work with MPEG-4 videos.
Chancellor Merkel in a parliamentary debate (FOLK_E_00390)
This version contains an extension of the Research and Teaching Corpus of Spoken German (FOLK), a 300-hour, 3-million-word corpus of transcribed audio and video recordings of spoken interaction from various private, institutional and public contexts.
Transcription in FOLK is carried out with FOLKER and orthographic normalisation with OrthoNormal, both of which are parts of the EXMARaLDA system.