A Corpus of Plenary Protocols
A short introduction to MigParl is also available in German.
Text is central to political discourse. Making text resources available in a format that is useful for both academia and the public is a prerequisite for many applications. With MigParl, we provide one such resource. This document introduces the corpus. We outline the available data, the preparation process for corpora of parliamentary debates, and the selection strategy used to obtain a thematically coherent corpus of debates on migration and integration.
The corpus is stored in the open-access repository Zenodo. A brief introduction to downloading and using the data is provided in the section below on this page. A more detailed introduction is presented in chapter 4.
The data comes with a CC BY+SA license. If you work with the MigParl corpus, please attribute the language resource by including in your bibliography the reference that corresponds to the corpus version you are using. The citation for the most recent version of the corpus is:
Blätte, Andreas and Leonhardt, Christoph (2020). MigParl. A Corpus of Speeches on Migration and Integration in Germany’s Regional Parliaments. Version: 2020.01.27. DOI: https://zenodo.org/api/records/3872263.
A Very Brief Practical Introduction
A CWB-Corpus for polmineR
The MigParl corpus is designed to be used with polmineR as a toolset for various standard qualitative and quantitative tasks in text analysis (count, dispersion, ngrams, cooccurrences, viewing concordances as well as going back to the original full-text). Using polmineR, you can easily generate data structures (such as term-document matrices) that are required as input for advanced statistical procedures.
To harness the full potential of MigParl, make sure to have polmineR installed.
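The installation command is not shown on this page. A minimal sketch, assuming polmineR is available from your configured CRAN mirror, could look like this:

```r
# Install polmineR from CRAN, but only if it is not installed yet.
# Assumption: a CRAN mirror is configured for this R session.
if (!requireNamespace("polmineR", quietly = TRUE)) {
  install.packages("polmineR")
}
```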
To install the corpus, the R package cwbtools, which belongs to the PolMine family of packages, needs to be installed as well. Among other things, it facilitates the management of corpora.
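As with polmineR, the following is only a sketch of how cwbtools might be installed, assuming it is available from your configured CRAN mirror:

```r
# Install cwbtools from CRAN, but only if it is not installed yet.
# Assumption: a CRAN mirror is configured for this R session.
if (!requireNamespace("cwbtools", quietly = TRUE)) {
  install.packages("cwbtools")
}
```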
Afterwards, the corpus can be installed from the Zenodo repository. Multiple versions of the corpus are available on Zenodo. To install the most recent version, the Digital Object Identifier (DOI) of that version must be specified.
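The call that installs the corpus is not reproduced on this page. The sketch below uses `corpus_install()` from cwbtools with its `doi` argument. The DOI string is an assumption: it is built from the Zenodo record number 3872263 given in the citation above, following the usual Zenodo pattern `10.5281/zenodo.<record>`. Please verify it against the Zenodo landing page before relying on it. The variable name `migparl_doi` is purely illustrative.

```r
# Assumption: DOI derived from the Zenodo record number 3872263 cited above.
migparl_doi <- "https://doi.org/10.5281/zenodo.3872263"

# Download the corpus tarball from Zenodo and install it locally.
# This requires an internet connection and may take a while.
cwbtools::corpus_install(doi = migparl_doi)
```

To install an older version of the corpus, the DOI of that specific version would be passed instead.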
To check whether the installation has been successful, run the following commands. For further instructions, see the documentation of polmineR.
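The commands themselves are missing from this page. A plausible reconstruction, using the polmineR functions `corpus()` and `size()`, is:

```r
library(polmineR)

# List the corpora polmineR can see; MIGPARL should be among them.
corpus()

# Report the number of tokens in the MIGPARL corpus.
size("MIGPARL")
```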
##    corpus     size template
## 3 MIGPARL 51470707     TRUE
##  51470707
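Once the corpus is installed, some of the standard tasks mentioned earlier (counting, concordances, cooccurrences) can be tried out. The following is a minimal sketch rather than a recommended workflow. The query terms "Migration" and "Integration" are chosen for illustration only.

```r
library(polmineR)

# Absolute and relative frequencies for two illustrative query terms.
counts <- count("MIGPARL", query = c("Migration", "Integration"))
print(counts)

# Keyword-in-context view (concordances) for one term.
kwic("MIGPARL", query = "Integration")

# Words that co-occur with the query term more often than expected.
cooccurrences("MIGPARL", query = "Integration")
```

Calling `s_attributes("MIGPARL")` lists the structural attributes (metadata) of the corpus, which is a useful starting point for subsetting or for computing dispersions.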
We hope that MigParl (and polmineR) will inspire your research and make it more productive. We would be glad to learn what you do with the data, and make your blog entries or publications visible.
Please do not forget to bring any issues you come across to our attention. We cordially invite you to use the GitHub issues of this documentation to report bugs and shortcomings and to suggest enhancements. Improving data quality is an important concern of the PolMine Project, which is why the data is versioned. The resource will benefit from its community of users and your feedback!