The GermaParl R package includes only a small subset of the GermaParl corpus
(GERMAPARLMINI). The full corpus is deposited at Zenodo, an open science
repository for research data. The germaparl_download_corpus function
downloads a tarball with the indexed corpus from the Zenodo repository and
moves the corpus data to the system corpus storage. If a corpus registry has
not yet been created, an interactive dialogue will assist in creating one. A
stable internet connection is recommended when calling the function, as the
download is about 1 GB.
germaparl_download_corpus(
  doi = "https://doi.org/10.5281/zenodo.3742113",
  registry_dir = cwb_registry_dir(),
  corpus_dir = cwb_corpus_dir(registry_dir),
  verbose = interactive(),
  ask = interactive(),
  sample = FALSE
)
doi: The DOI (Digital Object Identifier) of the GermaParl tarball at Zenodo, presented as a hyperlink. Defaults to the latest version of GermaParl.
registry_dir: Path to the system registry directory. By default, cwbtools::cwb_registry_dir() is used to guess the registry directory. We recommend stating the registry directory explicitly.
corpus_dir: Directory where the data directories of corpora are located. By default, the directory is guessed using cwbtools::cwb_corpus_dir(). We recommend stating the directory explicitly.
verbose: A logical value, whether to show messages. Defaults to the value of interactive().
ask: A logical value, whether to ask for user input before replacing an existing corpus.
sample: A logical value, whether to download sample data (GERMAPARLSAMPLE) rather than the full corpus (GERMAPARL) for testing purposes.
Logical value. TRUE if the corpus has been installed successfully.
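As a rough sketch of the recommendation to state both directories explicitly, the call below installs the sample data into temporary directories. The paths under tempdir() are hypothetical placeholders, the requireNamespace() guard is an addition so that the snippet also runs where the package is not installed, and the download itself needs an internet connection.

```r
# Hypothetical target directories; replace them with your real CWB locations.
registry_dir <- file.path(tempdir(), "cwb", "registry")
corpus_dir <- file.path(tempdir(), "cwb", "indexed_corpora")
dir.create(registry_dir, recursive = TRUE, showWarnings = FALSE)
dir.create(corpus_dir, recursive = TRUE, showWarnings = FALSE)

# NA means that no installation was attempted.
installed <- NA
if (requireNamespace("GermaParl", quietly = TRUE)) {
  installed <- GermaParl::germaparl_download_corpus(
    registry_dir = registry_dir,
    corpus_dir = corpus_dir,
    verbose = TRUE,
    ask = FALSE,   # do not prompt before replacing an existing corpus
    sample = TRUE  # sample data (GERMAPARLSAMPLE) instead of the full corpus
  )
}
```

Setting sample = FALSE with the same directories would fetch the full corpus instead, at the cost of the larger download.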
After the tarball with the CWB-indexed corpus has been downloaded and
installed, the DOI and the corpus version are added to the registry file of
the GERMAPARL corpus. This information is then used to provide citation
information that matches the installed corpus version when calling
citation(package = "GermaParl").
An example of using the germaparl_download_corpus function is part of the
examples section of the overview documentation of the GermaParl package.