cwbtools v0.3.0 ‘Apple Picker’ Released
A new version of cwbtools (v0.3.0) is available via CRAN, and I do not want to miss the chance to explain what is new.
First of all, the major new feature is that it is now possible to
download a tarball with a CWB indexed corpus from a server and to
install the corpus files in the general corpus storage. The whereabouts
of a corpus can also be stated via a Digital Object Identifier (DOI),
i.e. there is a new argument doi for the corpus_install() function.
At this stage, resolving the DOI to get the URL of the corpus tarball is implemented for DOIs issued by Zenodo (using the zen4R package, which is a new dependency). Zenodo is an open science repository hosted by CERN. Several corpora prepared in the PolMine Project have been published with Zenodo recently. As a result, it has never been easier and more reliable to install corpora.
For instance, to install the United Nations General Assembly (UNGA) corpus, use this function call:
install.packages("cwbtools") # cwbtools v0.3.0 required
library(cwbtools)
corpus_install(doi = "10.5281/zenodo.3831472")
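To make the DOI mechanics a bit more tangible, here is a minimal sketch in base R of how a Zenodo DOI can be taken apart. It is only an illustration: parse_zenodo_doi() is a hypothetical helper, not a cwbtools function, and cwbtools itself relies on zen4R to find the tarball URL. The sketch assumes the usual Zenodo DOI shape (prefix 10.5281, suffix zenodo.<record id>) and uses the public doi.org resolver.

```r
# Hypothetical helper, not part of cwbtools: split a Zenodo DOI into its parts.
parse_zenodo_doi <- function(doi) {
  # Zenodo DOIs use the prefix 10.5281 and a suffix of the form zenodo.<record id>
  pattern <- "^10\\.5281/zenodo\\.([0-9]+)$"
  if (!grepl(pattern, doi)) stop("not a Zenodo DOI: ", doi)
  list(
    doi = doi,
    record_id = sub(pattern, "\\1", doi),
    # Any DOI can be resolved through the doi.org proxy
    resolver_url = paste0("https://doi.org/", doi)
  )
}

unga <- parse_zenodo_doi("10.5281/zenodo.3831472")
unga$record_id    # "3831472"
unga$resolver_url # "https://doi.org/10.5281/zenodo.3831472"
```

Opening the resolver URL in a browser leads to the landing page of the deposited corpus.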
The usability of Zenodo for depositing data is outstanding. As a DOI is
issued upon uploading data, the service is convenient and appropriate
for scientific data at the same time. So we think that Zenodo is a very
good option for making corpora accessible (in line with the FAIR
principles). It is open to anybody who wants to publish data.
cwbtools::corpus_install() will download any tarball with a CWB indexed
corpus from Zenodo. Preparing and uploading the tarballs is not
difficult at all.
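As a rough idea of what preparing a tarball can look like, the following base R sketch packs a toy directory tree into a .tar.gz file. Everything in it is an assumption made for illustration: the corpus name mycorpus, the dummy files, and in particular the layout with top-level registry and indexed_corpora directories, which this post does not spell out. Please check the cwbtools documentation for the layout that corpus_install() actually expects.

```r
# Build a toy tarball in a temporary directory (dummy files, not a real corpus).
# ASSUMPTION: top-level 'registry' and 'indexed_corpora' directories; the post
# does not describe the layout corpus_install() expects.
corpus_id <- "mycorpus" # hypothetical corpus name
staging <- file.path(tempdir(), "corpus_staging")
dir.create(file.path(staging, "registry"), recursive = TRUE)
dir.create(file.path(staging, "indexed_corpora", corpus_id), recursive = TRUE)

# Placeholders standing in for the registry file and the binary data files
writeLines("## registry entry placeholder", file.path(staging, "registry", corpus_id))
writeLines("placeholder", file.path(staging, "indexed_corpora", corpus_id, "word.corpus"))

# Create the archive with paths relative to the staging directory
tarfile <- file.path(tempdir(), paste0(corpus_id, ".tar.gz"))
old_wd <- setwd(staging)
utils::tar(tarfile, files = c("registry", "indexed_corpora"), compression = "gzip")
setwd(old_wd)

# Inspect the archive before uploading it
contents <- utils::untar(tarfile, list = TRUE)
print(contents)
```

Listing the contents before uploading is a cheap way to catch a wrong directory level, which is otherwise only noticed at installation time.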
Concerning usability, user dialogues in the cwbtools package have been reworked thoroughly. We started to use the cli package to create a better command line interface. Beyond a nicer appearance and more informative messages, user dialogues that will guide a user through the installation of a corpus have been rewritten and extended.
There is new, strong support for storing corpora in the system corpus
storage. If the respective directory structure is not yet present, the
user will be guided through the process of creating all directories that
are needed. Last but not least, a user dialogue supports setting up the
.Renviron file, so that corpora are available across sessions without
further ado.
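For readers who wonder what such an entry in .Renviron amounts to, here is a minimal base R sketch. It is not the cwbtools dialogue: set_renviron_var() is a hypothetical helper, the variable name CORPUS_REGISTRY and the path /opt/cwb/registry are assumptions not stated in this post, and the demonstration writes to a temporary file rather than to the real ~/.Renviron.

```r
# Append an environment variable to an .Renviron-style file unless it is set already.
# Hypothetical helper for illustration; not a cwbtools function.
set_renviron_var <- function(name, value, renviron = "~/.Renviron") {
  renviron <- path.expand(renviron)
  lines <- if (file.exists(renviron)) readLines(renviron, warn = FALSE) else character()
  # Leave the file alone if the variable is already defined
  if (any(startsWith(lines, paste0(name, "=")))) return(invisible(FALSE))
  writeLines(c(lines, paste0(name, "=", value)), renviron)
  invisible(TRUE)
}

# Demonstrated on a temporary file so that the real ~/.Renviron stays untouched.
# ASSUMPTION: CORPUS_REGISTRY is the variable name; the post does not name it.
demo_file <- tempfile(fileext = ".Renviron")
added <- set_renviron_var("CORPUS_REGISTRY", "/opt/cwb/registry", renviron = demo_file)
added_again <- set_renviron_var("CORPUS_REGISTRY", "/other/path", renviron = demo_file)

# Load the file into the current session, as R does with ~/.Renviron at startup
readRenviron(demo_file)
Sys.getenv("CORPUS_REGISTRY")
```

Because R reads .Renviron at startup, an entry written once is picked up by every later session, which is what makes corpora available without further configuration.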
Quite some work has gone into the new release of cwbtools, and I am quite confident that the user experience is much better than before. As always, we will be happy to learn about your experiences and suggestions.
One final remark. Why is this release called “Apple Picker”? cwbtools v0.3.0 is about making downloading and installing corpora as comfortable as possible. I thought that an apple picker was a nice metaphor for that.