corpus_install() gives clearer and more detailed reports on the steps performed during corpus downloads. User dialogues have been reworked thoroughly to provide better guidance.
The use_corpus_registry_envvar() function is called by
corpus_install() and will amend the .Renviron file as appropriate if the user so desires.
corpus_testload() has been implemented to check whether a (newly installed) corpus is accessible.
jsonlite::fromJSON(). The auxiliary function to get and process information from Zenodo now ensures that newline characters are escaped such that they can be processed.
The corpus_copy() function did not set the path to the info file to the new data directory. Corrected.
The corpus_install() function failed when the
NULL value from the default call to
cwbtools::cwb_registry_dir(). But if the directories are created, the registry directory is there. Fixed.
registry_file_compose() when the path includes any whitespace characters.
The install_corpus() function has been reworked thoroughly. Using system directories for the registry and the corpus directory is now supported. This is a prerequisite for installing corpora outside of R packages. Installing corpora within packages is not allowed by CRAN.
cwb_corpus_dir()) will locate the registry directory and the corpus directory. In particular, they take into account that the polmineR package may have generated a temporary corpus registry, resetting the CORPUS_REGISTRY environment variable.
The install_corpus() function accepts an argument
doi to provide a Digital Object Identifier (DOI). At this stage, the DOI is assumed to have been issued by Zenodo. Information available at the Zenodo site is used to resolve the URL of a corpus tarball that can be downloaded. Upon installing a corpus from Zenodo, the DOI and the version number will be written as corpus properties into the registry file.
The corpus_install() function will ask the user for confirmation if the corpus to be installed is already present and would be deleted or overwritten.
use_corpus_registry_envvar() will assist users in creating the required directory structure for CWB-indexed corpora.
The pkg_add_corpus() function will now create the cwb directories (registry and data directory) if necessary. Previously, these directories were required to exist before moving a corpus into a package, making it necessary to put dummy files into packages to keep R CMD build from issuing warnings and git from dropping these directories. Creating the directories on demand is a precondition for a CRAN release of data packages (#11).
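Creating directories on demand can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper name ensure_cwb_dirs() and the subdirectory names are hypothetical.

```r
# Hypothetical helper: create a registry and a data directory below `cwb_dir`
# if they do not exist yet. The directory names are assumptions for illustration.
ensure_cwb_dirs <- function(cwb_dir) {
  dirs <- c(
    registry = file.path(cwb_dir, "registry", fsep = "/"),
    data = file.path(cwb_dir, "indexed_corpora", fsep = "/")
  )
  for (d in dirs) {
    # recursive = TRUE also creates missing parent directories
    if (!dir.exists(d)) dir.create(d, recursive = TRUE)
  }
  dirs
}

# Try it in a scratch directory below the session's temporary directory
cwb_dir <- file.path(normalizePath(tempdir(), winslash = "/"), "cwb_demo", fsep = "/")
created <- ensure_cwb_dirs(cwb_dir)
all(dir.exists(created))
```

Because the directories are created only when missing, calling the helper a second time leaves existing content untouched.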
matrix class will inherit from class
array. The new package version now takes into account that
length(class(matrix(1:4,2,2))) will return the value 2.
pkgdown::build_site() will generate a proper changelog page.
curl::curl_download() for Windows because curl apparently is not able to process target filenames that include special characters.
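The practical consequence of the class change is that code comparing class(x) to a single string no longer yields a single logical value. The sketch below shows two checks that do not depend on the length of the class vector. It illustrates the general issue and is not taken from the package source.

```r
m <- matrix(1:4, 2, 2)

# When class(m) has two elements, the comparison below yields a logical
# vector of length 2 and must not be used as an if() condition:
# class(m) == "matrix"

# Checks that do not depend on the length of class(m):
is_matrix_a <- is.matrix(m)
is_matrix_b <- inherits(m, "matrix")
```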
decode() method will turn a
Annotation object from the NLP package.
conll_get_regions() function will turn a CoNLL-style annotated token stream into a table with regions that can be encoded using
s_attribute_merge() will merge two
data.table objects defining s-attributes, checking for overlaps.
s_attribute_recode(), and supplementary
tempdir() is now wrapped as
normalizePath(tempdir(), winslash = "/") to avoid problems on Windows, where different file separators may be used.
file.path(), the argument
fsep is “/” to prevent confusion of file separators.
corpus_copy() is available to create a copy of a corpus.
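The two path conventions described above can be combined as in the following sketch. The directory name "my_corpus" is made up for illustration.

```r
# Normalise the session's temporary directory; on Windows, winslash = "/"
# makes normalizePath() return forward slashes instead of backslashes
tmp <- normalizePath(tempdir(), winslash = "/")

# Build a path with an explicit separator ("my_corpus" is a hypothetical name)
data_dir <- file.path(tmp, "my_corpus", fsep = "/")

# Check that no backslash is left in the resulting path
has_backslash <- grepl("\\", data_dir, fixed = TRUE)
```

Using one separator throughout keeps paths comparable as strings, which matters when a path is written to a file and later matched against a newly constructed one.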
cl_delete_corpus() from RcppCWB is added to
s_attribute_encode(), so that newly added s-attributes can be used without restarting the R session.
corpus_copy() was defined (and documented) twice in a confusing manner. This has been cleaned up.
installed.packages() were replaced to follow advice from the CRAN team during the submission process.
CorpusData$add_corpus_positions() (helper function .fn)
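The entry does not say what the calls were replaced with. One common lightweight alternative, shown here purely as an assumption, is to test for a single package with system.file() rather than listing every installed package. The helper name pkg_is_installed() is hypothetical.

```r
# Hypothetical helper: TRUE if `pkg` can be found in the library paths,
# without building the full table that installed.packages() returns.
pkg_is_installed <- function(pkg) {
  # system.file() returns "" when the package cannot be found
  nzchar(system.file(package = pkg))
}

pkg_is_installed("stats")               # part of every R installation
pkg_is_installed("notARealPackage123")  # assumed not to be installed
```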
install_corpus(), if argument tarball is specified. This is a precondition for passing arguments to download password-protected corpora.