The Corpus Workbench (CWB) uses a registry directory with plain text files that describe corpora in a standardized format. The binary files of a corpus are stored in a data directory that is defined in the corpus's registry file. The registry and data_dir functions return the respective directories within a package if the argument pkg is used, or the temporary registry and data directories in the per-session temporary directory if pkg is NULL (the default).

registry_move(corpus, registry, registry_new, home_dir_new)

registry(pkg = NULL)

data_dir(pkg = NULL)

Arguments

corpus

The ID of the corpus for which the registry file shall be moved.

registry

The old registry directory.

registry_new

The new registry directory.

home_dir_new

The new home directory.

pkg

A character string with the name of a single package; if NULL (default), the temporary registry or data directory is returned.

Value

A path to a (registry or data) directory, or NULL if the package does not exist or does not include a corpus.

Details

registry_move is an auxiliary function that creates a copy of a registry file in the directory specified by the argument registry_new.
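
As a rough illustration of what such a copy involves, the base R sketch below writes a registry file into a new directory and points its HOME line at a new data directory. The helper name copy_registry_file, the sample registry content and the rewriting of the HOME line are assumptions made for illustration (the home_dir_new argument suggests this behaviour, but the text does not spell it out); this is not the polmineR implementation of registry_move.

```r
# Hypothetical helper (not part of polmineR): copy a registry file and
# point its HOME entry at a new data directory.
copy_registry_file <- function(corpus, registry, registry_new, home_dir_new) {
  lines <- readLines(file.path(registry, corpus), warn = FALSE)
  is_home <- grepl("^HOME\\s", lines)
  lines[is_home] <- paste("HOME", home_dir_new)
  if (!dir.exists(registry_new)) dir.create(registry_new, recursive = TRUE)
  target <- file.path(registry_new, corpus)
  writeLines(lines, target)
  invisible(target)
}

# Self-contained demonstration with a made-up registry file
old_registry <- file.path(tempdir(), "registry_old")
new_registry <- file.path(tempdir(), "registry_new")
dir.create(old_registry, showWarnings = FALSE)
writeLines(
  c('NAME "Demo corpus"', "ID demo", "HOME /old/path/demo", "ATTRIBUTE word"),
  file.path(old_registry, "demo")
)
copied <- copy_registry_file("demo", old_registry, new_registry, "/new/path/demo")
readLines(copied)
```

The original file in old_registry is left untouched; only the copy carries the new HOME entry.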

Upon loading the polmineR package, there is a check whether the environment variable CORPUS_REGISTRY is defined. In case it is, the registry files in the directory defined by the CORPUS_REGISTRY environment variable are copied to the temporary registry directory, which serves as the central place to store all registry files for all corpora, be it system corpora, corpora included in R packages, or temporary corpora.
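
The copying step described above can be sketched in base R as follows. The directory names, the made-up registry file and the explicit setting of CORPUS_REGISTRY are assumptions that keep the example self-contained; the code polmineR actually runs on load may differ.

```r
# Sketch: copy registry files from a system registry into a session registry.
# Both directories are created here for demonstration purposes only.
system_registry <- file.path(tempdir(), "system_registry")
session_registry <- file.path(tempdir(), "session_registry")
dir.create(system_registry, showWarnings = FALSE)
dir.create(session_registry, showWarnings = FALSE)
writeLines("ID demo", file.path(system_registry, "demo"))  # made-up registry file

# Remember the current value so it can be restored afterwards
old_value <- Sys.getenv("CORPUS_REGISTRY", unset = NA)
Sys.setenv(CORPUS_REGISTRY = system_registry)

# Copy only if the environment variable is defined and points to a directory
source_dir <- Sys.getenv("CORPUS_REGISTRY")
if (nzchar(source_dir) && dir.exists(source_dir)) {
  registry_files <- list.files(source_dir, full.names = TRUE)
  file.copy(registry_files, session_registry, overwrite = TRUE)
}
list.files(session_registry)

# Restore the previous state of the environment variable
if (is.na(old_value)) {
  Sys.unsetenv("CORPUS_REGISTRY")
} else {
  Sys.setenv(CORPUS_REGISTRY = old_value)
}
```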

The Corpus Workbench may have problems coping with a registry path that includes non-ASCII characters. On Windows, a call to utils::shortPathName will generate the short MS-DOS path name, which circumvents the resulting problems.
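
A minimal sketch of this workaround is given below. The helper names has_non_ascii and safe_registry_path are hypothetical, and the byte-level check for non-ASCII characters is one possible approach rather than the one used by polmineR. utils::shortPathName is only available on Windows, so the call is guarded by a platform check.

```r
# Hypothetical helper: TRUE if the path contains any byte outside ASCII
has_non_ascii <- function(path) any(as.integer(charToRaw(path)) > 127L)

# Hypothetical helper: return a path that avoids non-ASCII characters
# where the platform offers a way to do so
safe_registry_path <- function(path) {
  if (.Platform$OS.type == "windows" && has_non_ascii(path)) {
    # shortPathName exists on Windows only; the branch is never
    # evaluated on other platforms
    utils::shortPathName(path)
  } else {
    path
  }
}

has_non_ascii("C:/Users/M\u00fcller/registry")   # TRUE
has_non_ascii("C:/Users/Mueller/registry")       # FALSE
safe_registry_path("/opt/cwb/registry")          # returned unchanged
```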

Usage of the temporary registry directory can be suppressed by setting the environment variable POLMINER_USE_TMP_REGISTRY to 'false'. In this case, the registry function will return the value of the environment variable CORPUS_REGISTRY unchanged. The data_dir function will return the "indexed_corpus" directory that is assumed to live in the same parent directory as the registry directory.
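
The resolution logic described in this paragraph might look roughly like the sketch below. The function resolve_dirs is hypothetical and not part of polmineR, the exact comparison against "false" is an assumption, and the path /opt/cwb/registry is a made-up value. The names polmineR_registry and polmineR_data_dir are taken from the example output on this page.

```r
# Hypothetical helper mirroring the behaviour described above
resolve_dirs <- function() {
  if (identical(Sys.getenv("POLMINER_USE_TMP_REGISTRY"), "false")) {
    # Temporary registry suppressed: use CORPUS_REGISTRY as is and assume
    # the data directory is a sibling named "indexed_corpus"
    registry <- Sys.getenv("CORPUS_REGISTRY")
    list(
      registry = registry,
      data_dir = file.path(dirname(registry), "indexed_corpus")
    )
  } else {
    # Default: directories within the per-session temporary directory
    list(
      registry = file.path(tempdir(), "polmineR_registry"),
      data_dir = file.path(tempdir(), "polmineR_data_dir")
    )
  }
}

# Remember current values so they can be restored afterwards
old <- Sys.getenv(c("POLMINER_USE_TMP_REGISTRY", "CORPUS_REGISTRY"), unset = NA)

Sys.setenv(
  POLMINER_USE_TMP_REGISTRY = "false",
  CORPUS_REGISTRY = "/opt/cwb/registry"  # made-up path
)
dirs <- resolve_dirs()
dirs$registry  # "/opt/cwb/registry"
dirs$data_dir  # "/opt/cwb/indexed_corpus"

# Restore the previous state of both environment variables
for (name in names(old)) {
  if (is.na(old[[name]])) {
    Sys.unsetenv(name)
  } else {
    do.call(Sys.setenv, as.list(old[name]))
  }
}
```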

Examples

registry() # returns temporary registry directory
#> [1] "/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/Rtmp2XqUSi/polmineR_registry"
registry(pkg = "polmineR") # returns registry directory in polmineR-package
#> [1] "/Users/runner/work/_temp/Library/polmineR/extdata/cwb/registry"
data_dir()
#> [1] "/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/Rtmp2XqUSi/polmineR_data_dir"
data_dir(pkg = "polmineR")
#> [1] "/Users/runner/work/_temp/Library/polmineR/extdata/cwb/indexed_corpora"