The function returns a character vector with the character sets (charsets) supported by the Corpus Workbench (CWB). The vector is derived from the CorpusCharset object defined in the header file of the corpus library (CL).
cwb_charsets()
Early versions of the CWB were developed for "latin1"; support for "utf8" was introduced with CWB v3.2. Note that RcppCWB is tested only for "latin1" and "utf8", and that R by convention uses "UTF-8" rather than "utf8" (CWB).
cwb_charsets()
#> [1] "ascii" "latin1" "latin2" "latin3" "latin4" "cyrillic"
#> [7] "arabic" "greek" "hebrew" "latin5" "latin6" "latin7"
#> [13] "latin8" "latin9" "utf8"
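The naming difference between CWB ("utf8") and R ("UTF-8") matters when a charset name reported by CWB is passed on to R functions such as iconv(). The following is a minimal sketch of one way to bridge the two conventions. The helper cwb_charset_to_r() is hypothetical and not part of RcppCWB, and it deliberately covers only the two charsets RcppCWB is tested for.

```r
# Hypothetical helper (an assumption for illustration, not an RcppCWB function):
# translate a CWB charset name into the encoding name R uses by convention.
cwb_charset_to_r <- function(charset) {
  stopifnot(is.character(charset), length(charset) == 1L)
  # Only the charsets RcppCWB is tested for are mapped here.
  known <- c(latin1 = "latin1", utf8 = "UTF-8")
  if (!charset %in% names(known)) {
    stop("no mapping defined for CWB charset: ", charset)
  }
  unname(known[charset])
}

cwb_charset_to_r("utf8")
#> [1] "UTF-8"

# Example use: convert a latin1-encoded string to UTF-8 with base R's iconv().
x <- "caf\xe9"
x_utf8 <- iconv(x, from = cwb_charset_to_r("latin1"), to = "UTF-8")
```

Raising an error for unmapped charsets, rather than passing the name through, is a design choice of this sketch: it avoids silently handing iconv() a name it may not recognise.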