NEWS.md
- `germaparl_download_lda()` now checks md5 sums when downloading data.
- If `germaparl_download_lda()` fails, you will now see an informative message and the return value will be `FALSE`.
- `germaparl_by_lp`
and `germaparl_by_year` were included as `data.table` objects, making the presence of the `data.table` package necessary. To reduce the number of imported packages and to avoid an error that emerged on Windows, these tables are now included as `data.frame` objects.
- The documentation for `germaparl_by_lp` and `germaparl_by_year` now includes an explanation of what is reported in rows and columns.
- The `germaparl_by_year`
table now includes the columns `unknown_total` and `unknown_share` with the total number of tokens that cannot be lemmatized, and their share, respectively. On this basis, an error in the calculation of the aggregate unknown share for all years can be corrected.
- `germaparl_encode_speeches()`
and `germaparl_encode_lda_topics()` have been moved to the (GitHub-only) polmineR.misc package. These are higher-level functions that rely on polmineR classes and methods. Keeping them in the GermaParl package would require making polmineR a dependency of GermaParl. But as GermaParl is designed to become a dependency of polmineR, we prevent a circular dependency by removing the functions. What is more, both functions were designed to augment GermaParl, but their essence is more generic. In the long run, a cwbtools.misc package (to be created) might be the most logical place for generic functionality to augment corpora.
- New argument `sample`
that defaults to `FALSE`. If set to `TRUE`, functionality to retrieve information from the corpus or to modify the corpus will be applied to the smaller GERMAPARLSAMPLE corpus rather than the GERMAPARL corpus.
- The functionality of `germaparl_download_corpus()`
to download the corpus has been moved to cwbtools (v0.2.0). The `germaparl_download_corpus()` function is now a convenience wrapper for `cwbtools::corpus_install()` that ensures that the correct DOI (argument `doi`) is passed to `corpus_install()`.
- The `germaparl_download_corpus()` function has been reworked accordingly. It now takes the argument `doi`.
- The `GermaParl`
R6 class has been dropped from the package. The main method of the class (`$summary()`) is superfluous, as the `size()` method of the polmineR package produces the same output (a `data.table` reporting the sizes of subcorpora according to an `s_attribute`).
- `germaparl_add_p_attribute_stem()`
has been removed from the package. The functionality (adding a new p-attribute with word stems) makes sense, but the implementation should be generic and included in the cwbtools package.
- The `germaparl_regdir()` function has been removed. It is no longer necessary.
- The `germaparl_search_speeches()` function has been removed from the package. The functionality is nice, but there should either be a generic implementation in the polmineR package, or it might be offered as a recipe.
- `germaparl_download_lda`
examples.
- `germaparl_load_lda` will now return a `NULL` object (instead of throwing an error) if the LDA model is not present.
- `germaparl_encode_lda_topics` will now issue a warning (instead of crashing) if the s-attribute ‘speeches’ has not yet been generated.
- `germaparl_download_corpus()` for downloading the full corpus.