New RcppCWB version v0.2.5 released.

The new RcppCWB version v0.2.5 is a maintanence release with an update of Windows configuration scripts to meet requirements of an upcoming version of Rtools. A big thanks to Jeroen Ooms for his great support!

On this occassion, some more words on the package: We have not yet announced the initial release of the RcppCWB package at this blog! The package name is awkward and technical at the same time. Yet it is telling what the package is about: It is a wrapper for the Corpus Workbench (CWB) using the Rcpp package.

The claim of Rcpp is that it offers “a seamless integration of R and C++”. This is perfectly true. Still, developing RcppCWB has been a fairly demanding package-building exercise, because it was my first leap into the world of C and C++. I am very happy that it worked out and that we have a package now exposing CWB functionality from R.

Originally, the intention was to simply have a portable follow-up, installable on Windows, to the pioneering rcqp package. The rcqp package was perfect for Linux and macOS users, but due to its design, it would not compile on Windows. In June 2018, rcqp was moved to the CRAN archives. Due to the importance of an CWB interface for the polmineR package, I am extremely happy that I had started to invest time and energy in the RcppCWB package earlier on.

Why do I keep relying on the CWB? The CWB is a classical tool for corpus analysis which has inspired a set of developments. A persisting advantage of the CWB is its mature, open source code base that is actively maintained by a community of developers. It is used as a robust and efficient backend for widely used tools such as TXM or CQPweb. Its uncompromising C implementation guarantees speed and makes it well suited to be integrated with R at the same time.

RcppCWB makes CWB functionality accessible in two steps. First, static libraries for the genuine CWB code are compiled for Linux and macOS; for building RcppCWB on Windows, cross-compiled binaries of these libraries are downloaded from the PolMine presence at GitHub. Second, a set of C++/RcppCWB wrapper functions expose the CWB functionality. Working with these two layers keeps the code that has to be maintained manageable, and makes the whole package portable. So this pushes the door open for the Windows community of potential polmineR users: You can simply download RcppCWB binaries from CRAN: No complicated installation, very easy indeed.

If you want to learn more, please consult the README explaining the RcppCWB package at GitHub. The package manual offers an introduction, see the pdf manual at CRAN, or the GitHub pages website of RcppCWB.

Using RcppCWB for a couple of months now in our project, we think it is stable. However, if you have any kind of feedback, please use GitHub issues, or get in touch directly.