The structural attributes of a corpus (s-attributes) can be used to generate subcorpora (i.e. objects of class subcorpus) by applying the subset-method. The subset-method can be applied to a corpus represented by a corpus object, to a length-one character vector naming the corpus (as a shortcut), and to a subcorpus object.
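The three entry points named above can be sketched as follows. This is a minimal illustration, not a definitive recipe: it assumes that polmineR is installed, that the GERMAPARLMINI demo corpus is available, and that calling `use("polmineR")` is an appropriate way to activate it in your version of the package. The object names are hypothetical.

```r
# Runs only if polmineR is installed; otherwise the block is skipped.
if (requireNamespace("polmineR", quietly = TRUE)) {
  library(polmineR)
  use("polmineR")  # assumption: activates the demo corpora shipped with the package

  # 1. corpus object -> subcorpus
  merkel <- subset(corpus("GERMAPARLMINI"), speaker == "Angela Dorothea Merkel")

  # 2. length-one character vector as a shortcut for the corpus
  merkel_short <- subset("GERMAPARLMINI", speaker == "Angela Dorothea Merkel")

  # 3. subcorpus -> narrower subcorpus
  merkel_day <- subset(merkel, date == "2009-10-28")

  if (interactive()) print(c(size(merkel), size(merkel_day)))
}
```

Subsetting an existing subcorpus narrows the selection step by step, which can be easier to read than one long combined condition.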
# S4 method for corpus
subset(x, subset, regex = FALSE, ...)

# S4 method for character
subset(x, ...)

# S4 method for subcorpus
subset(x, subset, ...)

# S4 method for remote_corpus
subset(x, subset)
Argument | Description |
---|---|
x | A |
subset | A |
regex | A |
... | An expression that will be used to create a subcorpus from s-attributes. |
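The difference between exact matching and pattern matching on s-attribute values can be illustrated in base R, without polmineR. The sketch below uses a hypothetical vector of speaker names. Combining several patterns into one alternation is an assumption made for illustration only; it is not a statement about how the package handles multiple values when `regex = TRUE`.

```r
# Hypothetical speaker values standing in for an s-attribute
speaker <- c("Angela Dorothea Merkel", "Volker Kauder", "Ronald Pofalla")

# Exact matching: the whole value must be equal to the string
exact <- speaker == "Merkel"

# Pattern matching: the value only needs to contain a match
pattern <- grepl("Merkel", speaker)

# Several patterns combined into one alternation (illustrative assumption)
either <- grepl(paste(c("Merkel", "Kauder"), collapse = "|"), speaker)
```

With exact matching, a partial name such as "Merkel" selects nothing here, whereas the pattern-based variants select every value that contains it.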
The methods applicable to the subcorpus object resulting from subsetting a corpus or subcorpus are described in the documentation of the subcorpus-class. Note that the subset-method can also be applied to textstat-class objects (and objects inheriting from this class).
# examples for standard and non-standard evaluation
a <- corpus("GERMAPARLMINI")

# subsetting a corpus object using non-standard evaluation
sc <- subset(a, speaker == "Angela Dorothea Merkel")
sc <- subset(a, speaker == "Angela Dorothea Merkel" & date == "2009-10-28")
sc <- subset(a, grepl("Merkel", speaker))
sc <- subset(a, grepl("Merkel", speaker) & date == "2009-10-28")

# subsetting corpus specified by character vector
sc <- subset("GERMAPARLMINI", grepl("Merkel", speaker))
sc <- subset("GERMAPARLMINI", speaker == "Angela Dorothea Merkel")
sc <- subset("GERMAPARLMINI", speaker == "Angela Dorothea Merkel" & date == "2009-10-28")
sc <- subset("GERMAPARLMINI", grepl("Merkel", speaker) & date == "2009-10-28")

# subsetting a corpus using the (old) logic of the partition-method
sc <- subset(a, speaker = "Angela Dorothea Merkel")
sc <- subset(a, speaker = "Angela Dorothea Merkel", date = "2009-10-28")
sc <- subset(a, speaker = "Merkel", regex = TRUE)
sc <- subset(a, speaker = c("Merkel", "Kauder"), regex = TRUE)
sc <- subset(a, speaker = "Merkel", date = "2009-10-28", regex = TRUE)

# providing the value for s-attribute as a variable
who <- "Volker Kauder"
sc <- subset(a, quote(speaker == who))
#> Error in eval(stub[[3L]], x, enclos): object 'who' not found

# use bquote for quasiquotation when using a variable for subsetting in a loop
for (who in c("Angela Dorothea Merkel", "Volker Kauder", "Ronald Pofalla")){
  sc <- subset(a, bquote(speaker == .(who)))
  if (interactive()) print(size(sc))
}
#> Error in .checkTypos(e, names_x): Object 'who' not found amongst struc, cpos_left, cpos_right, speaker

# equivalent procedure with lapply (DOES NOT WORK YET)
b <- lapply(
  c("Angela Dorothea Merkel", "Volker Kauder", "Ronald Pofalla"),
  function(who) subset(a, bquote(speaker == .(who)))
)
#> Error in .checkTypos(e, names_x): Object 'who' not found amongst struc, cpos_left, cpos_right, speaker
#> Error in lapply(X = X, FUN = FUN, ...): object 'b' not found
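The difference between `quote()` and `bquote()` in the examples above can be reproduced in base R alone. The sketch below is an illustration under stated assumptions, not a description of the package's internals: the `regions` data frame is a hypothetical stand-in for a table of s-attribute values, and the expressions are evaluated in a scope that deliberately cannot see the variable `who`.

```r
who <- "Volker Kauder"

# quote() keeps the symbol `who`; bquote() substitutes its current value
e_quoted  <- quote(speaker == who)
e_bquoted <- bquote(speaker == .(who))

# Hypothetical table standing in for s-attribute values
regions <- data.frame(
  speaker = c("Angela Dorothea Merkel", "Volker Kauder", "Ronald Pofalla"),
  date = c("2009-10-28", "2009-11-10", "2009-11-10"),
  stringsAsFactors = FALSE
)

# Evaluate against the table only; baseenv() supplies `==` but not `who`
keep <- eval(e_bquoted, regions, baseenv())
selected <- regions[keep, ]

# The quote() version fails in the same scope because `who` is not a column
quoted_fails <- inherits(
  try(eval(e_quoted, regions, baseenv()), silent = TRUE),
  "try-error"
)
```

Because `bquote()` embeds the value in the expression itself, the result no longer depends on where the expression is evaluated, which is what makes it suitable for loops and for functions passed to `lapply()`.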