Get hits for query (hits, polmineR)

Get hits for queries, optionally with s-attribute values.

hits(.Object, ...)

# S4 method for corpus
hits(
  .Object,
  query,
  cqp = FALSE,
  check = TRUE,
  s_attribute = NULL,
  p_attribute = "word",
  size = FALSE,
  freq = FALSE,
  mc = 1L,
  verbose = TRUE,
  progress = FALSE,
  ...
)

# S4 method for character
hits(
  .Object,
  query,
  cqp = FALSE,
  check = TRUE,
  s_attribute = NULL,
  p_attribute = "word",
  size = FALSE,
  freq = FALSE,
  mc = FALSE,
  verbose = TRUE,
  progress = TRUE,
  ...
)

# S4 method for slice
hits(
  .Object,
  query,
  cqp = FALSE,
  s_attribute = NULL,
  p_attribute = "word",
  size = FALSE,
  freq = FALSE,
  mc = FALSE,
  progress = FALSE,
  verbose = TRUE,
  ...
)

# S4 method for subcorpus
hits(
  .Object,
  query,
  cqp = FALSE,
  s_attribute = NULL,
  p_attribute = "word",
  size = FALSE,
  freq = FALSE,
  mc = FALSE,
  progress = FALSE,
  verbose = TRUE,
  ...
)

# S4 method for partition
hits(
  .Object,
  query,
  cqp = FALSE,
  s_attribute = NULL,
  p_attribute = "word",
  size = FALSE,
  freq = FALSE,
  mc = FALSE,
  progress = FALSE,
  verbose = TRUE,
  ...
)

# S4 method for partition_bundle
hits(
  .Object,
  query,
  cqp = FALSE,
  check = TRUE,
  p_attribute = getOption("polmineR.p_attribute"),
  s_attribute = NULL,
  size = TRUE,
  freq = FALSE,
  mc = getOption("polmineR.mc"),
  progress = FALSE,
  verbose = TRUE,
  ...
)

# S4 method for context
hits(.Object, s_attribute = NULL, verbose = TRUE, ...)

Arguments

.Object	A length-one `character` vector with a corpus ID, or a `corpus`, `subcorpus`, `partition`, `partition_bundle` or `context` object.
...	Further arguments (used for backwards compatibility).
query	A `character` vector (optionally named, see details) with one or more queries.
cqp	Either a `logical` value (`TRUE` if query is a CQP query), or a function to check whether `query` is a CQP query or not.
check	A `logical` value, whether to check the validity of a CQP query using `check_cqp_query`.
s_attribute	A `character` vector of s-attributes that will be reported as metadata.
p_attribute	A `character` vector stating a p-attribute.
size	A `logical` value, whether to report the size of the subcorpus.
freq	A `logical` value, whether to report relative frequencies.
mc	A `logical` value, whether to use multicore processing (note that the `corpus` method defaults to the integer `1L`).
verbose	A `logical` value, whether to output messages.
progress	A `logical` value, whether to show a progress bar.
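
The `cqp` and `check` arguments can be illustrated with a minimal sketch. It assumes the REUTERS demo corpus that `use("polmineR")` activates; the CQP query string itself is only an illustration and is not taken from the package documentation.

```r
library(polmineR)
use("polmineR")  # activates the demo corpora shipped with the package, including REUTERS

# Plain query: with cqp = FALSE (the default) the string is not treated as CQP syntax.
y_plain <- hits("REUTERS", query = "oil")

# CQP query: the token "oil" followed by a token matching the regular expression "price.*".
# check = TRUE (the default) asks for the query to be validated via check_cqp_query.
y_cqp <- hits("REUTERS", query = '"oil" "price.*"', cqp = TRUE)
```

Single quotes around the whole query keep the double quotes that CQP syntax needs for each token intact.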

Details

If the `character` vector provided by `query` is named, these names, rather than the queries themselves, will be reported in the `data.table` that is returned.
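
A named query vector might look as follows. This is a sketch only: the names `petroleum` and `unit` are hypothetical labels chosen for illustration, and the REUTERS demo corpus is assumed to be available.

```r
library(polmineR)
use("polmineR")  # activates the demo corpora shipped with the package, including REUTERS

# The names "petroleum" and "unit" are hypothetical labels, not part of the package.
queries <- c(petroleum = "oil", unit = "barrel")

# Per the description above, the names are expected to be reported
# in place of the query strings "oil" and "barrel".
y <- hits("REUTERS", query = queries, s_attribute = "places")
```

Naming queries is mainly useful when the query strings are long or cryptic, as CQP queries often are.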

If `freq` is `TRUE`, the `data.table` returned in the DT-slot will deliberately include the subsets of the partition/corpus with no hits (query is `NA`, count is 0).
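
The effect of keeping zero-hit subsets can be sketched in base R with toy numbers. Everything here is hypothetical: the table `toy` and its values are invented for illustration, and the sketch assumes that a relative frequency is the hit count divided by the size of the subset.

```r
# Hypothetical per-subset counts and sizes; not taken from any real corpus.
toy <- data.frame(
  places = c("a", "b", "c"),
  count  = c(4L, 0L, 10L),   # subset "b" has no hits but is kept
  size   = c(200L, 50L, 1000L)
)

# Assumption: relative frequency = hit count / subset size.
toy$freq <- toy$count / toy$size
print(toy)
```

Keeping the zero-count row means every subset has a frequency, so the result can be compared or plotted across all subsets without gaps.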

Examples

use("polmineR")
#> ... activating corpus: GERMAPARLMINI (version: 0.0.1 | build date: 2019-02-23)
#> ... activating corpus: REUTERS

# get hits for corpus object
y <- corpus("REUTERS") %>% hits(query = "oil")
y <- corpus("REUTERS") %>% hits(query = c("oil", "barrel"))
y <- corpus("REUTERS") %>% hits(query = "oil", s_attribute = "places", freq = TRUE)
#> ... getting sizes
#> ... frequencies

# specify corpus by corpus ID
y <- hits("REUTERS", query = "oil")
y <- hits("REUTERS", query = "oil", s_attribute = "places", freq = TRUE)
#> ... getting sizes
#> ... frequencies

# get hits for partition
p <- partition("REUTERS", places = "saudi-arabia", regex = TRUE)
#> ... get encoding: latin1
#> ... get cpos and strucs
y <- hits(p, query = "oil")
y <- hits(p, query = "oil", s_attribute = "id")

# get hits for subcorpus
y <- corpus("REUTERS") %>%
  subset(grep("saudi-arabia", places)) %>%
  hits(query = "oil")