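Get a matrix of moving windows of token ids for the regions of a corpus. Each row holds one window: the node token in the middle column, with its left and right context in the columns on either side. Negative integer values indicate the absence of a token at the respective position.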

get_cbow_matrix(
  corpus,
  p_attribute,
  registry = Sys.getenv("CORPUS_REGISTRY"),
  matrix,
  window
)

Arguments

corpus

the ID of a CWB corpus (character, e.g. "REUTERS")

p_attribute

a positional attribute of the corpus (character, e.g. "word")

registry

the registry directory; defaults to the value of the CORPUS_REGISTRY environment variable

matrix

a matrix of corpus regions, as returned by get_region_matrix()

window

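window size (integer): the number of tokens to the left and to the right of the node, so that the result has 2 * window + 1 columns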

Examples

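# Get the regions (corpus positions) for the first six "places" structures
m <- get_region_matrix(
  corpus = "REUTERS", s_attribute = "places",
  strucs = 0L:5L, registry = get_tmp_registry()
)

# Build the moving windows: three tokens to either side of each node
windowsize <- 3L
m2 <- get_cbow_matrix(
  corpus = "REUTERS", p_attribute = "word",
  registry = get_tmp_registry(), matrix = m, window = windowsize
)

# Label the columns by their offset from the node
colnames(m2) <- c(-windowsize:-1, "node", 1:windowsize)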
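
The layout of the result can be illustrated without a CWB installation. The following is a minimal sketch in plain R, not the package's implementation: cbow_sketch() is a hypothetical helper, the token ids and regions are made-up toy data, and it assumes that windows are cut off at region boundaries and that absent positions are filled with -1.

```r
# Hypothetical helper (not part of the package): emulate the window layout.
# ids:     integer vector of token ids; corpus position 0 is ids[1]
# regions: two-column matrix of start and end corpus positions (0-based, inclusive)
# window:  number of tokens to the left and right of the node (>= 1)
cbow_sketch <- function(ids, regions, window) {
  ids <- as.integer(ids)
  offsets <- -window:window
  rows <- lapply(seq_len(nrow(regions)), function(i) {
    from <- regions[i, 1]
    to <- regions[i, 2]
    # One window per corpus position in the region
    t(vapply(from:to, function(cpos) {
      ctx <- cpos + offsets
      # Assumption: positions outside the region count as absent
      inside <- ctx >= from & ctx <= to
      out <- rep(-1L, length(offsets))
      out[inside] <- ids[ctx[inside] + 1L]
      out
    }, integer(length(offsets))))
  })
  m <- do.call(rbind, rows)
  colnames(m) <- c(-window:-1, "node", 1:window)
  m
}

# Toy data: six tokens in two regions (corpus positions 0-2 and 3-5)
ids <- c(10L, 11L, 12L, 13L, 14L, 15L)
regions <- matrix(c(0L, 2L, 3L, 5L), ncol = 2, byrow = TRUE)
m <- cbow_sketch(ids, regions, window = 1L)
print(m)
```

With a window of 1, the first row is -1, 10, 11: the first token of a region has no left neighbour, so that cell carries the negative filler value.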