Get all regions defined by a structural attribute. Unlike
get_region_matrix(),
which returns a region matrix for a defined subset of
strucs, this function returns all regions. Because it is the fastest option, the function
reads the binary *.rng file for the structural attribute directly. The corpus
library (CL) is not used in this case.
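To illustrate what reading the *.rng file directly can look like, here is a minimal sketch in base R. It assumes, without confirmation from this page, that the file is a flat sequence of 4-byte big-endian integers in which left and right corpus positions alternate. The temporary file and its values are made up for the illustration and stand in for a real `<data_dir>/<s_attr>.rng` file; this is not presented as the package's actual implementation.

```r
# Hypothetical stand-in for <data_dir>/<s_attr>.rng, written to a temp file.
rng_file <- tempfile(fileext = ".rng")

# Assumption: regions are stored as consecutive (left, right) pairs of
# 4-byte big-endian integers. Write three toy regions in that layout.
writeBin(c(0L, 91L, 92L, 535L, 536L, 590L), rng_file, size = 4L, endian = "big")

# The number of integers follows from the file size (4 bytes per integer).
n_ints <- file.info(rng_file)[["size"]] / 4L

# Read all integers at once and fold them into a two-column matrix,
# one row per region.
values <- readBin(rng_file, what = integer(), n = n_ints, size = 4L, endian = "big")
regions <- matrix(values, ncol = 2L, byrow = TRUE)
regions
```

Reading the whole file in one `readBin()` call avoids any per-region lookup, which is one plausible reason why direct file access would be fast.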
s_attr_regions(
  corpus,
  s_attr,
  registry = Sys.getenv("CORPUS_REGISTRY"),
  data_dir = corpus_data_dir(corpus = corpus, registry = registry)
)
corpus: A length-one character vector with a corpus ID.
s_attr: A length-one character vector stating a structural attribute.
registry: A length-one character vector stating the registry
directory (defaults to the CORPUS_REGISTRY environment variable).
data_dir: The data directory of the corpus.
A two-column matrix
with the regions defined by the structural
attribute: column 1 holds the left corpus positions and column 2 the right corpus
positions of the regions.
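As a sketch of how such a matrix can be used, the snippet below builds a small matrix by hand (the first three rows of the REUTERS output documented on this page) and derives region sizes and the corpus positions covered by one region. The variable names are illustrative only, and the size calculation assumes that both boundaries are inclusive.

```r
# Hand-built stand-in for the return value: one row per region,
# column 1 = left corpus position, column 2 = right corpus position.
regions <- matrix(c(0L, 91L, 92L, 535L, 536L, 590L), ncol = 2L, byrow = TRUE)

# Number of tokens per region, assuming inclusive boundaries.
region_sizes <- regions[, 2] - regions[, 1] + 1L
region_sizes  # 92 444 55

# All corpus positions covered by the third region.
cpos <- regions[3, 1]:regions[3, 2]
length(cpos)  # 55
```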
s_attr_regions("REUTERS", s_attr = "id", registry = get_tmp_registry())
#>       [,1] [,2]
#>  [1,]    0   91
#>  [2,]   92  535
#>  [3,]  536  590
#>  [4,]  591  659
#>  [5,]  660  752
#>  [6,]  753 1217
#>  [7,] 1218 1651
#>  [8,] 1652 1815
#>  [9,] 1816 2146
#> [10,] 2147 2495
#> [11,] 2496 2873
#> [12,] 2874 2965
#> [13,] 2966 3070
#> [14,] 3071 3173
#> [15,] 3174 3283
#> [16,] 3284 3432
#> [17,] 3433 3631
#> [18,] 3632 3714
#> [19,] 3715 3996
#> [20,] 3997 4049