Get concordances for the matches for a query / perform keyword-in-context (kwic) analysis.
kwic(.Object, ...)

# S4 method for context
kwic(
  .Object,
  s_attributes = getOption("polmineR.meta"),
  cpos = TRUE,
  verbose = FALSE,
  ...
)

# S4 method for slice
kwic(
  .Object,
  query,
  cqp = is.cqp,
  left = getOption("polmineR.left"),
  right = getOption("polmineR.right"),
  s_attributes = getOption("polmineR.meta"),
  p_attribute = "word",
  boundary = NULL,
  cpos = TRUE,
  stoplist = NULL,
  positivelist = NULL,
  regex = FALSE,
  verbose = TRUE,
  ...
)

# S4 method for partition
kwic(
  .Object,
  query,
  cqp = is.cqp,
  left = getOption("polmineR.left"),
  right = getOption("polmineR.right"),
  s_attributes = getOption("polmineR.meta"),
  p_attribute = "word",
  boundary = NULL,
  cpos = TRUE,
  stoplist = NULL,
  positivelist = NULL,
  regex = FALSE,
  verbose = TRUE,
  ...
)

# S4 method for subcorpus
kwic(
  .Object,
  query,
  cqp = is.cqp,
  left = getOption("polmineR.left"),
  right = getOption("polmineR.right"),
  s_attributes = getOption("polmineR.meta"),
  p_attribute = "word",
  boundary = NULL,
  cpos = TRUE,
  stoplist = NULL,
  positivelist = NULL,
  regex = FALSE,
  verbose = TRUE,
  ...
)

# S4 method for corpus
kwic(
  .Object,
  query,
  cqp = is.cqp,
  check = TRUE,
  left = as.integer(getOption("polmineR.left")),
  right = as.integer(getOption("polmineR.right")),
  s_attributes = getOption("polmineR.meta"),
  p_attribute = "word",
  boundary = NULL,
  cpos = TRUE,
  stoplist = NULL,
  positivelist = NULL,
  regex = FALSE,
  verbose = TRUE,
  progress = TRUE,
  ...
)

# S4 method for character
kwic(
  .Object,
  query,
  cqp = is.cqp,
  check = TRUE,
  left = as.integer(getOption("polmineR.left")),
  right = as.integer(getOption("polmineR.right")),
  s_attributes = getOption("polmineR.meta"),
  p_attribute = "word",
  boundary = NULL,
  cpos = TRUE,
  stoplist = NULL,
  positivelist = NULL,
  regex = FALSE,
  verbose = TRUE,
  progress = TRUE,
  ...
)

# S4 method for remote_corpus
kwic(.Object, ...)

# S4 method for remote_partition
kwic(.Object, ...)

# S4 method for remote_subcorpus
kwic(.Object, ...)

# S4 method for partition_bundle
kwic(.Object, ..., verbose = FALSE)

# S4 method for subcorpus_bundle
kwic(.Object, ...)

Argument | Description |
---|---|
.Object | A (length-one) |
... | Further arguments, used to ensure backwards compatibility. If |
s_attributes | Structural attributes (s-attributes) to include in the output table as metadata. |
cpos | Logical, if |
verbose | A |
query | A query, CQP-syntax can be used. |
cqp | Either a logical value ( |
left | Number of tokens to the left of query match. |
right | Number of tokens to the right of query match. |
p_attribute | The p-attribute, defaults to 'word'. |
boundary | If provided, a length-one character vector stating an s-attribute that will be used to check the boundaries of the text. |
stoplist | Terms or ids to prevent a concordance from occurring in results. |
positivelist | Terms or ids required for a concordance to occur in results. |
regex | Logical, whether |
check | A |
progress | A |
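
To make the roles of `query`, `left` and `right` concrete, here is a minimal base R sketch of keyword-in-context extraction on a plain token vector. It is an illustration only: `toy_kwic()` is a hypothetical helper invented for this example, it does not use a CWB corpus, and it is not how polmineR implements `kwic()`.

```r
# Toy keyword-in-context extraction on a plain token vector (base R only).
# toy_kwic() is a hypothetical helper, not part of polmineR.
toy_kwic <- function(tokens, query, left = 2L, right = 2L) {
  hits <- which(tokens == query)
  if (length(hits) == 0L) return(NULL)  # no matches: nothing to show
  rows <- lapply(hits, function(i) {
    # Clip each window at the start and end of the token stream.
    l_idx <- if (left > 0L && i > 1L) seq(max(1L, i - left), i - 1L) else integer(0)
    r_idx <- if (right > 0L && i < length(tokens)) {
      seq(i + 1L, min(length(tokens), i + right))
    } else {
      integer(0)
    }
    data.frame(
      left = paste(tokens[l_idx], collapse = " "),
      node = tokens[i],
      right = paste(tokens[r_idx], collapse = " "),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

tokens <- c("the", "price", "of", "crude", "oil", "rose", "as", "oil", "exports", "fell")
k <- toy_kwic(tokens, "oil", left = 2L, right = 2L)
print(k)
```

Each row holds one match with its left and right context, which is the general shape of a concordance table.
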
If there are no matches, or if all (initial) matches are dropped due to the application of a positivelist, NULL is returned.
The method works with a whole CWB corpus defined by a character vector, and can also be applied to a partition or a context object.
If a positivelist is supplied, only those concordances are kept in which at least one of the terms from the positivelist occurs in the context of the query match. Use the argument regex if the positivelist should be interpreted as regular expressions. Tokens from the positivelist will be highlighted in the output table.
If a stoplist is supplied, concordances are removed if any of the tokens of the stoplist occurs in the context of the query match.
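
The filtering rules described above can be sketched in a few lines of base R. This is a simplified illustration under stated assumptions: `filter_contexts()` is a hypothetical helper, each context is a plain character vector of tokens, and regular expressions are matched with `grepl()`. How polmineR performs the matching internally is not shown here.

```r
# Hypothetical helper: keep or drop concordance lines based on their context tokens.
# Not part of polmineR; it only mirrors the positivelist/stoplist rules described above.
filter_contexts <- function(contexts, positivelist = NULL, stoplist = NULL, regex = FALSE) {
  # TRUE if any token of one context matches any of the terms.
  has_term <- function(tokens, terms) {
    if (regex) {
      any(vapply(terms, function(p) any(grepl(p, tokens)), logical(1)))
    } else {
      any(tokens %in% terms)
    }
  }
  keep <- rep(TRUE, length(contexts))
  if (!is.null(positivelist)) {
    keep <- keep & vapply(contexts, has_term, logical(1), terms = positivelist)
  }
  if (!is.null(stoplist)) {
    keep <- keep & !vapply(contexts, has_term, logical(1), terms = stoplist)
  }
  if (!any(keep)) return(NULL)  # everything dropped: mirror the NULL return value
  contexts[keep]
}

contexts <- list(
  c("european", "integration", "policy"),
  c("integration", "of", "migrants"),
  c("integration", "in", "Europe")
)
kept_positive <- filter_contexts(contexts, positivelist = "[Ee]urop.*", regex = TRUE)
kept_stop <- filter_contexts(contexts, stoplist = "migrants")
none_left <- filter_contexts(contexts, positivelist = "asylum")
```

In this toy data, the positivelist and the stoplist happen to select the same two contexts, and a positivelist that matches nothing yields NULL.
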
Applying the kwic-method on a partition_bundle or subcorpus_bundle will return a single kwic object that includes a column 'subcorpus_name' with the name of the subcorpus (or partition) in the input object where the match for a concordance occurs.
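
As a rough picture of the combined result, the following base R sketch stacks per-subcorpus tables into one table with a 'subcorpus_name' column and counts the hits per subcorpus. The data and the names `per_subcorpus`, `combined` and `hits_per_subcorpus` are made up for illustration; the actual kwic object stores more information than this.

```r
# Made-up concordance lines for two hypothetical subcorpora (one per speaker).
per_subcorpus <- list(
  "Speaker A" = data.frame(
    left = "for better", node = "Integration", right = "of migrants",
    stringsAsFactors = FALSE
  ),
  "Speaker B" = data.frame(
    left = c("European", "social"), node = "Integration",
    right = c("needs time", "is key"),
    stringsAsFactors = FALSE
  )
)

# Stack the tables, recording which subcorpus each concordance line came from.
combined <- do.call(rbind, lapply(names(per_subcorpus), function(nm) {
  data.frame(subcorpus_name = nm, per_subcorpus[[nm]], stringsAsFactors = FALSE)
}))

# Count the concordance lines per subcorpus.
hits_per_subcorpus <- table(combined$subcorpus_name)
print(hits_per_subcorpus)
```
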
Baker, Paul (2006): Using Corpora in Discourse Analysis. London: Continuum, pp. 71-93 (ch. 4).
Jockers, Matthew L. (2014): Text Analysis with R for Students of Literature. Cham et al.: Springer, pp. 73-87 (chs. 8 & 9).
The return value is a kwic-class object; the documentation for the class explains the standard generic methods applicable to kwic-class objects. To read the whole text in which a query match occurs, see the read-method. To highlight terms in the context of a query match, see the highlight-method.
# load the package and activate the sample corpora it ships with
library(polmineR)
use("polmineR")

# basic usage
K <- kwic("GERMAPARLMINI", "Integration")
if (interactive()) show(K)

oil <- corpus("REUTERS") %>% kwic(query = "oil")
if (interactive()) show(oil)

oil <- corpus("REUTERS") %>%
  kwic(query = "oil") %>%
  highlight(yellow = "crude")
if (interactive()) show(oil)

# increase left and right context and display metadata
K <- kwic(
  "GERMAPARLMINI",
  "Integration",
  left = 20, right = 20,
  s_attributes = c("date", "speaker", "party")
)
if (interactive()) show(K)

# use CQP syntax for matching
K <- kwic(
  "GERMAPARLMINI",
  '"Integration" [] "(Menschen|Migrant.*|Personen)"',
  cqp = TRUE,
  left = 20, right = 20,
  s_attributes = c("date", "speaker", "party")
)
if (interactive()) show(K)

# check that boundary of region is not transgressed
K <- kwic(
  "GERMAPARLMINI",
  '"Sehr" "geehrte"',
  cqp = TRUE,
  left = 100, right = 100,
  boundary = "date"
)
if (interactive()) show(K)

# use positivelist and highlight matches in context
K <- kwic(
  "GERMAPARLMINI",
  query = "Integration",
  positivelist = "[Ee]urop.*",
  regex = TRUE
)
K <- highlight(K, yellow = "[Ee]urop.*", regex = TRUE)

# apply kwic on partition_bundle/subcorpus_bundle
gparl_2009_11_10_speeches <- corpus("GERMAPARLMINI") %>%
  subset(date == "2009-11-10") %>%
  as.speeches(s_attribute_name = "speaker", progress = FALSE, verbose = FALSE)
k <- kwic(gparl_2009_11_10_speeches, query = "Integration")