detect noise
    noise(.Object, ...)

    # S4 method for DocumentTermMatrix
    noise(
      .Object,
      minTotal = 2,
      minTfIdfMean = 0.005,
      sparse = 0.995,
      stopwordsLanguage = "german",
      minNchar = 2,
      specialChars = getOption("polmineR.specialChars"),
      numbers = "^[0-9\\.,]+$",
      verbose = TRUE
    )

    # S4 method for TermDocumentMatrix
    noise(.Object, ...)

    # S4 method for character
    noise(
      .Object,
      stopwordsLanguage = "german",
      minNchar = 2,
      specialChars = getOption("polmineR.specialChars"),
      numbers = "^[0-9\\.,]+$",
      verbose = TRUE
    )

    # S4 method for textstat
    noise(.Object, p_attribute, ...)
| Argument | Description |
|---|---|
| .Object | an .Object of class |
| ... | further parameters |
| minTotal | minimum colsum (for DocumentTermMatrix) to qualify a term as non-noise |
| minTfIdfMean | minimum mean value for tf-idf to qualify a term as non-noise |
| sparse | will be passed into |
| stopwordsLanguage | e.g. "german", to get stopwords defined in the tm package |
| minNchar | minimum number of characters for a term to qualify as non-noise |
| specialChars | special characters to drop |
| numbers | regex, to drop numbers |
| verbose | logical |
| p_attribute | relevant if applied to a textstat object |
a list
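
To make the `numbers` and `minNchar` arguments more concrete, the following base R sketch applies the default values from the usage signature to a small character vector. It does not call `noise()` itself and is not the package's implementation: the `terms` vector is invented for illustration, and treating a term as too short when it has fewer than `minNchar` characters is an assumption (the package may draw the boundary differently).

```r
# Hypothetical vector of candidate terms (invented for illustration).
terms <- c("Bundestag", "a", "1.000,50", "2017", "Europa", "3G")

# Default values copied from the usage signature of noise().
numbers  <- "^[0-9\\.,]+$"
minNchar <- 2

# Terms that consist only of digits, dots and commas.
is_number <- grepl(numbers, terms)

# Assumption: a term with fewer than minNchar characters counts as too short.
is_short <- nchar(terms) < minNchar

number_terms <- terms[is_number]
short_terms  <- terms[is_short]

print(number_terms)
print(short_terms)
```

Mixed tokens such as "3G" are not caught by the default `numbers` pattern, because the pattern is anchored at both ends and only permits digits, dots and commas.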