These functions select or discard elements from a character object. For convenience, the functions char_remove and char_keep are defined as shortcuts for char_select(x, pattern, selection = "remove") and char_select(x, pattern, selection = "keep"), respectively.
These functions make it easy to change, for instance, stopwords based on pattern matching.
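The shortcut relationship described above can be checked directly. This is a small sketch that assumes the quanteda package is installed; the vector `words` and the variable names are illustrative only.

```r
library(quanteda)

words <- c("any", "and", "Anna", "as", "announce", "but")

# char_remove() is shorthand for char_select() with selection = "remove"
removed_short <- char_remove(words, "an*")
removed_long  <- char_select(words, "an*", selection = "remove")
identical(removed_short, removed_long)

# char_keep() is shorthand for char_select() with selection = "keep"
kept_short <- char_keep(words, "an*")
kept_long  <- char_select(words, "an*", selection = "keep")
identical(kept_short, kept_long)
```

Because the shortcuts fix the selection argument themselves, only the remaining arguments (such as valuetype or case_insensitive) should be passed through them.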
char_select(
  x,
  pattern,
  selection = c("keep", "remove"),
  valuetype = c("glob", "fixed", "regex"),
  case_insensitive = TRUE
)

char_remove(x, ...)

char_keep(x, ...)
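Argument | Description |
---|---|
x | an input character vector |
pattern | a character vector, list of character vectors, dictionary, or collocations object. See pattern for details. |
selection | whether to "keep" or "remove" the elements that match pattern |
valuetype | the type of pattern matching: "glob" for glob-style wildcard expressions, "regex" for regular expressions, or "fixed" for exact matching |
case_insensitive | logical; if TRUE, ignore case when matching |
... | additional arguments passed by char_remove and char_keep to char_select |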
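To make the interplay of selection, valuetype, and case_insensitive concrete, the following base R sketch approximates the matching behaviour. It is not quanteda's implementation: `select_sketch` is a hypothetical helper, it handles only plain character vector patterns (not lists, dictionaries, or collocations), and it assumes that "fixed" means an exact match of the whole element.

```r
# Hypothetical helper (not part of quanteda): approximates char_select()
# for character vector patterns using base R only.
select_sketch <- function(x, pattern,
                          selection = c("keep", "remove"),
                          valuetype = c("glob", "fixed", "regex"),
                          case_insensitive = TRUE) {
  selection <- match.arg(selection)
  valuetype <- match.arg(valuetype)

  # Test a single pattern p against every element of x
  match_one <- function(p) {
    if (valuetype == "fixed") {
      # Assumption: "fixed" compares whole elements for equality
      if (case_insensitive) tolower(x) == tolower(p) else x == p
    } else {
      # Glob patterns are translated to anchored regular expressions
      rx <- if (valuetype == "glob") utils::glob2rx(p) else p
      grepl(rx, x, ignore.case = case_insensitive)
    }
  }

  # An element is a hit if any one of the patterns matches it
  hit <- Reduce(`|`, lapply(pattern, match_one), rep(FALSE, length(x)))
  if (selection == "keep") x[hit] else x[!hit]
}

mykeywords <- c("natural", "national", "denatured", "other")
glob_hits  <- select_sketch(mykeywords, "nat*", valuetype = "glob")
regex_hits <- select_sketch(mykeywords, "nat", valuetype = "regex")
```

The difference between the two calls comes from anchoring: a glob pattern must match the element from its start, whereas an unanchored regular expression can match anywhere inside it, which is why "denatured" only appears in the regex result.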
a modified character vector
# character selection
mykeywords <- c("natural", "national", "denatured", "other")
char_select(mykeywords, "nat*", valuetype = "glob")
#> [1] "natural" "national"
char_select(mykeywords, "nat", valuetype = "regex")
#> [1] "natural" "national" "denatured"
char_select(mykeywords, c("natur*", "other"))
#> [1] "natural" "other"
char_select(mykeywords, c("natur*", "other"), selection = "remove")
#> [1] "national" "denatured"

# character removal
char_remove(letters[1:5], c("a", "c", "x"))
#> [1] "b" "d" "e"
words <- c("any", "and", "Anna", "as", "announce", "but")
char_remove(words, "an*")
#> [1] "as" "but"
char_remove(words, "an*", case_insensitive = FALSE)
#> [1] "Anna" "as" "but"
char_remove(words, "^.n.+$", valuetype = "regex")
#> [1] "as" "but"

# remove some of the system stopwords
stopwords("en", source = "snowball")[1:6]
#> [1] "i" "me" "my" "myself" "we" "our"
stopwords("en", source = "snowball")[1:6] %>%
  char_remove(c("me", "my*"))
#> [1] "i" "we" "our"

# character keep
char_keep(letters[1:5], c("a", "c", "x"))
#> [1] "a" "c"