Reputation: 269
I have a list
reqr:
chr [1:3] "interpersonal" "communication" "communication and interpersonal"
chr [1:2] "team player" "initiative"
chr [1:2] "mechanical engineering" "written"
How do I split up strings that contain "and", such that
reqr:
chr [1:5] "interpersonal" "communication" "communication" "and" "interpersonal"
chr [1:2] "team player" "initiative"
chr [1:2] "mechanical engineering" "written"
After that, I want every string within each element to be unique, such that
reqr:
chr [1:3] "interpersonal" "communication" "and"
chr [1:2] "team player" "initiative"
chr [1:2] "mechanical engineering" "written"
Upvotes: 0
Views: 2070
Reputation: 43334
Hadley's purrr package can make working with lists less annoying:
library(purrr)
# split each item .x where there's a space with "and" before or after
reqr %>% map(~strsplit(.x, ' (?=and)|(?<=and) ', perl = TRUE)) %>% # alternate form: `map(strsplit, split = ' (?=and)|(?<=and) ', perl = TRUE)`
map(compose(unique, unlist)) # equivalent to `map(unlist) %>% map(unique)` or `simplify_all() %>% map(unique)`
# [[1]]
# [1] "interpersonal" "communication" "and"
#
# [[2]]
# [1] "team player" "initiative"
#
# [[3]]
# [1] "mechanical engineering" "written"
# Data used above
reqr <- list(c("interpersonal", "communication", "communication and interpersonal"),
             c("team player", "initiative"),
             c("mechanical engineering", "written"))
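To see what the lookaround pattern does, it can help to run it on a single string first. This is only a minimal sketch in base R; the names `x` and `pieces` are mine, for illustration:

```r
# One string taken from the question's data
x <- "communication and interpersonal"

# Split on a space that is followed by "and" or preceded by "and".
# The lookarounds do not consume "and", so it survives as its own piece.
pieces <- strsplit(x, " (?=and)|(?<=and) ", perl = TRUE)[[1]]
print(pieces)
```

Only the spaces are consumed by the split, which is why "and" ends up as a separate string that `unique` then treats like any other.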
Upvotes: 3
Reputation: 887078
We can also do this with scan and gsub:
lapply(reqr, function(x) unique(scan(text=gsub(" (and) ", ",\\1,", x),
what = "", sep=",", quiet=TRUE)))
#[[1]]
#[1] "interpersonal" "communication" "and"
#[[2]]
#[1] "team player" "initiative"
#[[3]]
#[1] "mechanical engineering" "written"
NOTE: No external packages used.
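The one-liner above can be unpacked into its steps for a single list element. A rough sketch; the intermediate names `marked`, `fields` and `result` are illustrative, and it assumes the strings themselves contain no commas or quote characters:

```r
x <- c("interpersonal", "communication", "communication and interpersonal")

# Step 1: rewrite " and " as ",and," so that a comma marks every split point
marked <- gsub(" (and) ", ",\\1,", x)

# Step 2: scan() reads the comma-separated fields back as one character vector
fields <- scan(text = marked, what = "", sep = ",", quiet = TRUE)

# Step 3: drop repeated strings, keeping the first occurrence of each
result <- unique(fields)
print(result)
```

Because `sep = ","` is set, spaces inside a field such as "team player" are kept rather than treated as separators.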
Upvotes: 1
Reputation: 214957
You can try this:
lst <- lapply(reqr, function(vec) unique(unlist(strsplit(vec, "\\s(?=and)|(?<=and)\\s", perl = TRUE))))
str(lst)
# List of 3
# $ : chr [1:3] "interpersonal" "communication" "and"
# $ : chr [1:2] "team player" "initiative"
# $ : chr [1:2] "mechanical engineering" "written"
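One caveat: the pattern above also fires on words that merely start or end with "and" (for example "hand tools" or "team android"). If that matters for your data, a possible variant adds word boundaries. This is a sketch, not something from the question; `split_and` and `demo` are hypothetical names:

```r
# Hypothetical helper: same lookaround idea, but \\b restricts the match
# to the standalone word "and"
split_and <- function(vec) {
  unique(unlist(strsplit(vec, "\\s(?=and\\b)|(?<=\\band)\\s", perl = TRUE)))
}

# Made-up input that includes words containing "and"
demo <- list(c("communication and interpersonal", "communication"),
             c("hand tools", "team android"))

res <- lapply(demo, split_and)
str(res)
```

With the boundaries in place, "hand tools" and "team android" should be left intact while "communication and interpersonal" is still split into three pieces.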
Upvotes: 3