Reputation: 67
I am trying to remove the entries "Arts and Humanities" and "Social Sciences" from a string that contains different disciplines of knowledge concatenated by "/", as follows:
string = "Arts and Humanities Other Topics/Social Sciences Other Topics/Arts and Humanities/Social Sciences/Sociology"
I have tried this using the stringr package:
sapply(strsplit(string, "/"), function(x) paste(str_remove(x, "\\bArts and Humanities\\b|\\bSocial Sciences\\b"), collapse = "/"))
But the output generated is " Other Topics/ Other Topics///Sociology"
and I need an output like this:
"Arts and Humanities Other Topics/Social Sciences Other Topics/Sociology"
Thanks in advance.
Upvotes: 0
Views: 413
Reputation: 3326
Your approach just needs a little tweaking: match whole components rather than substrings. The input, now called strings, can also be a vector of such strings:
sapply(
  # Split each string by "/" into its components.
  X = strsplit(x = strings, split = "/"),
  # Remove undesired components and then reassemble the strings.
  FUN = function(v) {
    paste0(
      # Use subscripting to filter out matches.
      v[!grepl(x = v, pattern = "^\\s*(Arts and Humanities|Social Sciences)\\s*$")],
      # Reassemble components as separated by "/".
      collapse = "/"
    )
  },
  # Make the result a vector like the original 'string' (rather than a list).
  simplify = TRUE,
  USE.NAMES = FALSE
)
Given a vector of strings like this
strings <- c(
  "Arts and Humanities Other Topics/Social Sciences Other Topics/Arts and Humanities/Social Sciences/Sociology",
  "Sociology/Arts and Humanities"
)
this solution should yield the following result:
[1] "Arts and Humanities Other Topics/Social Sciences Other Topics/Sociology"
[2] "Sociology"
A solution that uses unlist() will collapse everything into a single, giant string, rather than reassembling each string in strings.
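To illustrate the difference, here is a minimal base R sketch that runs both variants on the strings vector from above. The helper name drop_exact and the variable unwanted are hypothetical, introduced only for this comparison, and the helper uses exact matching with %in% rather than the regular expression shown earlier.

```r
strings <- c(
  "Arts and Humanities Other Topics/Social Sciences Other Topics/Arts and Humanities/Social Sciences/Sociology",
  "Sociology/Arts and Humanities"
)
unwanted <- c("Arts and Humanities", "Social Sciences")

# Hypothetical helper: drop components that exactly equal an unwanted label,
# then glue the remaining components back together with "/".
drop_exact <- function(v) paste0(v[!trimws(v) %in% unwanted], collapse = "/")

# Per-string processing: one result for each input string.
per_string <- vapply(strsplit(strings, "/", fixed = TRUE), drop_exact, character(1))

# unlist() flattens all components first, so a single string comes back.
flattened <- drop_exact(unlist(strsplit(strings, "/", fixed = TRUE)))

length(per_string)  # 2
length(flattened)   # 1
```

With this input, per_string keeps the two strings separate, while flattened merges them into "Arts and Humanities Other Topics/Social Sciences Other Topics/Sociology/Sociology".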
Upvotes: 1
Reputation: 3269
One way would be to split the whole string and then exclude the parts that you are not interested in:
paste0(unlist(strsplit(string, '/'))[!unlist(strsplit(string, '/')) %in% c("Arts and Humanities", "Social Sciences")],
collapse = '/')
or
paste0(base::setdiff(unlist(strsplit(string, '/')),
                     c("Arts and Humanities", "Social Sciences")), collapse = '/')
#"Arts and Humanities Other Topics/Social Sciences Other Topics/Sociology"
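One caveat that may matter: setdiff() treats its arguments as sets, so repeated components are collapsed, whereas the %in% version keeps them. The sketch below uses a made-up input with a duplicated "Sociology" entry (an assumption for illustration, not taken from the question) to show the difference.

```r
# Made-up input with a repeated discipline.
string <- "Sociology/Arts and Humanities/Sociology"
unwanted <- c("Arts and Humanities", "Social Sciences")

parts <- unlist(strsplit(string, "/", fixed = TRUE))

# Filtering with %in% keeps duplicates.
with_in <- paste0(parts[!parts %in% unwanted], collapse = "/")

# setdiff() returns unique values, so the duplicate is dropped.
with_setdiff <- paste0(setdiff(parts, unwanted), collapse = "/")

with_in       # "Sociology/Sociology"
with_setdiff  # "Sociology"
```

If duplicates should be preserved, the %in% variant is probably the safer choice; if they should be removed anyway, setdiff() does both steps at once.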
Upvotes: 1