Reputation:
I have strings with different numbers of underscores. I am trying to select the strings that contain two or more underscores. Any suggestions?
Strings <- c("aa_bb", "aa_bb_cc_dd", "jah_ghtfv_jal")
Currently I have:
Match1 <- Strings[grepl("[_].+[_]", Strings) == TRUE,] and
Match2 <- Strings[grepl("_.*_", Strings) == TRUE,]
Both return slightly different counts. Is there a better way to return only the strings that have two or more underscores?
In this case I would like to return "aa_bb_cc_dd" and "jah_ghtfv_jal".
Thanks!
Upvotes: 0
Views: 312
Reputation: 33792
If the strings can take any form and the underscores can appear anywhere (including, for example, just two underscores "__"), you could just count them using stringr::str_count:
library(stringr)
Strings[str_count(Strings, "_") > 1]
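If you would rather not load stringr, the same count can be sketched in base R. This is only an illustration of the counting idea: the extra element "a__b" and the variable name n_underscores are assumptions added here, not part of the question.

```r
# Sample data from the question plus one hypothetical edge case ("a__b")
Strings <- c("aa_bb", "aa_bb_cc_dd", "jah_ghtfv_jal", "a__b")

# gregexpr() finds every literal "_" in each string; regmatches() extracts
# those matches as a list, and lengths() counts the matches per string
n_underscores <- lengths(regmatches(Strings, gregexpr("_", Strings, fixed = TRUE)))

# Keep the strings with two or more underscores
Strings[n_underscores >= 2]
```

Counting does not care whether the underscores are adjacent, so "a__b" is kept along with the two strings from the question.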
Upvotes: 2
Reputation: 522151
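Your current use of grepl is incorrect: with Strings defined as a plain character vector, the trailing comma inside the brackets asks for two-dimensional indexing and raises an "incorrect number of dimensions" error, and the == TRUE comparison is redundant. You should be using this: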
Match1 <- Strings[grepl("[_].+[_]", Strings)]
Match2 <- Strings[grepl("_.*_", Strings)]
Both of these return identical results for your sample data, in line with what you expect. But I think what you really want here is:
Strings <- c("aa_bb", "aa_bb_cc_dd", "jah_ghtfv_jal")
Strings[grepl("_[^_]+_", Strings)]
[1] "aa_bb_cc_dd" "jah_ghtfv_jal"
This matches any string which has an underscore, followed by one or more non-underscore characters, followed by a second underscore.
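The three patterns agree on the sample data but can disagree when underscores sit next to each other, which may be why the counts differed on the real data. A small sketch comparing them, where the Edge vector and the column names are made up for illustration:

```r
# Hypothetical edge cases: adjacent underscores, separated ones, and a single one
Edge <- c("a__b", "a___b", "a_b_c", "a_b")

cmp <- data.frame(
  string      = Edge,
  dot_plus    = grepl("[_].+[_]", Edge),  # needs at least one character (any) between two underscores
  dot_star    = grepl("_.*_", Edge),      # any two underscores, adjacent or not
  negated_set = grepl("_[^_]+_", Edge)    # needs at least one non-underscore between two underscores
)
cmp
# dot_plus:    FALSE TRUE  TRUE FALSE
# dot_star:    TRUE  TRUE  TRUE FALSE
# negated_set: FALSE FALSE TRUE FALSE
```

If "two or more underscores anywhere" is the real requirement, "_.*_" (or a plain count) is the closest fit; the other two patterns additionally require something between the underscores.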
Upvotes: 0