Reputation: 170
I would like to keep only the tokens that end with a specific letter (let's say "a") from a string. I am not dealing with a vector of separate elements but with a single string whose tokens are separated by spaces.
Here are the data:
have="5a 4a 8a 10a 3a 5m 10m 7a 8p 11s 5s 4h 24h"
want="5a 4a 8a 10a 3a 7a"
Here is what I have tried:
gsub("([A-Z]|[0-9])([m|p|h|s])","", have)
gsub("\\w+m|p|h|s *", "", have)
After applying either of these gsub calls, I get two kinds of tokens (numbers followed by "a" and bare numbers).
But I still need to clean the result so that only the tokens ending with "a" remain. Would you have any idea?
Upvotes: 1
Views: 114
Reputation: 163517
You might use this pattern and replace with an empty string:
[ ]?[a-z0-9]+[mphs]
[ ]? : optional space (the square brackets are only for clarity)
[a-z0-9]+ : character class, match a-z or 0-9 one or more times
[mphs] : character class, match m, p, h or s
For example:
have="5a 4a 8a 10a 3a 5m 10m 7a 8p 11s 5s 4h 24h"
gsub(" ?[a-z0-9]+[mphs]","", have)
Result
[1] "5a 4a 8a 10a 3a 7a"
Perhaps you could match them instead:
\b\d+a\b
\b : word boundary
\d+ : one or more digits (a single \d would miss 10a)
a : match a literally
\b : word boundary
Note that in the character class [m|p|h|s] the | does not mean "or"; it is a literal | character, so the class can also be written as [mphs|].
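A minimal sketch of the matching approach in base R, assuming the tokens to keep are always digits followed by "a". It uses \d+ so that multi-digit tokens such as 10a are matched, and doubles the backslashes because the pattern sits inside an R string. The names tokens and want are only illustrative.

```r
have <- "5a 4a 8a 10a 3a 5m 10m 7a 8p 11s 5s 4h 24h"

# Extract every whole token made of one or more digits followed by "a"
tokens <- regmatches(have, gregexpr("\\b\\d+a\\b", have, perl = TRUE))[[1]]

# Join the matches back into a single space-separated string
want <- paste(tokens, collapse = " ")
want
# [1] "5a 4a 8a 10a 3a 7a"
```

Matching the wanted tokens avoids having to list every unwanted ending letter and leaves no stray spaces to clean up.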
Upvotes: 2
Reputation: 13319
You can do:
trimws(gsub("([A-Z]|[0-9]{1,})([b-z])","",have))
[1] "5a 4a 8a 10a 3a   7a"
To collapse the extra spaces left where tokens were removed:
gsub("\\s{2,}"," ",
trimws(gsub("([A-Z]|[0-9]{1,})([b-z])","",have)))
#[1] "5a 4a 8a 10a 3a 7a"
Upvotes: 3
Reputation: 3786
Or, with more code but a simpler regexp, split the string into a vector, filter it, and then paste it back into a string.
have_string <- "5a 4a 8a 10a 3a 5m 10m 7a 8p 11s 5s 4h 24h"
have_vector <- unlist(strsplit(have_string," "))
library(stringr)
want_vector <- have_vector[str_detect(have_vector, ".*?a$")]
want_string <- paste(want_vector, collapse = " ")
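The same split, filter and paste idea can be sketched without stringr, assuming R 3.3.0 or later for endsWith(). The helper keep_ending_with() is a hypothetical name introduced here for illustration, not a function from any package.

```r
# keep_ending_with() is a hypothetical helper, not part of base R or stringr
keep_ending_with <- function(x, suffix, sep = " ") {
  # Split the single string into its tokens on the literal separator
  tokens <- strsplit(x, sep, fixed = TRUE)[[1]]
  # Keep only tokens ending in the suffix and rebuild the string
  paste(tokens[endsWith(tokens, suffix)], collapse = sep)
}

have_string <- "5a 4a 8a 10a 3a 5m 10m 7a 8p 11s 5s 4h 24h"
want_string <- keep_ending_with(have_string, "a")
want_string
# [1] "5a 4a 8a 10a 3a 7a"
```

Because endsWith() compares plain strings rather than regular expressions, the suffix needs no escaping.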
Upvotes: 2
Reputation: 37661
You can split it into words, use grep to identify the words ending in a, then paste them back together.
Words = strsplit(have, "\\W+")[[1]]
paste(grep("a$", Words, value=TRUE), collapse=" ")
[1] "5a 4a 8a 10a 3a 7a"
Upvotes: 2