Reputation: 31
Say we have:
TestStrings <- c("Some number < 100", "Some number > 999", "Some number $1000", "Some number 1000000")
I want to replace all numbers with a space except numbers following the substrings:
"< \\d+" "> \\d+" "$\\d+"
What regular expression could I pass to gsub() to accomplish this?
I know the following code is wrong, but here is what I have:
gsub(pattern = "^> \\d+|^< \\d+|^$\\d+", replace = " ", TestStrings)
Upvotes: 2
Views: 759
Reputation: 29238
We can use the following pattern:
[a-z]\s*\K\d+
Here's a Regex Demo.
In R it would be:
gsub("[a-z]\\s*\\K\\d+", "", TestStrings, perl = TRUE)
# [1] "Some number < 100" "Some number > 999"
# [3] "Some number $1000" "Some number "
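A minimal sketch of the same idea with a single space as the replacement, which is what the question asks for. It also shows a caveat: the pattern only touches numbers that have a lower-case letter (plus optional whitespace) before them. The second input, "1000000 items", is made up for illustration.

```r
TestStrings <- c("Some number < 100", "Some number > 999",
                 "Some number $1000", "Some number 1000000")

# \K drops the letter and whitespace from the match, so only the digits
# are replaced (here with a single space instead of an empty string).
result <- gsub("[a-z]\\s*\\K\\d+", " ", TestStrings, perl = TRUE)
print(result)

# Caveat: a number with no lower-case letter before it is left alone.
# "1000000 items" is a hypothetical input, not from the question.
no_letter <- gsub("[a-z]\\s*\\K\\d+", " ", "1000000 items", perl = TRUE)
print(no_letter)
```

So this works for the sample data because the protected numbers happen to follow a symbol rather than a letter; it is not a general "skip these prefixes" rule.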
Upvotes: 2
Reputation: 1255
What about this:
gsub("[<>\\$] ?\\d+", " ", TestStrings)
It returns:
[1] "Some number " "Some number " "Some number " "Some number 1000000"
which I think is what you are looking for.
EDIT: Actually, you want the opposite, so:
gsub("([<>\\$] ?\\d+)|\\d+", "\\1", TestStrings)
[1] "Some number < 100" "Some number > 999" "Some number $1000" "Some number "
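To see why this works, here is a small sketch on a string that mixes both kinds of numbers. The input is made up for illustration. The first alternative captures a symbol together with its number, and the replacement "\\1" puts it back unchanged. The second alternative matches a bare number without capturing anything, so "\\1" is empty and the number is removed. Note that this deletes the number rather than replacing it with a space.

```r
# Hypothetical input mixing protected and unprotected numbers.
mixed <- "Pay $1000 for 3 items > 50"

# Group 1 is filled only when a symbol precedes the number; for a bare
# number the group is empty, so the match is replaced by nothing.
kept <- gsub("([<>\\$] ?\\d+)|\\d+", "\\1", mixed)
print(kept)
```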
Upvotes: 0
Reputation: 887971
Perhaps this helps
gsub("[<>] \\d+(*SKIP)(*FAIL)|\\d+", " ", TestStrings, perl = TRUE)
#[1] "Some number < 100" "Some number > 999" "Some number $ " "Some number "
If we don't need to keep the $:
gsub("[<>] \\d+(*SKIP)(*FAIL)|\\$*\\d+", " ", TestStrings, perl = TRUE)
#[1] "Some number < 100" "Some number > 999" "Some number " "Some number "
If we need to keep both the $ and the number that follows it:
gsub("([<>] |\\$)\\d+(*SKIP)(*FAIL)|\\d+", " ", TestStrings, perl = TRUE)
#[1] "Some number < 100" "Some number > 999" "Some number $1000" "Some number "
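As a rough sketch, the last pattern could be wrapped in a small helper and tried on a string that contains both protected and unprotected numbers. The helper name blank_unprotected_numbers and the input string are made up for illustration. The (*SKIP)(*FAIL) verbs make the engine discard a protected match and resume searching after it, so only the remaining numbers reach the plain \\d+ alternative.

```r
# Hypothetical helper; the name is not from any package.
blank_unprotected_numbers <- function(x) {
  # Numbers after "< ", "> " or "$" are matched and then skipped;
  # any other run of digits is replaced with a single space.
  gsub("([<>] |\\$)\\d+(*SKIP)(*FAIL)|\\d+", " ", x, perl = TRUE)
}

# Hypothetical input mixing protected and unprotected numbers.
mixed <- "Pay $1000 for 3 items > 50"
blanked <- blank_unprotected_numbers(mixed)
print(blanked)
```

Here only the 3 is blanked out; $1000 and > 50 are left as they are.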
Upvotes: 1