Reputation: 11
I need to write a regular expression to parse the following data:
[1] "Chicken (30.67%);Duck (17.3%);Wild duck (16%);Pigeon (4%);"
[2] "Chicken (30.67%);Duck (17.3%);Wild duck (16%);Blue-winged teal (4%)"
This is what I have:
"(\\w[\\w\\s]+)\\(([0-9]+\\.[0-9][0-9]?)%\\);?"
It works but I have a couple of problems:
Can anyone help?
Upvotes: 1
Views: 79
Reputation: 391
This should help:
library(stringr)
str_extract_all(text, pattern = "[0-9]{1,2}(\\.[0-9]{1,2})?%")
Explanation of the regex:
[0-9]{1,2} one or two digits (0-9)
( start of group
\\. a literal dot (it has to be escaped with a double backslash in an R string; an unescaped dot is a special character that matches any character)
[0-9]{1,2} one or two digits (0-9)
)? end of group; the ? makes the group optional, so the decimal part may be present but need not be
% a percent sign
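To check the pattern against the sample data from the question, something like the following should work. This is only a sketch: the variable names x, pcts and nums are my own, and the conversion to numbers is an extra step not in the original snippet.

```r
library(stringr)

# Sample strings copied from the question (the name `x` is an assumption)
x <- c("Chicken (30.67%);Duck (17.3%);Wild duck (16%);Pigeon (4%);",
       "Chicken (30.67%);Duck (17.3%);Wild duck (16%);Blue-winged teal (4%)")

# str_extract_all() returns a list with one character vector per input string
pcts <- str_extract_all(x, pattern = "[0-9]{1,2}(\\.[0-9]{1,2})?%")

# Optional extra step: strip the "%" and convert to numeric
nums <- lapply(pcts, function(p) as.numeric(sub("%", "", p, fixed = TRUE)))

pcts[[1]]
nums[[2]]
```

Note that this only pulls out the percentages, not the names in front of them. Also, because the integer part is limited to one or two digits, a value such as 100% would be picked up as "00%".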
Upvotes: 1
Reputation: 626754
Just use my suggested regex:
(\\w+(?:\\s+\\w+)*)\\s*\\(([0-9]+(?:\\.[0-9]+)?)%\\);?
See demo
And the R code sample:
library(stringr)
str_extract_all(str, "(\\w+(?:\\s+\\w+)*)\\s*\\(([0-9]+(?:\\.[0-9]+)?)%\\);?")[[1]]
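str_extract_all() returns only the full matches. If you need the two capture groups (name and number) separately, str_match_all() can be used instead. The sketch below makes two assumptions that are mine rather than part of the regex above: the names x, pattern and parsed are made up for illustration, and I allow a hyphen inside a name ([\\w\\-] instead of \\w) so that "Blue-winged teal" is captured as one name.

```r
library(stringr)

# Sample strings copied from the question (the name `x` is an assumption)
x <- c("Chicken (30.67%);Duck (17.3%);Wild duck (16%);Pigeon (4%);",
       "Chicken (30.67%);Duck (17.3%);Wild duck (16%);Blue-winged teal (4%)")

# Same structure as the regex above; allowing "-" in names is an assumption
pattern <- "([\\w\\-]+(?:\\s+[\\w\\-]+)*)\\s*\\(([0-9]+(?:\\.[0-9]+)?)%\\);?"

# str_match_all() returns one character matrix per input string:
# column 1 is the full match, columns 2 and 3 are the capture groups
m <- str_match_all(x, pattern)

# Turn each matrix into a data frame with a numeric percent column
parsed <- lapply(m, function(mat) {
  data.frame(name = mat[, 2],
             percent = as.numeric(mat[, 3]),
             stringsAsFactors = FALSE)
})

parsed[[2]]
```

Without the hyphen in the character class, the last entry of the second string would be matched starting at "winged teal", because \\w does not match "-".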
Upvotes: 1