Rbeginner

Reputation: 11

Regular expression for numbers in R

I need to write a regular expression to parse the following data:

[1] "Chicken (30.67%);Duck (17.3%);Wild duck (16%);Pigeon (4%);"
[2] "Chicken (30.67%);Duck (17.3%);Wild duck (16%);Blue-winged teal (4%)"

This is what I have:

"(\\w[\\w\\s]+)\\(([0-9]+\\.[0-9][0-9]?)%\\);?"

It works but I have a couple of problems:

Can anyone help?

Upvotes: 1

Views: 79

Answers (2)

Kirill

Reputation: 391

This should help:

    library(stringr)
    str_extract_all(text, pattern = "[0-9]{1,2}(\\.[0-9]{1,2})?%")

Explanation of the regex:

[0-9]{1,2}     one or two digits (0-9)
(              start of group
  \\.          a literal dot (escaped with a double backslash in an R string; an unescaped dot matches any character)
  [0-9]{1,2}   one or two digits (0-9)
)?             end of group; the group is optional
%              a percent sign
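
As a rough sketch of how this pattern behaves on the data from the question, the snippet below runs it on both sample strings and converts the matches to numbers. The variable names (`text`, `pct_pattern`, `pct`, `pct_num`) are illustrative assumptions, and the closing quotes on the sample strings were added so they parse.

```r
library(stringr)

# Sample input from the question (closing quotes added so the strings parse)
text <- c(
  "Chicken (30.67%);Duck (17.3%);Wild duck (16%);Pigeon (4%);",
  "Chicken (30.67%);Duck (17.3%);Wild duck (16%);Blue-winged teal (4%)"
)

# The pattern from this answer: 1-2 digits, optional ".d" or ".dd", then "%"
pct_pattern <- "[0-9]{1,2}(\\.[0-9]{1,2})?%"

# One character vector of matches per input string
pct <- str_extract_all(text, pct_pattern)
print(pct[[1]])

# Drop the "%" sign and convert each match to a number
pct_num <- lapply(pct, function(p) as.numeric(sub("%", "", p, fixed = TRUE)))
print(pct_num[[1]])
```

Note that this extracts only the percentages, not the names, and that `{1,2}` limits the integer part to two digits, so a value such as `100%` would only be matched partially.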

Upvotes: 1

Wiktor Stribiżew

Reputation: 626754

Just use my suggested regex:

(\\w+(?:\\s+\\w+)*)\\s*\\(([0-9]+(?:\\.[0-9]+)?)%\\);?

See demo

And the R code sample:

library(stringr)
str_extract_all(str, "(\\w+(?:\\s+\\w+)*)\\s*\\(([0-9]+(?:\\.[0-9]+)?)%\\);?")[[1]]
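
Since `str_extract_all` returns only the full matches, one possible way to get at the two capture groups (name and number) is `str_match_all`, which returns a matrix per input string with one column per group. The following is a minimal sketch under that assumption; `x`, `pattern`, `to_df`, and `parsed` are hypothetical names, not part of the original answer.

```r
library(stringr)

# Sample input from the question (closing quotes added so the strings parse)
x <- c(
  "Chicken (30.67%);Duck (17.3%);Wild duck (16%);Pigeon (4%);",
  "Chicken (30.67%);Duck (17.3%);Wild duck (16%);Blue-winged teal (4%)"
)

# Same regex as in the answer above
pattern <- "(\\w+(?:\\s+\\w+)*)\\s*\\(([0-9]+(?:\\.[0-9]+)?)%\\);?"

# List of character matrices: column 1 = full match,
# column 2 = name (group 1), column 3 = number (group 2)
m <- str_match_all(x, pattern)

# Hypothetical helper: turn one match matrix into a data frame
to_df <- function(mat) {
  data.frame(
    name = mat[, 2],
    percent = as.numeric(mat[, 3]),
    stringsAsFactors = FALSE
  )
}

parsed <- lapply(m, to_df)
print(parsed[[1]])
```

One caveat with the sample data: `\\w` does not match a hyphen, so for the second string the last name comes out as `winged teal` rather than `Blue-winged teal`. If hyphenated names matter, the name part of the pattern would need to allow `-` as well.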

Upvotes: 1
