Sebastian

Reputation: 2570

R - Extracting number from string with regular expression

I want to extract a number with decimals from a string with one single expression if possible.

For example, transform "2,123.02" to "2123.02". My current solution is:

paste(unlist(str_extract_all("2,123.02","\\(?[0-9.]+\\)?",simplify=F)),collapse="")

But what I'm looking for is an expression in str_extract_all that binds the pieces together by itself, without the extra paste step. Is this possible to achieve with a regular expression?

Upvotes: 0

Views: 264

Answers (2)

Cath

Reputation: 24074

You can try replacing the comma with an empty string:

gsub(",", "", "2,123.02")
#[1] "2123.02"

NB: If you need to replace only commas in between numbers, you can use lookarounds:

gsub("(?<=[0-9]),(?=[0-9])", "", "this, this is my number 2,123.02", perl=TRUE)
#[1] "this, this is my number 2123.02"
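
To see what the lookarounds change, here is a small base R sketch comparing the two patterns on the same input (the variable names are only for illustration):

```r
x <- "this, this is my number 2,123.02"

# Plain pattern: removes every comma, including the one after "this"
plain <- gsub(",", "", x)

# Lookaround pattern: removes a comma only when a digit sits on both sides
# (lookbehind/lookahead require perl = TRUE)
between_digits <- gsub("(?<=[0-9]),(?=[0-9])", "", x, perl = TRUE)

print(plain)
print(between_digits)
```

The first result loses the comma in the sentence as well, while the second only touches the comma inside the number.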

I edited to use gsub instead of sub in case you have strings with more than one number containing a comma. If you only have one, sub is sufficient.

NB2: You can call str_extract_all on the result from gsub, e.g.:

str_extract_all(gsub("(?<=[0-9]),(?=[0-9])", "", "first number: 2,123.02, second number: 3,456", perl=TRUE), "\\d+\\.?\\d*", simplify=FALSE)
#[[1]]
#[1] "2123.02" "3456"   

Upvotes: 6

Matthew Plourde

Reputation: 44614

Another option is extract_numeric in the tidyr package.

library(tidyr)
extract_numeric("2,123.02")

[1] 2123.02
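
Note that extract_numeric has since been deprecated in tidyr, and its documentation points to readr::parse_number() instead. A minimal sketch, assuming the readr package is installed and the default locale is in use (where "," is treated as the grouping mark):

```r
library(readr)

# parse_number() drops the grouping mark and returns a numeric value,
# not a character string
x <- parse_number("2,123.02")
x
```

If you need the character form "2123.02" rather than a number, the gsub approach is the more direct route.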

Upvotes: 2
