K Owen

Reputation: 1250

R: removing the '$' symbols

I have downloaded some data from a web server. It includes prices formatted for human readers, with a $ sign and thousands separators.

> head(m)
[1] $129,900 $139,900 $254,000 $260,000 $290,000 $295,000

I was able to get rid of the commas, using

m <- sub(',','',m)

but

m <- sub('$','',m)

does not remove the dollar sign. If I then try mn <- as.numeric(m) or as.integer(m), I get a warning:

Warning message: NAs introduced by coercion

and the result is:

> head(m)
[1] NA NA NA NA NA NA

How can I remove the $ sign? Thanks

Upvotes: 3

Views: 7463

Answers (3)

Tyler Rinker

Reputation: 109894

You could also use:

x <- c("$129,900", "$139,900", "$254,000", "$260,000", "$290,000", "$295,000")

library(qdap)
as.numeric(mgsub(c("$", ","), "", x))

yielding:

> as.numeric(mgsub(c("$", ","), "", x))
[1] 129900 139900 254000 260000 290000 295000

If you want to stay in base R, use the fixed = TRUE argument to gsub:

x <- c("$129,900", "$139,900", "$254,000", "$260,000", "$290,000", "$295,000")
as.numeric(gsub("$", "", gsub(",", "", x), fixed = TRUE))
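As a possible explanation of why the original sub('$','',m) appears to do nothing: in a regular expression, $ is the end-of-string anchor, so the pattern matches the empty position at the end of each string rather than the dollar character. A minimal sketch of the difference follows; the second price is made up for illustration, to show why gsub rather than sub is needed once a value has more than one comma.

```r
# Illustrative data; the second value is hypothetical, not from the question
x <- c("$129,900", "$1,139,900")

# As a regex, "$" anchors to the end of the string, so the text is unchanged
unchanged <- sub("$", "", x)

# fixed = TRUE treats the pattern as a literal character
no_dollar <- sub("$", "", x, fixed = TRUE)

# gsub removes every comma; sub would only remove the first one
prices <- as.numeric(gsub(",", "", no_dollar, fixed = TRUE))
prices
```

Escaping the character ("\\$") or wrapping it in a character class ("[$]") should have the same effect as fixed = TRUE here.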

Upvotes: 3

G. Grothendieck

Reputation: 269824

Try this. It replaces anything that is not a digit with the empty string:

as.numeric(gsub("\\D", "", dat))

or, to remove anything that is neither a digit nor a decimal point:

as.numeric(gsub("[^0-9.]", "", dat))

UPDATE: Added a second similar approach in case the data in the question is not representative.
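To illustrate when the two patterns differ, here is a small sketch. It assumes some prices carry cents, which the data in the question does not show; the values below are made up.

```r
# Hypothetical data; the second value has cents, unlike the question's data
dat <- c("$129,900", "$1,299.50")

# "\\D" also strips the decimal point, so 1,299.50 becomes 129950
digits_only <- as.numeric(gsub("\\D", "", dat))

# "[^0-9.]" keeps the decimal point, so 1,299.50 becomes 1299.5
keep_decimal <- as.numeric(gsub("[^0-9.]", "", dat))
```

For whole-dollar data like that in the question, both give the same result.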

Upvotes: 6

agstudy

Reputation: 121588

dat <- gsub('[$]','',dat)
dat <- as.numeric(gsub(',','',dat))
> dat
[1] 129900 139900 254000 260000 290000 295000

In one step:

> gsub('[$]([0-9]+)[,]([0-9]+)','\\1\\2',dat)
[1] "129900" "139900" "254000" "260000" "290000" "295000"
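One caveat, offered as a sketch rather than a correction: the one-step pattern expects exactly one comma per value, so a price of a million or more (not present in the question's data; the second value below is made up) would keep its second comma. A character class that matches either symbol is one way around that.

```r
# Hypothetical data; the second value is not from the question
dat <- c("$129,900", "$1,139,900")

# The capture-group pattern consumes only the first comma of each value
one_group <- gsub('[$]([0-9]+)[,]([0-9]+)', '\\1\\2', dat)

# "[$,]" matches a dollar sign or a comma anywhere in the string
both <- gsub('[$,]', '', dat)
as.numeric(both)
```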

Upvotes: 8
