Reputation: 1250
I have downloaded some data from a web server, including prices that are formatted for human readers, with $ signs and thousands separators.
> head(m)
[1] $129,900 $139,900 $254,000 $260,000 $290,000 $295,000
I was able to get rid of the commas, using
m <- sub(',','',m)
but
m <- sub('$','',m)
does not remove the dollar sign. If I try mn <- as.numeric(m)
or as.integer, I get a warning message:
Warning message: NAs introduced by coercion
and the result is:
> head(mn)
[1] NA NA NA NA NA NA
How can I remove the $ sign? Thanks
Upvotes: 3
Views: 7463
Reputation: 109894
You could also use mgsub from the qdap package:
x <- c("$129,900", "$139,900", "$254,000", "$260,000", "$290,000", "$295,000")
library(qdap)
as.numeric(mgsub(c("$", ","), "", x))
yielding:
> as.numeric(mgsub(c("$", ","), "", x))
[1] 129900 139900 254000 260000 290000 295000
If you want to stay in base R, use the fixed = TRUE argument to gsub:
x <- c("$129,900", "$139,900", "$254,000", "$260,000", "$290,000", "$295,000")
as.numeric(gsub("$", "", gsub(",", "", x), fixed = TRUE))
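A minimal sketch of why the plain '$' pattern in the question has no visible effect: in a regular expression, $ is the end-of-string anchor, so it matches an empty position rather than the character. The variable names below are illustrative only, and x is a shortened copy of the sample data.

```r
x <- c("$129,900", "$139,900", "$254,000")

# Unescaped "$" is the end-of-string anchor, so no character is removed.
unchanged <- sub("$", "", x)

# Three equivalent ways to match a literal dollar sign.
escaped   <- sub("\\$", "", x)
bracketed <- sub("[$]", "", x)
literal   <- sub("$", "", x, fixed = TRUE)

# Use gsub for the commas so that every separator is removed, then convert.
prices <- as.numeric(gsub(",", "", literal, fixed = TRUE))
print(prices)
```

Note that sub replaces only the first match in each string, which is enough for a single leading $ but not for values with more than one comma.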
Upvotes: 3
Reputation: 269824
Try this. It means replace anything that is not a digit with the empty string:
as.numeric(gsub("\\D", "", dat))
or, to remove anything that is neither a digit nor a decimal point:
as.numeric(gsub("[^0-9.]", "", dat))
UPDATE: Added a second similar approach in case the data in the question is not representative.
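To illustrate how the two patterns differ, here is a small sketch on made-up values (the contents of dat below are an assumption, not the data from the question). With cents present, stripping every non-digit silently shifts the decimal point, while the second pattern keeps it.

```r
# Hypothetical sample values, one of them with a fractional part.
dat <- c("$129,900.50", "$1,254,000", "$295")

# Removes the decimal point too, so 129900.50 becomes 12990050.
digits_only <- as.numeric(gsub("\\D", "", dat))

# Keeps digits and the decimal point.
keep_decimal <- as.numeric(gsub("[^0-9.]", "", dat))

print(digits_only)
print(keep_decimal)
```

Both patterns would also drop a leading minus sign, so they are only safe if the prices are known to be non-negative.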
Upvotes: 6
Reputation: 121588
dat <- gsub('[$]','',dat)
dat <- as.numeric(gsub(',','',dat))
> dat
[1] 129900 139900 254000 260000 290000 295000
In one step (the result is still character, not numeric):
gsub('[$]([0-9]+)[,]([0-9]+)','\\1\\2',dat)
[1] "129900" "139900" "254000" "260000" "290000" "295000"
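The capture-group pattern above expects exactly one comma per value. As a hedged alternative, a character class can remove both symbols in a single gsub call regardless of how many commas appear. The values in dat below are made up for illustration.

```r
# Hypothetical values with zero, one, and two thousands separators.
dat <- c("$129,900", "$1,254,000", "$295")

# The capture-group pattern only cleans the value with exactly one comma.
one_comma_only <- gsub("[$]([0-9]+)[,]([0-9]+)", "\\1\\2", dat)

# Inside [ ], "$" is a literal character, so this strips every $ and comma.
all_at_once <- as.numeric(gsub("[$,]", "", dat))

print(one_comma_only)
print(all_at_once)
```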
Upvotes: 8