Justin Reid

Reputation: 119

How to remove $ and % from columns in R?

I am trying to follow along with a ggplot tutorial, but my data set lists dollar values with $ and percent values with %. This makes plotting impossible, because R complains that the values must be numeric.

For example, my data set is named housing, and the column with home prices is labeled Home.Value. The prices are formatted like this: $24,895 $25,175

How would I go about removing the dollar sign and the percent sign?

Upvotes: 2

Views: 2864

Answers (2)

Greg Snow

Reputation: 49650

This answer shows a method for removing commas when reading the data into R. It can easily be modified to remove $, %, and other characters as well: just change gsub(",","", from) to gsub("[,$%]","", from).
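The `from` argument in that gsub call suggests a conversion function registered with setAs(). As a rough sketch under that assumption, the idea could look like the following. The class name num.with.symbols, the State column, and the inline CSV text are made up for illustration; only Home.Value and the two prices come from the question.

```r
library(methods)

# Hypothetical class name; it only serves as a label for colClasses.
setClass("num.with.symbols")

# Conversion used by read.csv: strip commas, $ and %, then convert to numeric.
setAs("character", "num.with.symbols",
      function(from) as.numeric(gsub("[,$%]", "", from)))

# Made-up CSV text standing in for the real file.
csv_text <- 'State,Home.Value
AK,"$24,895"
AL,"$25,175"'

housing <- read.csv(text = csv_text,
                    colClasses = c("character", "num.with.symbols"))
str(housing)
```

With a real file, text = csv_text would be replaced by the file path, and colClasses would need one entry per column.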

Upvotes: 0

Matias Andina

Reputation: 4230

Suppose you have a data frame like this one:

df<-data.frame(A=c("$5,33","$3,55"),B=c(T,F))

Then you could replace column A with

df$A<-gsub("\\$","",df$A)

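Because $ is a regular-expression metacharacter (it anchors the end of the string), you have to escape it as \\$ or pass fixed=TRUE so that gsub treats it as a literal character. % has no special meaning and needs no escaping.

If you want to remove both $ and % in one line, you can use the "OR" operator (|):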
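To see why the escape matters, here is a small sketch comparing the three forms on a made-up string. The variable names are purely illustrative.

```r
x <- "$24"

# Unescaped: "$" matches the empty position at the end of the string,
# so nothing visible is removed.
unescaped <- gsub("$", "", x)

# Escaped: "\\$" matches a literal dollar sign.
escaped <- gsub("\\$", "", x)

# fixed = TRUE: the pattern is treated as plain text, not as a regex.
fixed <- gsub("$", "", x, fixed = TRUE)

print(c(unescaped = unescaped, escaped = escaped, fixed = fixed))
```

Only the unescaped call leaves the dollar sign in place.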

df$A<-gsub("\\$|%","",df$A)

UPDATE:

That may be all you want, but keep in mind that your numbers are formatted with commas, so R will still treat them as character strings. You will probably want to remove the commas as well.

To do that, strip the commas with gsub and convert the result with as.numeric. The comma is not a regex metacharacter, so the \\ escape in the pattern below is optional; a plain "," works too.

df$A<-as.numeric(gsub("\\,","",df$A))

df
    A     B
1 533  TRUE
2 355 FALSE

Notice that column A is now numeric:

str(df)
'data.frame':   2 obs. of  2 variables:
 $ A: num  533 355
 $ B: logi  TRUE FALSE

Again, you could have done everything in one line, but I'm guessing two lines are easier to follow.
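For reference, a one-pass version might look like the sketch below, applied to a small stand-in for the housing data from the question. The Pct.Change column, its values, and the helper name clean_number are assumptions made for illustration.

```r
# Made-up sample; Home.Value follows the question, Pct.Change is hypothetical.
housing <- data.frame(
  Home.Value = c("$24,895", "$25,175"),
  Pct.Change = c("3.5%", "-1.2%"),
  stringsAsFactors = FALSE
)

# Inside a character class, $ is literal, so no escaping is needed.
clean_number <- function(x) as.numeric(gsub("[$,%]", "", x))

housing$Home.Value <- clean_number(housing$Home.Value)
housing$Pct.Change <- clean_number(housing$Pct.Change)

str(housing)
```

Note that the percent values end up as plain numbers (3.5, not 0.035); divide by 100 if proportions are needed.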

Upvotes: 4
