Richi W

Reputation: 3656

Applying gsub to various columns

What is the most efficient way to apply gsub to various columns? The following does not work:

x1=c("10%","20%","30%")
x2=c("60%","50%","40%")
x3 = c(1,2,3)
x = data.frame(x1,x2,x3)
per_col = c(1,2)
x = gsub("%","",x[,per_col])

How can I most efficiently drop the "%" sign in specified columns? Can I apply it to the whole data frame? This would be useful when I don't know which columns contain percentages.

Upvotes: 17

Views: 35400

Answers (5)

Ronak Shah

Reputation: 388982

We can unlist the per_col columns, remove the "%" symbol, and convert the result to numeric.

x[per_col] <- as.numeric(gsub("%","", unlist(x[per_col])))
#In this case using sub would be enough too as we have only 1 % symbol to replace
#x[per_col] <- as.numeric(sub("%","", unlist(x[per_col])))

x
#  x1 x2 x3
#1 10 60  1
#2 20 50  2
#3 30 40  3

Upvotes: 2

bathyscapher

Reputation: 2309

To add to docendo discimus' answer, here is an extension that handles non-adjacent columns and returns a data.frame:

x1 <- c("10%", "20%", "30%")
x2 <- c("60%", "50%", "40%")
x3 <- c(1, 2, 3)
x4 <- c("60%", "50%", "40%")

x <- data.frame(x1, x2, x3, x4)

# Strip "%" from columns 1, 2 and 4 only, converting each to numeric
x[, c(1:2, 4)] <- as.data.frame(apply(x[, c(1:2, 4)], 2,
                                      function(col) as.numeric(gsub("%", "", col))))

> x
  x1 x2 x3 x4
1 10 60  1 60
2 20 50  2 50
3 30 40  3 40

> class(x)
[1] "data.frame"

Upvotes: 1

Simon C.

Reputation: 1067

The first answer works, but be careful if your data.frame contains string columns: @docendo discimus's answer will return NAs for them.

If you want to keep the contents of your columns as strings, just remove the as.numeric and convert the result into a data frame afterwards:

as.data.frame(apply(x, 2, function(y) gsub("%", "", y)))
  x1 x2 x3
1 10 60  1
2 20 50  2
3 30 40  3

Upvotes: 3

info_seekeR

Reputation: 1326

Or, you could try the lapply solution:

as.data.frame(lapply(x, function(y) gsub("%", "", y)))

  x1 x2 x3
1 10 60  1
2 20 50  2
3 30 40  3
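
If the percentage columns are not known in advance, one possible extension of the lapply idea is to detect them first and convert only those. This is just a sketch under stated assumptions, not part of the original answer: is_pct is a hypothetical helper name, the data is rebuilt here to keep the example self-contained, and a column counts as a percentage column only when every non-missing value ends in "%".

```r
# Example data mirroring the question (rebuilt so the sketch runs on its own)
x <- data.frame(x1 = c("10%", "20%", "30%"),
                x2 = c("60%", "50%", "40%"),
                x3 = c(1, 2, 3),
                stringsAsFactors = FALSE)

# Assumption: a "percentage column" is one where every non-missing value ends in "%"
is_pct <- vapply(x, function(col) {
  vals <- as.character(col[!is.na(col)])
  length(vals) > 0 && all(grepl("%$", vals))
}, logical(1))

# Strip the trailing "%" and convert only the detected columns;
# as.character() also covers factor columns (the default before R 4.0)
x[is_pct] <- lapply(x[is_pct], function(col) {
  as.numeric(sub("%$", "", as.character(col)))
})

str(x)
```

Because only the detected columns are touched, x3 keeps its original numeric type and any genuine text columns would be left alone.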

Upvotes: 13

talat

Reputation: 70266

You can use apply to apply it to the whole data.frame:

apply(x, 2, function(y) as.numeric(gsub("%", "", y)))
     x1 x2 x3
[1,] 10 60  1
[2,] 20 50  2
[3,] 30 40  3
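
Two things are worth keeping in mind with this approach: apply returns a matrix rather than a data.frame, and any column holding real text turns into NA once as.numeric is applied. The small sketch below illustrates both points; the data frame df and its name column are hypothetical and only serve as an example.

```r
# Hypothetical data: one percentage column and one genuine text column
df <- data.frame(x1 = c("10%", "20%"),
                 name = c("a", "b"),
                 stringsAsFactors = FALSE)

# as.numeric() warns about the text column, so the warning is silenced here
m <- suppressWarnings(
  apply(df, 2, function(y) as.numeric(gsub("%", "", y)))
)

is.matrix(m)        # TRUE: the result is a numeric matrix, not a data.frame
m[, "x1"]           # 10 20
anyNA(m[, "name"])  # TRUE: the text column could not be converted
```

Wrapping the result in as.data.frame() restores the data.frame class, but it does not bring back the text that was lost.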

Upvotes: 20
