Agi

Reputation: 83

Subtracting columns of data frame by name

Let's assume I have a data frame as below:

# matrix() fills column-wise by default; byrow is an argument of matrix(),
# not as.data.frame(), where it would be silently ignored
df <- as.data.frame(matrix(seq(1,20,1),nrow=4))
colnames(df) <- c("X1","X2","X3","X4","X5")
rownames(df) <- as.Date(c("2020-01-02","2020-01-03","2020-01-04","2020-01-05"))

df
           X1 X2 X3 X4 X5
2020-01-02  1  5  9 13 17
2020-01-03  2  6 10 14 18
2020-01-04  3  7 11 15 19
2020-01-05  4  8 12 16 20

I want to subtract the first column, X1, from every column and store each result back in the same column. I have tried doing

  for(i in colnames(df)){
    df[i] <- lapply(df[i], function(x) x-df["X1"])
  }

But it only applies it to the first column. How can I run it for all the columns?

Upvotes: 1

Views: 1088

Answers (3)

Rui Barradas

Reputation: 76402

Here is a way with grep:

i_col <- grep("X1", names(df))
df[] <- df - df[, i_col]
df
#           X1 X2 X3 X4 X5
#2020-01-02  0  4  8 12 16
#2020-01-03  0  4  8 12 16
#2020-01-04  0  4  8 12 16
#2020-01-05  0  4  8 12 16

And another, with grep/sweep. Subtraction ("-") is sweep's default FUN, so it does not need to be passed explicitly.

sweep(df, 1, df[[i_col]], check.margin = FALSE)
#           X1 X2 X3 X4 X5
#2020-01-02  0  4  8 12 16
#2020-01-03  0  4  8 12 16
#2020-01-04  0  4  8 12 16
#2020-01-05  0  4  8 12 16
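
One caveat worth keeping in mind: grep treats "X1" as a pattern, so it would also pick up names such as X10 or X11 if the data frame had them. A minimal sketch of an exact-match alternative, assuming a made-up data frame df2 (not from the question) whose names share the "X1" prefix:

```r
# Hypothetical data: df2 and its column names are illustrative only
df2 <- data.frame(X1 = 1:3, X10 = 4:6, X11 = 7:9)

# Pattern match: returns the positions of X1, X10 and X11
grep("X1", names(df2))

# Exact match: returns only the position of X1
i_col <- match("X1", names(df2))

# Subtract the exactly matched column from every column;
# the vector is recycled down each column of the data frame
df2[] <- df2 - df2[[i_col]]
df2
```

Anchoring the pattern, as in grep("^X1$", names(df2)), should give the same position as match here.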

Upvotes: 0

jay.sf

Reputation: 72633

If you want to stick with lapply, you can do it like this:

df[] <- lapply(df, `-`, df$X1)
df
#            X1 X2 X3 X4 X5
# 2020-01-02  0  4  8 12 16
# 2020-01-03  0  4  8 12 16
# 2020-01-04  0  4  8 12 16
# 2020-01-05  0  4  8 12 16
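
As for why the loop in the question seems to change only X1: most likely the first iteration overwrites X1 with X1 - X1, that is zeros, so every later iteration subtracts zeros. If you prefer an explicit loop, a minimal sketch is to copy X1 before modifying anything. This assumes the df built by the question's code; x1 is just an illustrative name:

```r
# Rebuild the example data (column-wise fill, as the question's code produces)
df <- as.data.frame(matrix(seq(1, 20, 1), nrow = 4))
colnames(df) <- c("X1", "X2", "X3", "X4", "X5")

# Copy X1 as a plain vector before any column is overwritten
x1 <- df[["X1"]]

for (i in colnames(df)) {
  # [[ ]] extracts and assigns a single column as a vector
  df[[i]] <- df[[i]] - x1
}
df
```

Using [[ ]] instead of [ ] keeps both sides of the subtraction plain vectors, which avoids mixing a vector with a one-column data frame.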

Upvotes: 1

Duck

Reputation: 39595

Try this base R solution without a loop. Just keep the column positions in mind:

#Data
df <- as.data.frame(matrix(seq(1,20,1),nrow=4)) #matrix() fills column-wise by default
colnames(df) <- c("X1","X2","X3","X4","X5")
rownames(df) <- as.Date(c("2020-01-02","2020-01-03","2020-01-04","2020-01-05"))
#Set columns for difference
df[,2:5] <- df[,2:5]-df[,1]

Output:

           X1 X2 X3 X4 X5
2020-01-02  1  4  8 12 16
2020-01-03  2  4  8 12 16
2020-01-04  3  4  8 12 16
2020-01-05  4  4  8 12 16

Or, to avoid hard-coding the column positions, select the columns by name:

#Create indices
#Column to subtract
i1 <- which(names(df)=='X1')
#Columns to subtract X1 from
i2 <- which(names(df)!='X1')
#Compute
df[,i2] <- df[,i2]-df[,i1]

Output:

           X1 X2 X3 X4 X5
2020-01-02  1  4  8 12 16
2020-01-03  2  4  8 12 16
2020-01-04  3  4  8 12 16
2020-01-05  4  4  8 12 16
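
If this comes up often, the name-based version can be wrapped in a small function. This is only a sketch: subtract_column is a hypothetical helper name, and it assumes the reference column should be left unchanged, as in the output above.

```r
# Hypothetical helper: subtract the column named `ref` from all other columns
subtract_column <- function(data, ref) {
  # Fail early if the input is not a data frame or the column is missing
  stopifnot(is.data.frame(data), ref %in% names(data))
  others <- setdiff(names(data), ref)
  # The reference vector is recycled down each selected column
  data[others] <- data[others] - data[[ref]]
  data
}

# Usage with the example data
df <- as.data.frame(matrix(seq(1, 20, 1), nrow = 4))
colnames(df) <- c("X1", "X2", "X3", "X4", "X5")
res <- subtract_column(df, "X1")
res
```

Because the function returns a modified copy, the original df stays intact and the call can safely be repeated.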

Upvotes: 2
