zoe

Reputation: 311

Subtract even columns from odd ones in a data frame

I have a data frame (in reality it has 170 columns, i.e. 85 pairs, and ~8000 rows):

data <- data.frame(A = c(6,5,4,3), B = c(2,2,2,2), C = c(9,8,7,6), D = c(2,2,2,2))

I would like to subtract column 2 from column 1, column 4 from column 3, etc. for all rows.

I think I need to either try to write a function or use apply in some way.

Upvotes: 2

Views: 1023

Answers (5)

thelatemail

Reputation: 93813

R has vectorised operations to deal with this kind of task in a single call:

data[c(1,3)] - data[c(2,4)]
## or for every column until the end of the dataset
data[seq(1,ncol(data),2)] - data[seq(2,ncol(data),2)]
#  A C
#1 4 7
#2 3 6
#3 2 5
#4 1 4

See this previous discussion for lots of useful advice: "Selecting multiple odd or even columns/rows for dataframe".

You can extend this so the naming is done automatically:

s <- seq(1,ncol(data),2)
data[paste0(names(data[s]), "minus", names(data)[-s])] <- data[s] - data[-s]
data

#  A B C D AminusB CminusD
#1 6 2 9 2       4       7
#2 5 2 8 2       3       6
#3 4 2 7 2       2       5
#4 3 2 6 2       1       4

Upvotes: 6

Anders Ellern Bilgrau

Reputation: 10223

Many basic operations on data.frames are vectorized, meaning that addition, subtraction, multiplication, etc. are applied element-wise. For example, the following works:

data <- data.frame(A = c(6,5,4,3), B = c(2,2,2,2), C = c(9,8,7,6), D = c(2,2,2,2))

data$AminusB <- data$A - data$B
data$CminusD <- data$C - data$D

print(data)
#  A B C D AminusB CminusD
#1 6 2 9 2       4       7
#2 5 2 8 2       3       6
#3 4 2 7 2       2       5
#4 3 2 6 2       1       4

You can also access column 4, say, with data[4], data[,4] or data[,"D"], among others; see help("["). There are many ways to do this, depending on the output you want. With a simple for-loop you can iterate over the columns and compute all the differences.
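The for-loop mentioned above could look roughly like this. It is only a sketch, assuming the columns come in adjacent pairs (1 and 2, 3 and 4, ...) and that the "minus" naming from the example is wanted; `n_orig` and `new_name` are illustrative names.

```r
data <- data.frame(A = c(6,5,4,3), B = c(2,2,2,2), C = c(9,8,7,6), D = c(2,2,2,2))

n_orig <- ncol(data)                       # column count before new columns are added
for (i in seq(1, n_orig - 1, by = 2)) {    # i = 1, 3, 5, ... (first column of each pair)
  # build a name such as "AminusB" from the two column names
  new_name <- paste0(names(data)[i], "minus", names(data)[i + 1])
  # [[ ]] extracts each column as a vector, so this is a plain vector subtraction
  data[[new_name]] <- data[[i]] - data[[i + 1]]
}
```

With 85 pairs this appends 85 new columns to the original data frame; assigning into a separate data frame instead would keep the input untouched.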

Upvotes: 3

M--

Reputation: 28825

Just another approach using apply:

-t(apply(data, 1, diff))[ , seq(1, ncol(data)-1, by=2)]

#      B D
# [1,] 4 7
# [2,] 3 6
# [3,] 2 5
# [4,] 1 4

Upvotes: 2

DJV

Reputation: 4863

With 170 columns, specifying every column name would be daunting. If all of your columns are numeric, you can do this:

#Sample data
set.seed(123)
df <- data.frame(x = floor(rnorm(5, 10, 2)),
                 y = floor(rnorm(5, 30, 2)),
                 z = floor(rnorm(5, 50, 2)))
   x  y  z
1  8 33 52
2  9 30 50
3 13 27 50
4 10 28 50
5 10 29 48    

Subtracting each column from the one to its right (y - x and z - y):

df[-1] - df[-ncol(df)]

  y  z
1 25 19
2 21 20
3 14 23
4 18 22
5 19 19
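For the question's paired layout (column 1 minus column 2, column 3 minus column 4, ...), the same index-based idea would need to step by two rather than one. A minimal sketch, assuming an even number of all-numeric columns; `pair_diff` is a hypothetical helper name, not a function from any package:

```r
# Hypothetical helper: odd-numbered columns minus the even-numbered column to their right.
pair_diff <- function(d) {
  stopifnot(ncol(d) %% 2 == 0)       # assumption: columns come in complete pairs
  odd <- seq(1, ncol(d), by = 2)     # positions 1, 3, 5, ...
  d[odd] - d[odd + 1]                # column k minus column k + 1, for each odd k
}

data <- data.frame(A = c(6,5,4,3), B = c(2,2,2,2), C = c(9,8,7,6), D = c(2,2,2,2))
res <- pair_diff(data)
```

The result keeps the names of the odd-numbered columns (here A and C), because the subtraction takes its column names from the left-hand data frame.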

Upvotes: 2

IRTFM

Reputation: 263332

You can choose every other column with c(TRUE,FALSE) or its negation, because the logical index is recycled across all columns. Binary minus has a data frame method:

data[c(TRUE,FALSE)] - data[c(FALSE,TRUE)]
  A C
1 4 7
2 3 6
3 2 5
4 1 4

If you want to name them meaningfully, you can paste the names together (paste0 avoids the spaces that paste would insert by default):

paste0(names(data[c(TRUE,FALSE)]), "_minus_", names(data[c(FALSE,TRUE)]))
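Putting the two pieces together, the generated names can be assigned to the result of the subtraction. This is a sketch only; the variable names `odd`, `even` and `res` are illustrative, and `paste0` is used so that no spaces end up in the column names.

```r
data <- data.frame(A = c(6,5,4,3), B = c(2,2,2,2), C = c(9,8,7,6), D = c(2,2,2,2))

odd  <- c(TRUE, FALSE)   # recycled along the columns: keeps columns 1, 3, ...
even <- !odd             # c(FALSE, TRUE): keeps columns 2, 4, ...

res <- data[odd] - data[even]
# label each difference with the pair of columns it came from
names(res) <- paste0(names(data)[odd], "_minus_", names(data)[even])
```

Here `res` has the columns A_minus_B and C_minus_D, and the original `data` is left unchanged.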

Upvotes: 4
