chu-js

Reputation: 163

Using tapply across different permutations of columns

I have the following data frame, df:

ID1   ID2   ID3   Area
1     2     2     20
1     3     2     30
1     2     2     90
2     3     2     80
2     2     1     70
2     3     1     67
3     2     1     73

I hope to use:

tapply(df$Area, list(df$ID1, df$ID2), sum)
tapply(df$Area, list(df$ID1, df$ID3), sum)
tapply(df$Area, list(df$ID2, df$ID3), sum)

Is there a way to shorten this code? I have to repeat it over many different ID columns, so I would like to avoid writing out each call by hand.

Upvotes: 1

Views: 56

Answers (1)

Ronak Shah

Reputation: 389055

It looks like you want to apply tapply to every pair of "ID" columns. We can select the "ID" columns by name, use combn to generate every combination of two column names, and calculate the sum with tapply for each combination. With simplify = FALSE, the result is a list with one table per pair.

cols <- grep("^ID", names(df), value = TRUE)
combn(cols, 2, function(x) tapply(df$Area, df[x], sum), simplify = FALSE)


#[[1]]
#   ID2
#ID1   2   3
#  1 110  30
#  2  70 147
#  3  73  NA

#[[2]]
#   ID3
#ID1   1   2
#  1  NA 140
#  2 137  80
#  3  73  NA

#[[3]]
#   ID3
#ID2   1   2
#  2 143 110
#  3  67 110
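
Since the list returned above is unnamed, it can be hard to tell which table belongs to which pair of columns once there are many ID columns. A minimal self-contained sketch of one way to label them follows; the names pairs and res and the "ID1_ID2" naming scheme are illustrative assumptions, not part of the original answer.

```r
# Rebuild the example data from the question
df <- data.frame(
  ID1  = c(1, 1, 1, 2, 2, 2, 3),
  ID2  = c(2, 3, 2, 3, 2, 3, 2),
  ID3  = c(2, 2, 2, 2, 1, 1, 1),
  Area = c(20, 30, 90, 80, 70, 67, 73)
)

# Every pair of columns whose name starts with "ID"
cols  <- grep("^ID", names(df), value = TRUE)
pairs <- combn(cols, 2, simplify = FALSE)  # list of length-2 character vectors

# One table of summed Area per pair of ID columns
res <- lapply(pairs, function(x) tapply(df$Area, df[x], sum))

# Label each table by its column pair (hypothetical naming scheme)
names(res) <- vapply(pairs, paste, character(1), collapse = "_")

res[["ID1_ID2"]]
```

Each table can then be retrieved by name, for example res[["ID2_ID3"]], instead of by position.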

Upvotes: 3
