Aggregate data by multiple groups

Question

My questions is best described with an example:

data:

v1 <- c("a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p")
v2 <- c("aa","bb","cc","dd","ee","ff","gg","hh","ii","jj","kk","ll","mm","nn","oo","pp")
v3 <- c("aaa","bbb","ccc","ddd","eee","fff","ggg","hhh","iii","jjj","kkk","lll","mmm","nnn","ooo","ppp")
values <- 1:10000
y <- data.frame(var1=sample(v1,size = 100,replace=T),
                var2=sample(v2,size=100,replace=T),
                var3=sample(v3,size=100,replace=T),
                val1=sample(values,size=100,replace=T),
                val2=sample(values,size=100,replace=T))


head(y)
  var1 var2 var3 val1 val2
1    h   hh  lll 4832 9547
2    h   nn  eee  727 6382
3    h   bb  jjj 9985 8041
4    e   bb  jjj 1438  280
5    k   ff  nnn 3094 4316
6    f   cc  kkk 7426 3588

Basically I need to sum val1 and val2 by all unique combinations of var1,var2, and var3. So, none of the rows in the head statement above would get summed together.

In the actual data I'm working with, there are thousands of different values for var1, var2, and var3. Additionally, there are a varying number of columns I want to group by. This is going to be used in a shiny app where the user selects any number of groups that he/she wants to aggregate by. I've tried using stats::aggregate but it doesn't seem to be versatile enough for this.

If you need anything else to help answer my question let me know.

EDIT: Very helpful responses so far and thank you for that, but they aren't quite what I'm looking for (my fault, I didn't state the question clear enough).

The issue is automating my script so it can aggregate the example above as well as the following:

head(y)
  var1 var2 var3 var4 var5 val1 val2 val3
1    p   pp  nnn   ll  aaa 3914 6621 8968
2    d   ii  jjj   pp  aaa 5885  955 7560
3    j   bb  ppp   aa  ccc 7979 5602 4639
4    e   mm  ooo   mm  fff 8746 8612 2212
5    g   ee  aaa   kk  lll 4795 6438 6318
6    f   mm  kkk   ii  ggg 7550   90 1173

In my shiny app the data could just as easily look like this, or some other variation with any number of columns. I know the index values for the variable and value columns, but not their names. So I can use the index values, but they are changing. Also I can't really hard code in any columns since they are always changing.

I hope this helps!

Aggregate data by multiple groups

Answers (1)

Related Questions