Rafael
Rafael

Reputation: 617

Summing cells of some rows and columns

I have a large data frame where some rows have repeated values in some of their columns. I want to keep the repeated values and sum those which are different. Below there is a sample of my data:

data<-data.frame(season=c(2008,2009,2010,2011,2011,2012,2000,2001),
             lic=c(132228,140610,149215,158559,158559,944907,37667,45724),
             client=c(174,174,174,174,174,174,175,175),
             qtty=c(31,31,31,31,31,31,36,26),
             held=c(60,65,58,68,68,70,29,23),
             catch=c(7904,6761,9236,9323.2,801,NA,2330,3594.5),
             potlift=c(2715,2218,3000,3887,750,NA,2314,3472))

.

season  lic client  qtty    held    catch   potlift
2008    132228  174 31  60  7904    2715
2009    140610  174 31  65  6761    2218
2010    149215  174 31  58  9236    3000
2011    158559  174 31  68  9323.2  3887
2011    158559  174 31  68  801 750
2012    944907  174 31  70  NA  NA
2000    37667   175 36  29  2330    2314
2001    45724   175 26  23  3594.5  3472

Note that the season 2011 is repeated, each variable (client... held), except catch and potlift. I need to keep the values of (client... held) and sum catch and potlift; therefore my new data frame should be like the example below:

    season  lic client  qtty    held    catch   potlift
2008    132228  174 31  60  7904    2715
2009    140610  174 31  65  6761    2218
2010    149215  174 31  58  9236    3000
2011    158559  174 31  68  10124.2 4637
2012    944907  174 31  70  NA  NA
2000    37667   175 36  29  2330    2314
2001    45724   175 26  23  3594.5  3472

I have attempted to do so using aggregate, but this function sum everything. Any help will be appreciated.

Upvotes: 1

Views: 129

Answers (2)

A5C1D2H2I1M1N2O1R2T1
A5C1D2H2I1M1N2O1R2T1

Reputation: 193517

aggregate seems to work fine for me, but I'm not sure what you were trying:

> aggregate(cbind(catch, potlift) ~ ., data, sum, na.action = "na.pass")
  season    lic client qtty held   catch potlift
1   2001  45724    175   26   23  3594.5    3472
2   2000  37667    175   36   29  2330.0    2314
3   2010 149215    174   31   58  9236.0    3000
4   2008 132228    174   31   60  7904.0    2715
5   2009 140610    174   31   65  6761.0    2218
6   2011 158559    174   31   68 10124.2    4637
7   2012 944907    174   31   70      NA      NA

Here, use cbind to identify the columns that you want to aggregate by. You can then specify all the other columns, or just use . to indicate "use all other columns not mentioned in the cbind call.

Upvotes: 2

Thomas
Thomas

Reputation: 44525

data$catch <- with(data, ave(catch,list(lic,client,qtty,held),FUN=sum))
data$potlift <- with(data, ave(potlift,list(lic,client,qtty,held),FUN=sum))
unique(data)
  season    lic client qtty held   catch potlift
1   2008 132228    174   31   60  7904.0    2715
2   2009 140610    174   31   65  6761.0    2218
3   2010 149215    174   31   58  9236.0    3000
4   2011 158559    174   31   68 10124.2    4637
6   2012 944907    174   31   70      NA      NA
7   2000  37667    175   36   29  2330.0    2314
8   2001  45724    175   26   23  3594.5    3472

Upvotes: 2

Related Questions