Zslice
Zslice

Reputation: 412

Sum over a column while another column takes a particular value

Input:

a   3   hi
a   4   hi
a   NA  hi
b   7   lo
b   2   lo
b   3   lo
c   1   hi
c   6   hi

Desired output:

a   7   hi
b   12  lo
c   7   hi

Basically I would like to obtain the sum of the second column while column 1 takes on a unique value. I would also like to obtain the string in column 3 associated with each unique value in column 1.

Upvotes: 0

Views: 80

Answers (2)

Colonel Beauvel
Colonel Beauvel

Reputation: 31181

Using data.table:

> library(data.table)
> setDF(dat)
> dat[,list(sum(numbers, na.rm=T), unique(chars)), by=letters]
   letters V1 V2
1:       a  7 hi
2:       b 12 lo
3:       c  7 hi

Upvotes: 1

Frank P.
Frank P.

Reputation: 513

dat <- data.frame(letters = c('a', 'a', 'a', 'b', 'b', 'b', 'c', 'c'), numbers = c(3, 4, NA, 7, 2, 3, 1, 6), chars = c("hi", "hi", "hi", "lo", "lo", "lo", "hi", "hi"))

Using dplyr:

library(dplyr)

dat %>%
  group_by(letters, chars) %>%
  summarise(n = sum(numbers, na.rm = TRUE))

Source: local data frame [3 x 3]
Groups: letters

  letters chars  n
1       a    hi  7
2       b    lo 12
3       c    hi  7

Using plyr:

library(plyr)

ddply(dat, c("letters", "chars"), 'summarise', n = sum(numbers, na.rm = TRUE))

  letters chars summarise
1       a    hi         7
2       b    lo        12
3       c    hi         7

You basically want some variant of the split-apply-combine method.

Upvotes: 1

Related Questions