Reputation: 412
Input:
a 3 hi
a 4 hi
a NA hi
b 7 lo
b 2 lo
b 3 lo
c 1 hi
c 6 hi
Desired output:
a 7 hi
b 12 lo
c 7 hi
Basically I would like to obtain the sum of the second column while column 1 takes on a unique value. I would also like to obtain the string in column 3 associated with each unique value in column 1.
Upvotes: 0
Views: 80
Reputation: 31181
Using data.table
:
> library(data.table)
> setDF(dat)
> dat[,list(sum(numbers, na.rm=T), unique(chars)), by=letters]
letters V1 V2
1: a 7 hi
2: b 12 lo
3: c 7 hi
Upvotes: 1
Reputation: 513
dat <- data.frame(letters = c('a', 'a', 'a', 'b', 'b', 'b', 'c', 'c'), numbers = c(3, 4, NA, 7, 2, 3, 1, 6), chars = c("hi", "hi", "hi", "lo", "lo", "lo", "hi", "hi"))
Using dplyr
:
library(dplyr)
dat %>%
group_by(letters, chars) %>%
summarise(n = sum(numbers, na.rm = TRUE))
Source: local data frame [3 x 3]
Groups: letters
letters chars n
1 a hi 7
2 b lo 12
3 c hi 7
Using plyr
:
library(plyr)
ddply(dat, c("letters", "chars"), 'summarise', n = sum(numbers, na.rm = TRUE))
letters chars summarise
1 a hi 7
2 b lo 12
3 c hi 7
You basically want some variant of the split-apply-combine method.
Upvotes: 1