Reputation: 1707
Given data in a data.frame, I would like to aggregate some columns (each with an arbitrary function), grouping by some others, and either keep the remaining columns as they are or drop them. The idea is similar to GROUP BY in SQL. As an example, let us assume we have
df <- data.frame(a=rnorm(4), b=rnorm(4), c=c("A", "B", "C", "A"))
and I want to sum (say) the values in column a and average (say) the values in column b, grouping by the symbols in column c. I am aware this can be done with apply, cbind or similar, specifying the functions to use, but I was wondering whether there is a smarter (one-line) way to do so, especially using the aggregate function.
Upvotes: 0
Views: 641
Reputation: 57
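Like this?
# Apply a different function to each column, grouping by column 3 (c);
# returns a list with one aggregated data frame per column
mapply(function(x, y) aggregate(df[, x], by = list(df[, 3]), FUN = y),
       1:2, c("sum", "mean"), SIMPLIFY = FALSE)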
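If a single data frame is preferred over a list, one possible base R sketch is to call aggregate once per function through its formula interface and join the results with merge. The set.seed call and the names sum_a, mean_b and res are only there for illustration and are not part of the original post.

```r
# Reproducible version of the example data from the question
set.seed(1)
df <- data.frame(a = rnorm(4), b = rnorm(4), c = c("A", "B", "C", "A"))

# One aggregate() call per summary function, using the formula interface
sum_a  <- aggregate(a ~ c, data = df, FUN = sum)
mean_b <- aggregate(b ~ c, data = df, FUN = mean)

# merge() joins on the shared grouping column c: one row per group
res <- merge(sum_a, mean_b, by = "c")
res
```

Here column a of res holds the per-group sums and column b the per-group means. Columns that are neither grouped nor aggregated are dropped.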
Upvotes: 1
Reputation: 332
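Sorry, but I don't see how dealing with more than one column complicates things.
library(data.table)
# Convert the data.frame to a data.table
dt <- data.table(df)
# One summary expression per column, grouped by column c
dt[, .(sum_a = sum(a), mean_b = mean(b)), by = c]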
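The question also mentions keeping the remaining columns as they are. A minimal sketch of that variant, assuming data.table is installed, uses := to attach the grouped summaries to every row instead of collapsing the groups. The set.seed call is added here only to make the example reproducible.

```r
library(data.table)

# Reproducible version of the example data from the question
set.seed(1)
dt <- data.table(a = rnorm(4), b = rnorm(4), c = c("A", "B", "C", "A"))

# := adds the per-group summaries as new columns by reference,
# so all original rows and columns are kept
dt[, `:=`(sum_a = sum(a), mean_b = mean(b)), by = c]
dt
```

Each row keeps its original a, b and c values, and rows in the same group share the same sum_a and mean_b.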
Upvotes: 2