Reputation: 193
Let's say I have a list of data frames ldf:
df1 <- data.frame(date = c(1,2), value = c(4,5))
df2 <- data.frame(date = c(1,2), value = c(4,5))
ldf <- list(df1, df2)
What is the best way to get the sum (or the result of any other function) of value by date, i.e. a data frame like:
data.frame(date = c(1,2), value = c(8,10))
Upvotes: 1
Views: 169
Reputation: 193507
Another option is to use unnest from "tidyr" in conjunction with the typical grouping and aggregation functions from "dplyr":
library(dplyr)
library(tidyr)
unnest(ldf) %>%
group_by(date) %>%
summarise(value = sum(value))
# Source: local data frame [2 x 2]
#
# date value
# 1 1 8
# 2 2 10
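More recent tidyr releases expect unnest() to be given a data frame rather than a bare list, so the call above may not run on a current installation. A minimal sketch of the same idea, assuming a reasonably recent dplyr, stacks the list with bind_rows() instead (the name result is only for illustration):

```r
library(dplyr)

# Same example data as in the question
df1 <- data.frame(date = c(1, 2), value = c(4, 5))
df2 <- data.frame(date = c(1, 2), value = c(4, 5))
ldf <- list(df1, df2)

# bind_rows() stacks every data frame in the list into one data frame,
# which is then grouped and summarised as before
result <- bind_rows(ldf) %>%
  group_by(date) %>%
  summarise(value = sum(value))

print(result)
```

The grouping and summarising steps are unchanged; only the step that combines the list differs.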
Upvotes: 0
Reputation: 886938
You could use:
library(data.table)
dt1 <- rbindlist(ldf)
setkey(dt1,'date')
dt1[,list(value=sum(value)), by='date']
date value
1: 1 8
2: 2 10
Upvotes: 2
Reputation: 42629
If these rows were all in the same data frame, you would use aggregate to do the sum. You can combine them with rbind so they are in the same data frame:
aggregate(value ~ date, data=do.call(rbind, ldf), FUN=sum)
date value
1 1 8
2 2 10
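Since the question also asks about "any other function", here is a small sketch of the same aggregate call with the function swapped out. The variable names combined, sums and means are only for illustration:

```r
# Same example data as in the question
df1 <- data.frame(date = c(1, 2), value = c(4, 5))
df2 <- data.frame(date = c(1, 2), value = c(4, 5))
ldf <- list(df1, df2)

# Stack all data frames once, then reuse the result
combined <- do.call(rbind, ldf)

# FUN can be any function that reduces a vector to a single value
sums  <- aggregate(value ~ date, data = combined, FUN = sum)
means <- aggregate(value ~ date, data = combined, FUN = mean)

print(sums)
print(means)
```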
If the date columns in all the data frames are identical, you can easily use Reduce to do the sum:
Reduce(function(x, y) data.frame(date=x$date, value=x$value+y$value), ldf)
date value
1 1 8
2 2 10
This is likely to be faster than rbind-ing the data together and aggregating, but it only gives the right answer when every data frame has the same dates in the same row order.
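Because the Reduce approach adds values row by row, it may be worth checking that the date columns really match before relying on it. A rough sketch, assuming base R only (same_dates and result are hypothetical names), that falls back to aggregate when they do not:

```r
# Same example data as in the question
df1 <- data.frame(date = c(1, 2), value = c(4, 5))
df2 <- data.frame(date = c(1, 2), value = c(4, 5))
ldf <- list(df1, df2)

# TRUE only if every data frame has exactly the same date column as the first
same_dates <- all(vapply(
  ldf,
  function(d) identical(d$date, ldf[[1]]$date),
  logical(1)
))

result <- if (same_dates) {
  # Dates line up row by row, so the value columns can be added directly
  data.frame(
    date  = ldf[[1]]$date,
    value = Reduce(`+`, lapply(ldf, `[[`, "value"))
  )
} else {
  # Dates differ somewhere, so combine the rows and aggregate by date
  aggregate(value ~ date, data = do.call(rbind, ldf), FUN = sum)
}

print(result)
```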
Upvotes: 1