Reputation: 1527
I have the following table:
                 date status
1 2015-07-13 12:27:30      1
2 2015-07-22 14:36:09      1
3 2015-07-27 09:03:07      1
4 2015-07-27 17:06:04      1
5 2015-07-28 10:01:38      1
I want to count the number of occurrences per day:
        date status sum
1 2015-07-13      1   1
2 2015-07-22      1   1
3 2015-07-27      1   2
4 2015-07-28      1   1
Upvotes: 1
Views: 226
Reputation: 13580
Just for the sake of trying a base solution:
ave and aggregate
df1$sum <- ave(df1$status, as.Date(df1$date), FUN = "sum")
aggregate(df1[-1], list(as.Date(df1$date)), FUN=head, 1)
Output:
     Group.1 status sum
1 2015-07-13      1   1
2 2015-07-22      1   1
3 2015-07-27      1   2
4 2015-07-28      1   1
ave, then removing the duplicates after converting the date column to Date class
df1$sum <- ave(df1$status, as.Date(df1$date), FUN = "sum")
df1$date <- as.Date(df1$date)
df1[!duplicated(df1$date),]
Output:
        date status sum
1 2015-07-13      1   1
2 2015-07-22      1   1
3 2015-07-27      1   2
5 2015-07-28      1   1
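Both variants above take two steps (ave, then a de-duplication). As a rough sketch of a single-call base alternative, aggregate can count rows per day directly with FUN = length. Two things here are assumptions and not part of the original answer: the sample data is rebuilt with tz = "UTC" so the example is reproducible, and 'status' is used as a second grouping key instead of keeping its first value, which gives the same result only when 'status' is constant within a day.

```r
# Rebuild the sample data; tz = "UTC" is an assumption made for reproducibility
df1 <- data.frame(
  date = as.POSIXct(c("2015-07-13 12:27:30", "2015-07-22 14:36:09",
                      "2015-07-27 09:03:07", "2015-07-27 17:06:04",
                      "2015-07-28 10:01:38"), tz = "UTC"),
  status = rep(1L, 5)
)

# Count the rows in each (calendar day, status) group.
# The names used in `by` become the grouping column names of the result,
# and the name of the single data column becomes the count column.
res <- aggregate(
  data.frame(sum = df1$status),
  by = list(date = as.Date(df1$date, tz = "UTC"), status = df1$status),
  FUN = length
)
res
```

With this data the two rows on 2015-07-27 collapse into one row with a count of 2. If 'status' can differ within a day, this version returns one row per day and status value, which may or may not be what is wanted.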
Upvotes: 0
Reputation: 887851
Assuming that the 'date' column is of POSIXct class, we can use dplyr to aggregate by group. We group by 'date' after converting it to Date class, then use summarise to take the first observation of 'status' and create the 'sum' column as the number of rows (n()) in each group.
library(dplyr)
df2 <- df1 %>%
group_by(date=as.Date(date)) %>%
summarise(status= first(status), sum= n())
df2
# date status sum
#1 2015-07-13 1 1
#2 2015-07-22 1 1
#3 2015-07-27 1 2
#4 2015-07-28 1 1
We could also do this using data.table. We convert the 'data.frame' to a 'data.table' (setDT(df1)), group by the 'date' column after converting it to Date class, and select the first observation of 'status' along with the number of rows (.N) as the 'sum' column.
library(data.table)
setDT(df1)[, list(status = status[1L], sum = .N), by = .(date = as.Date(date))]
# date status sum
#1: 2015-07-13 1 1
#2: 2015-07-22 1 1
#3: 2015-07-27 1 2
#4: 2015-07-28 1 1
Data:
df1 <- structure(list(date = structure(c(1436804850, 1437590169,
    1438002187, 1438031164, 1438092098), class = c("POSIXct", "POSIXt"),
    tzone = ""), status = c(1L, 1L, 1L, 1L, 1L)),
    .Names = c("date", "status"),
    row.names = c("1", "2", "3", "4", "5"), class = "data.frame")
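Since the 'date' column above carries tzone = "" (the session's local time zone), it may be worth stating explicitly which time zone defines a "day" when converting to Date. The following is only a small sketch of that point: the timestamp and the "Asia/Tokyo" zone are made-up illustration values, not part of the original data, and it assumes the R installation has the usual time zone database available.

```r
# A hypothetical timestamp close to midnight, stored in UTC
x <- as.POSIXct("2015-07-27 23:30:00", tz = "UTC")

# The same instant falls on different calendar days depending on the
# time zone passed to as.Date()
day_utc   <- as.Date(x, tz = "UTC")
day_tokyo <- as.Date(x, tz = "Asia/Tokyo")  # UTC+9, so already the next day

day_utc
day_tokyo
```

Passing the same tz in the grouping expression, for example as.Date(date, tz = "UTC"), keeps the per-day counts independent of the machine the code runs on.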
Upvotes: 1