Reputation: 181
I have a table as follows:
dt <- data.frame(
  Date = c("2011-01-16", "2011-01-16", "2011-07-08", "2011-07-09", "2011-07-09",
           "2011-08-17", "2011-09-10", "2011-09-11", "2011-09-11"),
  Number = c(7, 7, NA, 1, 1, NA, 7, 5, 6),
  Hour = c(0.25, 0.25, NA, 0.6, 0.6, NA, 2, 0.25, 0.25)
)
Date Number Hour
1 2011-01-16 7 0.25
2 2011-01-16 7 0.25
3 2011-07-08 NA NA
4 2011-07-09 1 0.60
5 2011-07-09 1 0.60
6 2011-08-17 NA NA
7 2011-09-10 7 2.00
8 2011-09-11 5 0.25
9 2011-09-11 6 0.25
I want to sum Hour by Number and Date. The desired output looks like this:
Date "1" "5" "6" "7"
1 2011-01-16 NA NA NA 0.5
2 2011-07-08 NA NA NA NA
3 2011-07-09 1.2 NA NA NA
4 2011-08-17 NA NA NA NA
5 2011-09-10 NA NA NA 2.0
6 2011-09-11 NA 0.25 0.25 NA
Can you suggest a function that produces this output?
Upvotes: 1
Views: 943
Reputation: 887881
We can make use of the fun.aggregate argument of dcast:
library(data.table)
dcast(setDT(dt), Date + Hour ~ Number, sum)
If the OP wants NA where a Date/Number combination does not occur, add a condition, because sum returns 0 for a zero-length input (sum(integer(0)) is 0):
dcast(setDT(dt), Date + Hour ~ Number, function(x)
if(length(x) == 0) NA_real_ else sum(x, na.rm = TRUE))[,
.(Date, Hour, `1`, `5`, `6`, `7`)]
# Date Hour 1 5 6 7
#1: 2011-01-16 0.25 NA NA NA 0.5
#2: 2011-07-08 NA NA NA NA NA
#3: 2011-07-09 0.60 1.2 NA NA NA
#4: 2011-08-17 NA NA NA NA NA
#5: 2011-09-10 2.00 NA NA NA 2.0
#6: 2011-09-11 0.25 NA 0.25 0.25 NA
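To see why the extra condition is needed, here is a minimal base R sketch of how sum behaves on empty input. The helper name sum_or_na is hypothetical and only used for illustration; it wraps the same anonymous function passed to dcast above.

```r
# sum() of a zero-length vector is 0, not NA
sum(integer(0))
# 0

# Hypothetical helper (name chosen for illustration): return NA for an
# empty group, otherwise the sum with missing values ignored.
sum_or_na <- function(x) {
  if (length(x) == 0) NA_real_ else sum(x, na.rm = TRUE)
}

sum_or_na(numeric(0))    # NA  (no rows for this Date/Number cell)
sum_or_na(c(0.6, 0.6))   # 1.2 (two rows in the cell)
```

Without the length check, every Date/Number cell that has no rows would show 0 instead of NA.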
Upvotes: 2
Reputation: 389265
We can group_by Date and Number, sum Hour for each group, and use spread to change the result to wide format. However, this also produces an NA column (since Number has NA values), which can be removed if not needed.
library(dplyr)
dt %>%
  group_by(Date, Number) %>%
  summarise(Hour = sum(Hour, na.rm = TRUE)) %>%
  tidyr::spread(Number, Hour) %>%
  select(-`<NA>`)
# Date `1` `5` `6` `7`
# <fct> <dbl> <dbl> <dbl> <dbl>
#1 2011-01-16 NA NA NA 0.5
#2 2011-07-08 NA NA NA NA
#3 2011-07-09 1.2 NA NA NA
#4 2011-08-17 NA NA NA NA
#5 2011-09-10 NA NA NA 2
#6 2011-09-11 NA 0.25 0.25 NA
Upvotes: 1
Reputation: 1236
You can use the aggregate function for this.
dt$Date <- as.character(dt$Date)
aggregate(dt$Hour, by = list(dt$Number, dt$Date), FUN = function(x) sum(x, na.rm = TRUE))
Or, you can use this (without removing NA this time):
with(dt, aggregate(Hour, by = list(Number, Date), FUN = sum))
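Both aggregate calls return the sums in long format (one row per Number/Date pair), whereas the question asks for one column per Number. As a rough sketch of one way to get that layout in base R, the example below swaps in tapply, which is not part of the answer above. The names wide and result are chosen for illustration, and dt is rebuilt from the question so the block runs on its own.

```r
# Data from the question
dt <- data.frame(
  Date   = c("2011-01-16", "2011-01-16", "2011-07-08", "2011-07-09", "2011-07-09",
             "2011-08-17", "2011-09-10", "2011-09-11", "2011-09-11"),
  Number = c(7, 7, NA, 1, 1, NA, 7, 5, 6),
  Hour   = c(0.25, 0.25, NA, 0.6, 0.6, NA, 2, 0.25, 0.25)
)

# Sum Hour within each Date x Number cell. Rows whose Number is NA are
# dropped from the sums, but their dates still appear as rows; cells with
# no observations are filled with NA by tapply.
wide <- with(dt, tapply(Hour, list(Date, Number), sum))

# Convert the matrix to a data frame with Date as an ordinary column.
result <- data.frame(Date = rownames(wide), wide,
                     check.names = FALSE, row.names = NULL)
print(result)
```

This keeps the dates that only have NA in Number (2011-07-08 and 2011-08-17) as all-NA rows, which matches the layout requested in the question.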
Upvotes: 2