Reputation: 41
I have data with dates and hospital admissions. The data are daily and cover two years. They look something like this:
Date cardioadmission respiratoryadmission
2001-01-01 12 06
2001-01-02 10 5
2001-01-03 08 4
2001-01-04 04 6
I want to make a table of results like this:
year cvdadmissions respiratoryadmissions
2001 21 22
That is, I want to aggregate the dates by year, and then split each year into summer and winter. I also want to report the admissions by month rather than by day, using some sort of aggregation. Can someone please guide me on this?
Update:
summary <- data %>%
  mutate(month = month(Date), # what should I write in month and also in date?
         year = year(Date)) %>% # same here: what should I write in year and year(date)?
  group_by(month, year) %>% # which month, and which year?
  summarise(cvdadmission = sum(cvdadmission),
            respiratoryadmission = sum(respiratoryadmission)) # I have understood this part.
Can you please explain the logic behind these in a little more detail?
Thanks
Upvotes: 0
Views: 3123
Reputation: 5281
In base R you can use format to add a year column:
df$Year <- format(as.Date(df$Date), "%Y")
# Date cardioadmission respiratoryadmission Year
# 1 2001-01-01 12 6 2001
# 2 2001-01-02 10 5 2001
# 3 2001-01-03 8 4 2001
# 4 2001-01-04 4 6 2001
Then you can proceed with the analysis. Here is an alternative to the provided approaches, using vapply:
t(vapply(unique(df$Year), function(y) {
  i <- .subset2(df, ncol(df)) == y  # rows belonging to year y (Year is the last column)
  c(cardioadmission = sum(.subset2(df, 2L)[i]), respiratoryadmission = sum(.subset2(df, 3L)[i]))
}, numeric(2)))
# cardioadmission respiratoryadmission
# 2001 34 21
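Since the question also asks for monthly totals, the same format idea can be sketched per month. This is only a minimal sketch: the four sample rows are made up so that two months appear, and the names YearMonth and monthly are chosen for illustration.

```r
# Hypothetical sample spanning two months (values are made up for illustration)
df <- data.frame(
  Date = c("2001-01-01", "2001-01-02", "2001-02-01", "2001-02-02"),
  cardioadmission = c(12, 10, 8, 4),
  respiratoryadmission = c(6, 5, 4, 6)
)

# "%Y-%m" keeps year and month together, so January 2001 and
# January 2002 end up in different groups
df$YearMonth <- format(as.Date(df$Date), "%Y-%m")

# Sum both admission columns within each year-month group
monthly <- aggregate(cbind(cardioadmission, respiratoryadmission) ~ YearMonth,
                     data = df, FUN = sum)
print(monthly)
```

Grouping on the combined year-month string rather than on the month alone matters for two years of data, because otherwise the same calendar month from both years would be summed together.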
Data
df <- structure(list(Date = structure(1:4, .Label = c("2001-01-01",
"2001-01-02", "2001-01-03", "2001-01-04"), class = "factor"),
cardioadmission = c(12, 10, 8, 4), respiratoryadmission = c(6,
5, 4, 6)), class = "data.frame", row.names = c(NA, -4L))
Upvotes: 0
Reputation: 269586
Add a year/month or year column and aggregate by that:
library(zoo)
DFym <- transform(DF0, YearMon = as.yearmon(Date))[-1]
aggregate(. ~ YearMon, DFym, sum)
## YearMon cardioadmission respiratoryadmission
## 1 Jan 2001 34 21
DFy <- transform(DF0, Year = as.integer(as.yearmon(Date)))[-1]
aggregate(. ~ Year, DFy, sum)
## Year cardioadmission respiratoryadmission
## 1 2001 34 21
Another approach is to represent DF0 as a zoo time series:
library(zoo)
z <- read.zoo(DF0)
aggregate(z, as.yearmon, sum)
## cardioadmission respiratoryadmission
## Jan 2001 34 21
aggregate(z, function(x) as.integer(as.yearmon(x)), sum)
## cardioadmission respiratoryadmission
## 2001 34 21
Note: the input, in reproducible form, is assumed to be:
Lines <- "Date cardioadmission respiratoryadmission
2001-01-01 12 06
2001-01-02 10 5
2001-01-03 08 4
2001-01-04 04 6"
DF0 <- read.table(text = Lines, header = TRUE)
DF0$Date <- as.Date(DF0$Date)
Upvotes: 1
Reputation: 48
Here's a tidyverse solution:
library(dplyr)
library(lubridate)
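summary <- data %>%
  mutate(month = month(Date),  # month number (1-12) extracted from each date
         year = year(Date)) %>%  # calendar year extracted from each date
  group_by(year, month) %>%  # one group per year-month combination
  summarise(cvdadmission = sum(cardioadmission),
            respiratoryadmission = sum(respiratoryadmission))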
Upvotes: 0
Reputation: 3183
You can use dplyr and lubridate as shown below:
library(dplyr)
library(lubridate)
df %>%
  mutate(year = year(Date)) %>%
  group_by(year) %>%  # without this, the sums are taken over the whole data set
  summarise(cvdadmissions = sum(cardioadmission),
            respiratoryadmissions = sum(respiratoryadmission))
If you want to split into winter and summer, you can mutate another field, season, by extracting the month, and use that in group_by(year, season).
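A rough sketch of that season split might look like the following. The question does not say which months count as summer, so the April to September cut-off here is purely an assumption, and the sample rows are made up for illustration.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample rows (made up); Date is parsed so month() and year() work
df <- data.frame(
  Date = as.Date(c("2001-01-01", "2001-01-02", "2001-07-01", "2001-07-02")),
  cardioadmission = c(12, 10, 8, 4),
  respiratoryadmission = c(6, 5, 4, 6)
)

seasonal <- df %>%
  mutate(year = year(Date),
         # Assumption: April-September counts as summer, everything else as winter
         season = if_else(month(Date) %in% 4:9, "summer", "winter")) %>%
  group_by(year, season) %>%
  summarise(cvdadmissions = sum(cardioadmission),
            respiratoryadmissions = sum(respiratoryadmission),
            .groups = "drop")
print(seasonal)
```

Note that with this definition a winter running from October to March is split across two calendar years, so you may want a separate "winter year" field if you need each winter as a single group.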
Upvotes: 0