hasan arshad

Reputation: 41

How to convert daily values into monthly values in R

I have data with dates and hospital admissions. The data are daily and cover two years. They look somewhat like this:

Date        cardioadmission   respiratoryadmission
2001-01-01        12                   06
2001-01-02        10                   5
2001-01-03        08                   4
2001-01-04        04                   6

I want to make a table of results like this:

year    cvdadmissions   respiratoryadmissions

So I want to aggregate the dates by year and then divide each year into summer and winter. Let's say I want the results to look like this:

year         cvdadmissions   respiratoryadmissions
2001            21                 22

So I want to report the admissions by month, not by day, using some sort of aggregation. Can someone please guide me on this?

Update:

summary <- data %>%
  mutate(month = month(Date),      # what should I write in month and also in date?
         year = year(Date)) %>%    # same here: what should I write in year and year(date)?
  group_by(month, year) %>%        # which month, and by which year?
  summarise(cvdadmission = sum(cvdadmission),
            respiratoryadmission = sum(respiratoryadmission))  # I have understood this part.

Can you please explain the logic behind these in a little more detail.

Thanks

Upvotes: 0

Views: 3123

Answers (4)

niko

Reputation: 5281

In base R you can use format to add a year column:

df$Year <- format(as.Date(df$Date), "%Y")
#         Date cardioadmission respiratoryadmission Year
# 1 2001-01-01              12                    6 2001
# 2 2001-01-02              10                    5 2001
# 3 2001-01-03               8                    4 2001
# 4 2001-01-04               4                    6 2001

Then you can proceed with the analysis. Here is an alternative to the other approaches, using vapply:

t(vapply(unique(df$Year), function(y) {
  # logical index of the rows belonging to year y (Year is the last column)
  i <- .subset2(df, ncol(df)) == y
  # sum only the rows of that year
  c(cardioadmission = sum(.subset2(df, 2L)[i]), respiratoryadmission = sum(.subset2(df, 3L)[i]))
}, numeric(2)))
#      cardioadmission respiratoryadmission
# 2001              34                   21 
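
For monthly rather than yearly totals, the same format idea can be applied with a year-month key. The following is a minimal base R sketch on made-up data spanning two months; the data frame `admissions`, its values, and the column name `YearMonth` are hypothetical and not taken from the question.

```r
# Hypothetical daily data: two days in January, two in February
admissions <- data.frame(
  Date = as.Date(c("2001-01-01", "2001-01-02", "2001-02-01", "2001-02-02")),
  cardioadmission = c(12, 10, 8, 4),
  respiratoryadmission = c(6, 5, 4, 6)
)

# Build a "YYYY-MM" key so that the same month in different years stays separate
admissions$YearMonth <- format(admissions$Date, "%Y-%m")

# Sum both admission columns within each year-month
monthly <- aggregate(cbind(cardioadmission, respiratoryadmission) ~ YearMonth,
                     data = admissions, FUN = sum)
print(monthly)
```

Using "%Y-%m" instead of "%m" alone keeps January 2001 and January 2002 in separate rows when the data cover more than one year.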

Data

df <- structure(list(
  Date = structure(1:4, .Label = c("2001-01-01", "2001-01-02", "2001-01-03", "2001-01-04"), class = "factor"),
  cardioadmission = c(12, 10, 8, 4),
  respiratoryadmission = c(6, 5, 4, 6)
), class = "data.frame", row.names = c(NA, -4L))

Upvotes: 0

G. Grothendieck

Reputation: 269586

Add a year/month or year column and aggregate by that:

library(zoo)

DFym <- transform(DF0, YearMon = as.yearmon(Date))[-1]
aggregate(. ~ YearMon, DFym, sum)
##    YearMon  cardioadmission respiratoryadmission
## 1 Jan 2001               34                   21

DFy <- transform(DF0, Year = as.integer(as.yearmon(Date)))[-1]
aggregate(. ~ Year, DFy, sum)
##   Year  cardioadmission respiratoryadmission
## 1 2001               34                   21

Another approach is to represent DF0 as a zoo time series:

library(zoo)

z <- read.zoo(DF0)

aggregate(z, as.yearmon, sum)
##          cardioadmission respiratoryadmission
## Jan 2001              34                   21

aggregate(z, function(x) as.integer(as.yearmon(x)), sum)
##      cardioadmission respiratoryadmission
## 2001              34                   21

Note

The input DF0 used above, in reproducible form:

Lines <- "Date        cardioadmission   respiratoryadmission
2001-01-01        12                   06
2001-01-02        10                   5
2001-01-03        08                   4
2001-01-04        04                   6"
DF0 <- read.table(text = Lines, header = TRUE)
DF0$Date <- as.Date(DF0$Date)

Update

Fixed.

Upvotes: 1

Stephen Ewing

Reputation: 48

Here's a tidyverse solution:

library(dplyr)
library(lubridate)

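summary <- data %>%
    # derive month and year from the existing Date column
    mutate(month = month(Date),
           year = year(Date)) %>%
    # one group per month of each year
    group_by(month, year) %>%
    # the input column is called cardioadmission in the question's data
    summarise(cvdadmission = sum(cardioadmission),
              respiratoryadmission = sum(respiratoryadmission))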
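
To make the logic of each step concrete, here is a small sketch on made-up data spanning two months. The data frame `admissions` and its values are hypothetical. In short, `mutate()` adds columns derived from `Date`, `group_by()` marks which rows belong together, and `summarise()` collapses each group into one row.

```r
library(dplyr)
library(lubridate)

# Hypothetical daily data: two days in January, two in February
admissions <- data.frame(
  Date = as.Date(c("2001-01-01", "2001-01-02", "2001-02-01", "2001-02-02")),
  cardioadmission = c(12, 10, 8, 4),
  respiratoryadmission = c(6, 5, 4, 6)
)

monthly <- admissions %>%
  # month(Date) and year(Date) extract the month number and the year
  # from the Date column; "Date" is simply the column name in the data
  mutate(month = month(Date),
         year = year(Date)) %>%
  # rows sharing the same year and month form one group
  group_by(year, month) %>%
  # each group is collapsed to a single row holding the sums
  summarise(cardioadmission = sum(cardioadmission),
            respiratoryadmission = sum(respiratoryadmission),
            .groups = "drop")

print(monthly)
```

The names on the left of `=` inside `mutate()` and `summarise()` are new column names you choose freely; the names inside `month()`, `year()`, and `sum()` must match columns that already exist in the data.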

Upvotes: 0

Sonny

Reputation: 3183

You can use dplyr and lubridate as shown below:

library(dplyr)
library(lubridate)
df %>%
  mutate(year = year(Date)) %>%
  # group by year so that summarise returns one row per year
  group_by(year) %>%
  summarise(cvdadmissions = sum(cardioadmission),
            respiratoryadmissions = sum(respiratoryadmission))

If you want to split into winter and summer, you can mutate another field, season, by extracting the month, and then use it in group_by(year, season).
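
A minimal sketch of that season split, assuming April through September counts as summer and the remaining months as winter. This cutoff is an assumption made for illustration, so adjust it to your own definition. The data frame `admissions` and its values are made up.

```r
library(dplyr)
library(lubridate)

# Hypothetical daily data covering one winter month and one summer month
admissions <- data.frame(
  Date = as.Date(c("2001-01-01", "2001-01-02", "2001-07-01", "2001-07-02")),
  cardioadmission = c(12, 10, 8, 4),
  respiratoryadmission = c(6, 5, 4, 6)
)

seasonal <- admissions %>%
  mutate(year = year(Date),
         # Assumed definition: months 4 to 9 are summer, everything else is winter
         season = ifelse(month(Date) %in% 4:9, "summer", "winter")) %>%
  group_by(year, season) %>%
  summarise(cvdadmissions = sum(cardioadmission),
            respiratoryadmissions = sum(respiratoryadmission),
            .groups = "drop")

print(seasonal)
```

With calendar-year grouping, December and the following January land in different rows even though both are winter months under this definition.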

Upvotes: 0
