Ben10
Ben10

Reputation: 297

How to calculate the total amount depend on the date in r?

I am new to R and I have problems with calculating the amount of bill for each month. I have the dataframe as below:

dat <- data.frame(
  time = factor(c("Breakfast","Breakfast","Breakfast","Breakfast","Breakfast","Breakfast"), levels=c("Breakfast")), date=c("2020-01-20","2020-01-21","2020-01-22","2020-02-10","2020-02-11","2020-02-12"),
  total_bill = c(12.7557,14.8,17.23,15.7,16.9,13.2)
)

My goal is to calculate the amount spending on the Breakfast for each month so here we have two months and I want to get the total sum of January and February separately.

Any help for this would be much appreciated. Thank you!

Upvotes: 0

Views: 484

Answers (3)

Chris Ruehlemann
Chris Ruehlemann

Reputation: 21440

Does this answer your question?

sums <- tapply(dat$total_bill, format(as.Date(dat$date), "%B"), sum)
February  January 
 45.8000  44.7857 

sumsis a list: so if you want to access, for example, the datum for February, you can do this:

sums[1]
February 
    45.8

Alternatively, you can convert sums into a dataframe and access the monthly sums via the month names:

sums <-  as.data.frame.list(tapply(dat$total_bill, format(as.Date(dat$date), "%B"), sum))
sums$February 
    45.8

Addition:

Another (fun) solution is via regex: you define the dates as a pattern and, using sub plus backreference \\1 to recall the two numbers between the dashes, reduce them to the months part:

tapply(dat$total_bill, sub("\\d{4}-(\\d{2})-\\d{2}", "\\1", dat$date), sum)
     01      02 
44.7857 45.8000  

Upvotes: 2

akrun
akrun

Reputation: 887891

We can convert the 'date' to Date class, get the month, and use that as grouping column and sum the 'total_bill'

library(dplyr)
dat %>%
    group_by(time, Month = format(as.Date(date), "%B")) %>% 
    summarise(total_bill = sum(total_bill, na.rm = TRUE))
# A tibble: 2 x 3
# Groups:   time [1]
#  time      Month    total_bill
#  <fct>     <chr>         <dbl>
#1 Breakfast February       45.8
#2 Breakfast January        44.8

We can convert it to 'wide' format, if that is needed

library(tidyr)
out <- dat %>%
     group_by(time, Month = format(as.Date(date), "%B")) %>% 
     summarise(total_bill = sum(total_bill, na.rm = TRUE)) %>% 
     pivot_wider(names_from = Month, values_from = total_bill)

out
# A tibble: 1 x 3
# Groups:   time [1] 
#   time      February January
#  <fct>        <dbl>   <dbl>
# 1 Breakfast     45.8    44.8

If we also need to group by 'year'

out <- dat %>%
     mutate(date = as.Date(date)) %>%
     group_by(time, Year = format(date, "%Y"), Month = format(date, "%B")) %>% 
     summarise(total_bill = sum(total_bill, na.rm = TRUE)) 

Upvotes: 1

desval
desval

Reputation: 2435

library(dplyr)
d_sum <- dat %>% 
  group_by(substr(date, 0, 7)) %>%
  summarise(sum = sum(total_bill))

d_sum
# A tibble: 2 x 2
  `substr(date, 0, 7)`   sum
  <chr>                <dbl>
1 2020-01               44.8
2 2020-02               45.8

Upvotes: 0

Related Questions