Anthony Martin

Reputation: 787

How to calculate a mean on a data.table in R based on several conditions

I have data like the following:

library(lubridate)
library(dplyr)
library(data.table)
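# 90 days x 3 countries x 2 transport types = 540 rows;
# every column is built to that same length
MWE <- data.table(
  Date=rep(seq(ymd("2020-1-1"), ymd("2020-3-30"), by = "days"),each=6),
  Country=rep(c("France","United States","Germany"),times=90*2),
  TransportType=rep(c("Train","Cars"),each=3,times=90),
  Value=rnorm(90*6,2,3)
  )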

I want to create a new variable that is the mean of Value:

So the mean should be calculated on the January and February rows only, but it should be present in the table for the whole period, March included.

I have managed to do the first two (or I think so, I am still checking):

MWE_2 <- MWE %>%
  .[,JourSem:=weekdays(Date)] %>%
  .[,Moyenne:=mean(Value),by=.(Country,JourSem,TransportType)]

But I am unsure how to add another condition to that. I think I am close with this:

MWE_3 <- MWE %>%
  .[,JourSem:=weekdays(Date)] %>%
  .[Date <= "2020-02-29",Moyenne:=mean(Value),by=.(Country,JourSem,TransportType)]

But then the value is missing for the March dates. That is logical, since those rows are filtered out before the assignment, but it is not what I want.

Upvotes: 0

Views: 89

Answers (1)

Ronak Shah

Reputation: 388797

We can first calculate the mean over January and February for each weekday, and then join the result to the March data.

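library(data.table)

# Add the weekday name to every row (modifies MWE by reference)
MWE[, JourSem := weekdays(Date)]

# Mean of Value per weekday, computed on the January and February rows only
d1 <- MWE[Date <= as.Date("2020-02-29"), .(Moyenne = mean(Value)), by = JourSem]

# Attach those means to the March rows
MWE[Date > as.Date("2020-02-29")][d1, on = 'JourSem']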
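If the mean should also be split by Country and TransportType, and should appear on every row rather than only the March ones, an update join is one way to do it. The following is a minimal sketch, not a tested solution for the real data: it rebuilds the example table with 540 rows, and the names `keys`, `cutoff`, `ref` and `Moyenne2` as well as the `set.seed` call are assumptions added for illustration.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible

# Example table: 90 days x 3 countries x 2 transport types = 540 rows
MWE <- data.table(
  Date          = rep(seq(as.Date("2020-01-01"), as.Date("2020-03-30"), by = "days"), each = 6),
  Country       = rep(c("France", "United States", "Germany"), times = 90 * 2),
  TransportType = rep(c("Train", "Cars"), each = 3, times = 90),
  Value         = rnorm(90 * 6, 2, 3)
)
MWE[, JourSem := weekdays(Date)]

keys   <- c("Country", "JourSem", "TransportType")  # hypothetical helper
cutoff <- as.Date("2020-02-29")                     # last day of the reference period

# Reference means: January and February rows only, one row per group
ref <- MWE[Date <= cutoff, .(Moyenne = mean(Value)), by = keys]

# Update join: every row of MWE, March included, receives its group's mean
MWE[ref, on = keys, Moyenne := i.Moyenne]

# Alternative without a join: group over all rows, subset Value inside j
MWE[, Moyenne2 := mean(Value[Date <= cutoff]), by = keys]
```

Both variants leave MWE with all 540 rows. The update join keeps the reference table `ref` around for inspection, while the second form is shorter when the reference means are not needed separately.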

Upvotes: 1
