Sarkans Satans
Sarkans Satans

Reputation: 63

Add a column to a data frame based on a condition

Suppose I have 2 data frames:

library(tidyverse)
quaters <- tibble(
  quarter = c("Q1", "Q2","Q3", "Q4"),
  start_date = as.Date(c("2020-4-20", "2020-7-21", "2020-10-22", "2021-1-22")),
  end_date = as.Date(c("2020-7-20", "2020-10-21", "2021-1-22", "2021-4-22"))
)

test_data <- tibble(city =c("Paris", "London", "Rome"),
     date = as.Date(c("2020-6-25", "2020-9-12", "2020-10-11")))

I want to define a quarter for each date in "test_data" based on "quarters" and add it as a column to "test_data". Thus, Paris should be Q1, London - Q2, Rome - Q3. Can I do it just with mutate or smth like that? I can do it with a function and sapply but it takes a lot of time for R to calculate large datasets and I'm sure this is not the best solution (the quarter function doesn't help because I want to define start and end dates of a quarter manually). Thank you!

Upvotes: 1

Views: 41

Answers (1)

akrun
akrun

Reputation: 887821

We can use a non-equi join from data.table

library(data.table)
setDT(test_data)[as.data.table(quaters), 
    quarter := i.quarter,on = .(date >= start_date, date <= end_date)]

-output

test_data
#     city       date quarter    
#1:  Paris 2020-06-25      Q1
#2: London 2020-09-12      Q2
#3:   Rome 2020-10-11      Q2

Upvotes: 1

Related Questions