David Resch
David Resch

Reputation: 113

Merging two Dataframes based on Time column

I am trying to merge two dataframes based on Time column. They each have differently formatted values for time. They look the following way:

Example datasets for Labourproductivity and Depressiondframe:

Labourproductivity <- read.csv(text="
Time,LabourProductivity
2004 Q1,96.6
Q2,96.9
Q3,96.9
Q4,97.1
2005 Q1,97.6
Q2,99.0")
Depressiondframe <- read.csv(text="
Time,DepressionCount
2004.00,875
2004.25,820
2004.50,785
2004.75,857
2005.00,844")

Data:

Labourproductivity
     Time LabourProductivity
1 2004 Q1               96.6
2      Q2               96.9
3      Q3               96.9
4      Q4               97.1
5 2005 Q1               97.6
6      Q2               99.0
Depressiondframe
     Time DepressionCount
1 2004.00             875
2 2004.25             820
3 2004.50             785
4 2004.75             857
5 2005.00             844

How would I be able to merge the two dataframes based on Time?

Upvotes: 2

Views: 188

Answers (1)

David Arenburg
David Arenburg

Reputation: 92300

Here's a possible approach. We will use the zoo package which will easily convert Depressiondframe$Time to the desired format using zoo::as.yearqtr function, while the second one is trickier and we will use stringi::stri_extract_first_regex and zoo::na.locf in order to handle it. Then, just a simple merge will finish the task

library(zoo)
library(stringi)
Depressiondframe$Time <- as.character(as.yearqtr(Depressiondframe$Time))
Labourproductivity$Time <- with(Labourproductivity,
                                paste(na.locf(stri_extract_first_regex(Time, "\\d{4}")), 
                                stri_extract_first_regex(Time, "Q\\d")))

merge(Depressiondframe, Labourproductivity, by = "Time", all = TRUE)
#      Time DepressionCount LabourProductivity
# 1 2004 Q1             875               96.6
# 2 2004 Q2             820               96.9
# 3 2004 Q3             785               96.9
# 4 2004 Q4             857               97.1
# 5 2005 Q1             844               97.6
# 6 2005 Q2              NA               99.0

Upvotes: 1

Related Questions