Reputation: 23
I have two data frames, df1 and df2, that look as follows:
df1<- data.frame(year, week, X1, X2)
df1
year week X1 X2
1 2010 1 2 3
2 2010 2 8 6
3 2011 1 7 5
firm<-c("X1", "X1", "X2")
year <- c(2010,2010,2011)
week<- c(1, 2, 1)
cost<-c(10,30,20)
df2<- data.frame(firm,year, week, cost)
df2
firm year week cost
1 X1 2010 1 10
2 X1 2010 2 30
3 X2 2011 1 20
I'd like to merge these so the final result (i.e. df3) looks as follows:
df3
firm year week cost Y
1 X1 2010 1 10 2
2 X1 2010 2 30 8
3 X2 2011 1 20 5
Where "Y" is a new variable that reflects the values of X1 and X2 for a particular year and week found in df1. Is there a way to do this in R? Thank you in advance for your reply.
Upvotes: 2
Views: 90
Reputation: 101538
Here is a base R option (borrow data from @akrun, thanks!)
q <- startsWith(names(df1),"X")
v <- cbind(df1[!q],stack(df1[q]),row.names = NULL)
df3 <- merge(setNames(v,c(names(df1)[!q],"Y","firm")),df2)
which gives
> df3
year week firm Y cost
1 2010 1 X1 2 10
2 2010 2 X1 8 30
3 2011 1 X2 5 20
Upvotes: 0
Reputation: 887148
We can reshape the first dataset to 'long' format and then do a join with the second data
library(dplyr)
library(tidyr)
df1 %>%
pivot_longer(cols = X1:X2, values_to = 'Y', names_to = 'firm') %>%
right_join(df2)
-output
# A tibble: 3 x 5
# year week firm Y cost
# <dbl> <dbl> <chr> <int> <dbl>
#1 2010 1 X1 2 10
#2 2010 2 X1 8 30
#3 2011 1 X2 5 20
df1 <- structure(list(year = c(2010L, 2010L, 2011L), week = c(1L, 2L,
1L), X1 = c(2L, 8L, 7L), X2 = c(3L, 6L, 5L)), class = "data.frame",
row.names = c("1",
"2", "3"))
df2 <- structure(list(firm = c("X1", "X1", "X2"), year = c(2010, 2010,
2011), week = c(1, 2, 1), cost = c(10, 30, 20)), class = "data.frame",
row.names = c(NA,
-3L))
Upvotes: 0