Teo
Teo

Reputation: 23

Merge two data frames by row and column names and by group

I have two data frames, df1 and df2, that look as follows:

 df1<- data.frame(year, week, X1, X2)
 df1
  year week X1 X2
1 2010    1  2  3
2 2010    2  8  6
3 2011    1  7  5
 
 firm<-c("X1", "X1", "X2")
 year <- c(2010,2010,2011)
 week<- c(1, 2, 1)
 cost<-c(10,30,20)
 
 df2<- data.frame(firm,year, week, cost)
 df2
  firm year week cost
1   X1 2010    1   10
2   X1 2010    2   30
3   X2 2011    1   20

I'd like to merge these so the final result (i.e. df3) looks as follows:

df3 
  firm  year week cost Y 
1  X1   2010   1   10  2 
2  X1   2010   2   30  8 
3  X2   2011   1   20  5

Where "Y" is a new variable that reflects the values of X1 and X2 for a particular year and week found in df1. Is there a way to do this in R? Thank you in advance for your reply.

Upvotes: 2

Views: 90

Answers (2)

ThomasIsCoding
ThomasIsCoding

Reputation: 101538

Here is a base R option (borrow data from @akrun, thanks!)

q <- startsWith(names(df1),"X")
v <- cbind(df1[!q],stack(df1[q]),row.names = NULL)
df3 <- merge(setNames(v,c(names(df1)[!q],"Y","firm")),df2)

which gives

> df3
  year week firm Y cost
1 2010    1   X1 2   10
2 2010    2   X1 8   30
3 2011    1   X2 5   20

Upvotes: 0

akrun
akrun

Reputation: 887148

We can reshape the first dataset to 'long' format and then do a join with the second data

library(dplyr)
library(tidyr)
df1 %>% 
 pivot_longer(cols = X1:X2, values_to = 'Y', names_to = 'firm') %>% 
 right_join(df2)

-output

# A tibble: 3 x 5
#   year  week firm      Y  cost
#  <dbl> <dbl> <chr> <int> <dbl>
#1  2010     1 X1        2    10
#2  2010     2 X1        8    30
#3  2011     1 X2        5    20

data

df1 <- structure(list(year = c(2010L, 2010L, 2011L), week = c(1L, 2L, 
1L), X1 = c(2L, 8L, 7L), X2 = c(3L, 6L, 5L)), class = "data.frame", 
row.names = c("1", 
"2", "3"))

df2 <- structure(list(firm = c("X1", "X1", "X2"), year = c(2010, 2010, 
2011), week = c(1, 2, 1), cost = c(10, 30, 20)), class = "data.frame", 
row.names = c(NA, 
-3L))

Upvotes: 0

Related Questions