Reputation: 483
Suppose that we have two data frames as shown below:
df1 <- data.frame(Team1 = c("A","B","C"), Team2 = c("D","E","F"), Winner = c("A","E","F"))
df2 <- data.frame(Country = c("A","B","C","D","E","F"), Index = c(1,2,3,4,5,6))
What i want is create three columns in df2 as Team1_index, Team2_index, and Winner_index.
Team1 Team2 Winner Team1_index Team2_index Winner_index
A D A 1 4 1
B E E 2 5 5
C F F 3 6 6
I tried many ways but failed. Tips and advice!
Upvotes: 0
Views: 73
Reputation: 26333
Here is another option for you that uses match
and cbind
.
df3 <- as.matrix(df1)
colnames(df3) <- paste0(colnames(df3), "_index")
# match the positions
df3[] <- match(df3, df2$Country)
cbind(df1, df3)
# Team1 Team2 Winner Team1_index Team2_index Winner_index
#1 A D A 1 4 1
#2 B E E 2 5 5
#3 C F F 3 6 6
df3
is created as a matrix, i.e. a vector with dimensions attribute, such that we can replace its entries with the result of match
(a vector) right away and don't need to repeat the code for every column.
Or in one go
df1[paste0(colnames(df1), "_index")] <- match(as.matrix(df1), df2$Country)
Note however, that this ignores the index
column of df2
.
Thanks to @Moody_Mudskipper we could also write this more general as
df1[paste0(colnames(df1), "_index")] <- lapply(df1, function(x) df2$Index[match(x, df2$Country)])
Upvotes: 0
Reputation: 47300
To get new columns as factors :
df1[paste0(colnames(df1),"_index")] <- lapply(df1,factor,df2$Country,df2$Index)
# Team1 Team2 Winner Team1_index Team2_index Winner_index
# 1 A D A 1 4 1
# 2 B E E 2 5 5
# 3 C F F 3 6 6
To get new columns as numeric :
df1[paste0(colnames(df1),"_index")] <-
lapply(df1,function(x) as.numeric(as.character(factor(x,df2$Country,df2$Index))))
# Team1 Team2 Winner Team1_index Team2_index Winner_index
# 1 A D A 1 4 1
# 2 B E E 2 5 5
# 3 C F F 3 6 6
Note that for this specific case (index from 1 incremented by 1), this shorter version works:
df1[paste0(colnames(df1),"_index")] <-
lapply(df1,function(x) as.numeric(factor(x,df2$Country)))
Upvotes: 1
Reputation: 79188
You can use chartr
This will take into consideration both the country column and the index column:
df3=as.matrix(setNames(df1,paste0(names(df1),"_index")))
cbind(df1,chartr(paste0(df2$Country,collapse=""),paste0(df2$Index,collapse=""),df3))
Team1 Team2 Winner Team1_index Team2_index Winner_index
1 A D A 1 4 1
2 B E E 2 5 5
3 C F F 3 6 6
you can also do:
cbind(df1,do.call(chartr,c(as.list(sapply(unname(df2),paste,collapse="")),list(df3))))
Team1 Team2 Winner Team1_index Team2_index Winner_index
1 A D A 1 4 1
2 B E E 2 5 5
3 C F F 3 6 6
Upvotes: 0
Reputation: 764
This is basically the same as giocomai's answer, just uses purrr
to help eliminate duplication:
library(rlang)
library(dplyr)
getIndexCols <- function(df1, df2, colName){
idxColName <- sym(paste0(colName, "_Index"))
df1 %>% left_join(df2 %>% rename(!! sym(colName) := Country, !! idxColName := Index))
}
names(df1) %>% purrr::map(~ getIndexCols(df1, df2, .)) %>% reduce(~ left_join(.x, .y))
Upvotes: 0
Reputation: 5673
I have an almost solution with data.table, using melt
and dacst
to change shape
library(data.table)
df1 <- data.table(Team1 = c("A","B","C"), Team2 = c("D","E","F"), Winner = c("A","E","F"))
df2 <- data.table(Country = c("A","B","C","D","E","F"), Index = c(1,2,3,4,5,6))
melt(data = df1 , id.vars = )
plouf <- merge(df2,melt(df1,measure = 1:2), by.x = "Country", by.y = "value")
plouf[,winneridx := Index[Country == Winner]]
dcast(plouf,Country+winneridx~variable,value.var = "Index")
Country winneridx Team1 Team2
1: A 1 1 NA
2: B 5 2 NA
3: C 6 3 NA
4: D 1 NA 4
5: E 5 NA 5
6: F 6 NA 6
Upvotes: 0
Reputation: 3518
If you have more columns, you may look for more systematic solutions, but if it's really just three cases, this should do:
library("tidyverse")
df1 <- data.frame(Team1 = c("A","B","C"), Team2 = c("D","E","F"), Winner = c("A","E","F"))
df2 <- data.frame(Country = c("A","B","C","D","E","F"), Index = c(1,2,3,4,5,6))
df1 %>%
left_join(df2 %>% rename(Team1 = Country), by = "Team1") %>%
rename(Team1_Index = Index) %>%
left_join(df2 %>% rename(Team2 = Country), by = "Team2") %>%
rename(Team2_Index = Index) %>%
left_join(df2 %>% rename(Winner = Country), by = "Winner") %>%
rename(Winner_Index = Index)
#> Warning: Column `Team1` joining factors with different levels, coercing to
#> character vector
#> Warning: Column `Team2` joining factors with different levels, coercing to
#> character vector
#> Warning: Column `Winner` joining factors with different levels, coercing to
#> character vector
#> Team1 Team2 Winner Team1_Index Team2_Index Winner_Index
#> 1 A D A 1 4 1
#> 2 B E E 2 5 5
#> 3 C F F 3 6 6
You can safely ignore the warnings.
Upvotes: 1
Reputation: 1871
If you just have a small number of columns, you can use the match function as in the example:
df1$Team1_index <- df2$Index[match(df1$Team1, df2$Country)]
df1$Team2_index <- df2$Index[match(df1$Team2, df2$Country)]
df1$Winner_index <- df2$Index[match(df1$Winner, df2$Country)]
df1
Upvotes: 1