user18801785

Add column to dataframe based on information from another df2 in R

Let's say I have two data frames like this:

Model <- c("H5", "H5", "H5","H4","H3")
Code <- c("001001", "001002","001003","001004","001005")
City <-  c("Mexico", "London", "NY", "Otawa", "Liverpool")

df1 <- data.frame(Model, Code, City)


Model   Code       City
H5      001001     Mexico  
H5      001002     London
H5      001003     NY
H4      001004     Otawa
H3      001005     Liverpool

And

X <- c("030299", "010121","030448","030324","010245","001001", "001002","001003","001004","001005")
Y <- c("030344", "010222","030448","030001","010245","221001", "221044","221044","221004"," 001005")
Var1 <- c("H5", "H5", "H4","H4","H4","H5", "H5", "H5","H4","H3")
Var2 <- c("H4", "H2", "H4","H3","H4","H3", "H3", "H3","H3","H3")

  df2 <- data.frame(X,Y,Var1,Var2)

  X            Y     Var1   Var2
030299      030344    H5     H4
010121      010222    H5     H2
030448      030448    H4     H4
030324      030001    H4     H3
010245      010245    H4     H4
001001      221001    H5     H3
001002      221044    H5     H3
001003      221044    H5     H3
001004      221004    H4     H3
001005      001005    H3     H3

I want to code the following:

If I pass H3 as an argument to a function, I want it to take every value in the 'Code' column of df1, together with the corresponding value in the 'Model' column, and convert that code using the information in df2. For example, if we take the first row of df1 and set H3 as the argument:

  H5      001001    Mexico 

the function must take the corresponding row from df2:

   X            Y     Var1   Var2
 001001      221001    H5     H3

and give me the output like this:

   X            Y    VAR2  City   
 001001      221001   H3   Mexico   

The final output should be like this:

  X            Y     VAR2   City 

001001      221001    H3   Mexico  
001002      221044    H3   London  
001003      221044    H3   NY
001004      221004    H3   Otawa  
001005      221056    H3   Liverpool 

Upvotes: 1

Views: 82

Answers (4)

jpdugo17

Reputation: 7106

An alternative approach:

Data

library(tidyverse)

Model <- c("H5", "H5", "H5", "H4", "H3")
Code <- c("001001", "001002", "001003", "001004", "001005")
City <- c("Mexico", "London", "NY", "Otawa", "Liverpool")

df1 <- data.frame(Model, Code, City)

X <- c("030299", "010121", "030448", "030324", "010245", "001001", "001002", "001003", "001004", "001005")
Y <- c("030344", "010222", "030448", "030001", "010245", "221001", "221044", "221044", "221004", " 001005")
Var1 <- c("H5", "H5", "H4", "H4", "H4", "H5", "H5", "H5", "H4", "H3")
Var2 <- c("H4", "H2", "H4", "H3", "H4", "H3", "H3", "H3", "H3", "H3")

df2 <- data.frame(X, Y, Var1, Var2)

Function

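my_fun <- function(row, var2) {
  # Take one row of df1 by position
  df1_data <- df1 %>% slice(row)
  # Keep the df2 row with the requested Var2 and the matching code,
  # then attach the city (the new column is named `df1_data$City`)
  df2 %>%
    filter(Var2 == var2 & X == df1_data$Code) %>%
    mutate(df1_data$City)
}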

1:nrow(df1) %>%
  map_dfr(~ my_fun(.x, "H3"))
#>        X       Y Var1 Var2 df1_data$City
#> 1 001001  221001   H5   H3        Mexico
#> 2 001002  221044   H5   H3        London
#> 3 001003  221044   H5   H3            NY
#> 4 001004  221004   H4   H3         Otawa
#> 5 001005  001005   H3   H3     Liverpool

Created on 2022-04-14 by the reprex package (v2.0.1)

Upvotes: 0

TarJae

Reputation: 78947

Following your logic, I tried to create a custom function in base R. It takes three arguments: df1, df2, and x, where x is the index (or indices) of the df1 rows you want to process. So you can select all rows or just one, as you explained in your example.

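my_function <- function(df1, df2, x) {
  select_row <- df1[x, ]
  # Look up each selected code in df2$X explicitly, instead of relying on
  # the global X and on vector recycling
  idx <- match(select_row$Code, df2$X)
  cbind(df2[idx, c("X", "Y", "Var2")], select_row["City"])
}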

my_function(df1, df2, 1:5)
        X       Y Var2      City
6  001001  221001   H3    Mexico
7  001002  221044   H3    London
8  001003  221044   H3        NY
9  001004  221004   H3     Otawa
10 001005  001005   H3 Liverpool
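
The function above takes row indices but not the target model (H3 in the question). A minimal base R sketch that does take it, assuming the data from the question; the name convert_codes and its target parameter are hypothetical and only for illustration:

```r
# Data as given in the question (the leading space in the last Y value is kept)
df1 <- data.frame(
  Model = c("H5", "H5", "H5", "H4", "H3"),
  Code  = c("001001", "001002", "001003", "001004", "001005"),
  City  = c("Mexico", "London", "NY", "Otawa", "Liverpool"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  X    = c("030299", "010121", "030448", "030324", "010245",
           "001001", "001002", "001003", "001004", "001005"),
  Y    = c("030344", "010222", "030448", "030001", "010245",
           "221001", "221044", "221044", "221004", " 001005"),
  Var1 = c("H5", "H5", "H4", "H4", "H4", "H5", "H5", "H5", "H4", "H3"),
  Var2 = c("H4", "H2", "H4", "H3", "H4", "H3", "H3", "H3", "H3", "H3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the df2 rows whose Var2 equals `target`,
# then match them to df1 on (Var1, X) == (Model, Code)
convert_codes <- function(df1, df2, target) {
  hits <- df2[df2$Var2 == target, ]
  out <- merge(hits, df1, by.x = c("Var1", "X"), by.y = c("Model", "Code"))
  # merge() may reorder rows, so sort by code before selecting columns
  out[order(out$X), c("X", "Y", "Var2", "City")]
}

res <- convert_codes(df1, df2, "H3")
print(res)
```

Because merge() performs an inner join by default, df2 rows with Var2 == "H3" that have no counterpart in df1 (such as X = "030324") are dropped.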

Upvotes: 1

Wimpel

Reputation: 27742

Like this?

library(data.table)
setDT(df1);setDT(df2)
df2[df1, on = .(Var1 = Model, X = Code)]
#         X       Y Var1 Var2      City
# 1: 001001  221001   H5   H3    Mexico
# 2: 001002  221044   H5   H3    London
# 3: 001003  221044   H5   H3        NY
# 4: 001004  221004   H4   H3     Otawa
# 5: 001005  001005   H3   H3 Liverpool

Upvotes: 0

Julian

Reputation: 9260

Maybe something to begin with; this reproduces the result of your example.

library(dplyr)

df2 %>%
  left_join(df1, by = c("Var1" = "Model", "X" = "Code")) %>%
  filter(Var2 == "H3", !is.na(City)) %>%
  select(-Var1)

       X       Y Var2      City
1 001001  221001   H3    Mexico
2 001002  221044   H3    London
3 001003  221044   H3        NY
4 001004  221004   H3     Otawa
5 001005  001005   H3 Liverpool
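
Since the question asks for a function that takes the model as an argument, the pipeline above could be wrapped roughly as follows. This is only a sketch: the name lookup_codes and its target parameter are hypothetical, and inner_join() is swapped in for left_join() plus the is.na() filter, which gives the same rows for this data.

```r
library(dplyr)

# Data as given in the question (the leading space in the last Y value is kept)
df1 <- data.frame(
  Model = c("H5", "H5", "H5", "H4", "H3"),
  Code  = c("001001", "001002", "001003", "001004", "001005"),
  City  = c("Mexico", "London", "NY", "Otawa", "Liverpool"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  X    = c("030299", "010121", "030448", "030324", "010245",
           "001001", "001002", "001003", "001004", "001005"),
  Y    = c("030344", "010222", "030448", "030001", "010245",
           "221001", "221044", "221044", "221004", " 001005"),
  Var1 = c("H5", "H5", "H4", "H4", "H4", "H5", "H5", "H5", "H4", "H3"),
  Var2 = c("H4", "H2", "H4", "H3", "H4", "H3", "H3", "H3", "H3", "H3"),
  stringsAsFactors = FALSE
)

# Hypothetical wrapper: `target` is the model to convert to (e.g. "H3")
lookup_codes <- function(df1, df2, target) {
  df2 %>%
    filter(Var2 == target) %>%                                   # rows for the target model
    inner_join(df1, by = c("Var1" = "Model", "X" = "Code")) %>%  # keep only codes present in df1
    select(X, Y, Var2, City)
}

res <- lookup_codes(df1, df2, "H3")
print(res)
```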

Upvotes: 2
