Kaizen
Kaizen

Reputation: 151

How to match column value in a dataframe based on condition in another dataframe?

I have two dataframes with different number of rows, simple example below:

df= data.frame(Xmin=c(10,15),Xmax=c(20,20),Ymin=c(10,20),Ymax=c(20,25),ID=c(1,2))
df
  Xmin Xmax Ymin Ymax ID
1   10   20   10   20  1
2   15   20   20   25  2
df2=data.frame(Xmin=c(13,15,17),Xmax=c(17,17,19),Ymin=c(12,21,20),Ymax=c(18,25,22),ID=c(NA,NA,NA))

df2
  Xmin Xmax   Ymin Ymax ID
1   13   17     12   18 NA
2   15   17     21   25 NA
3   17   19     20   22 NA

I want to replace the column ID of df2 with the corresponding value in ID column of df for the rows matching the condition.

df2$Xmin >= df$Xmin & df2$Xmax<=df$Xmax & df2$Ymin >= df$Ymin & df2$Ymax<=df$Ymax

The desired output is

  Xmin Xmax   Ymin Ymax ID
1   13   17     12   18  1
2   15   17     21   25  2
3   17   19     20   22  2

Is there a simple way to achieve this?

Upvotes: 1

Views: 1579

Answers (3)

akrun
akrun

Reputation: 886938

We may use a non-equi join here

library(data.table)
df2$ID <- NULL
setDT(df2)[df, ID := i.ID, 
 on = .(Xmin >= Xmin, Xmax <= Xmax, Ymin >= Ymin, Ymax <= Ymax)]

-output

> df2
    Xmin  Xmax  Ymin  Ymax    ID
   <num> <num> <num> <num> <num>
1:    13    17    12    18     1
2:    15    17    21    25     2
3:    17    19    20    22     2

Upvotes: 1

PaulS
PaulS

Reputation: 25313

Another possible solution, tidyverse-based:

library(tidyverse)

df2 %>% 
  rowwise() %>% 
  mutate(ID = df[Xmax <= df$Xmax & Xmin >= df$Xmin & Ymax <= df$Ymax & Ymin >= df$Ymin,"ID"][1]) %>% 
  ungroup

#> # A tibble: 3 x 5
#>    Xmin  Xmax  Ymin  Ymax    ID
#>   <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1    13    17    12    18     1
#> 2    15    17    21    25     2
#> 3    17    19    20    22     2

Upvotes: 2

Grzegorz Sapijaszko
Grzegorz Sapijaszko

Reputation: 3604

Something like with ifelse():

df= data.frame(Xmin=c(10,20),Xmax=c(20,20),Ymin=c(10,20),Ymax=c(20,25),ID=c(1,2))
df2=data.frame(Xmin=c(13,15,17),Xmax=c(17,17,19),Ymin=c(12,21,20),Ymax=c(18,25,22),ID=c(NA,NA,NA))

df2$ID = ifelse(df2$Xmin >= df$Xmin & df2$Xmax<=df$Xmax & df2$Ymin >= df$Ymin & df2$Ymax<=df$Ymax, df$ID, NA)

df2
#>   Xmin Xmax Ymin Ymax ID
#> 1   13   17   12   18  1
#> 2   15   17   21   25 NA
#> 3   17   19   20   22 NA

Created on 2022-03-07 by the reprex package (v2.0.1)

Upvotes: 0

Related Questions