Kuo-Hsien Chang
Kuo-Hsien Chang

Reputation: 935

extract rows from one dataframe to join another dataframe by finding closest value in certain columns in both dataframes

time_1 <- sort(sample(seq(from = 1, to = 800, by = 1), size = 100, replace = FALSE))
time_2 <- sort(sample(seq(from = 1, to = 800, by = 1), size = 500, replace = FALSE))
Var1 <- runif(100)
Var2 <- runif(500)
Var3 <- runif(500)

df1 <- data.frame(time_1,Var1)
df2 <- data.frame(time_2,Var2,Var3)

The column Var2 and Var3 in df2 need to be join df1.

The join is based on finding the closest value in the column time_1 and time_2 in df1 and df2, respectively.

How to complete this task in R? Any useful libraries or functions I can use?

Thanks!

Upvotes: 1

Views: 58

Answers (1)

MKR
MKR

Reputation: 20095

May be roll="nearest" can help you.

library(data.table)
setDT(df1)
setDT(df2)

df2[df1,on = .(time_2 = time_1), roll = "nearest"]

#Result (may vary since seed is not used)
     time_2  Var1   Var3  Var2
  1:      1 0.981 0.5264 0.423
  2:     10 0.607 0.4886 0.662
  3:     11 0.740 0.0809 0.857
  4:     18 0.647 0.9108 0.376
  5:     19 0.647 0.9108 0.712
 ---                          
496:    686 0.405 0.4537 0.248
497:    710 0.629 0.8922 0.376
498:    721 0.056 0.2882 0.280
499:    759 0.976 0.0520 0.806
500:    796 0.850 0.8068 0.980

Upvotes: 3

Related Questions