Reputation: 11
I a working on a little R project.
Given two data frames with different length:
df1 = data.frame(Plane.Id = c(19924519, 19924321, 19992436, 19924119, 19924208, 19924330),
Block.ID = c(090LC, 090LC, 001UG, 002LM, 001OI, 001UG),
Hour1 = c(0.02222222, 0.02222222, 15.07222, 15.44444, 6.652778, 3.286111))
df2 = data.frame(Block.Id = c(090LC, 001UG, 001UG, 002LM, 001OI),
Sector.ID = c(BIRDFIS, UKOVS, LLLLALL, EBBUEHS, LEBLDDN),
Hour_In = c(0.000000, 0.000000, 13.000000, 0.000000, 0.000000),
Hour_Out = c(23.50000, 13.000000, 23.50000, 23.50000, 23.50000))
Different Sector.ID are assigned to the same Block.ID depending on the hour of the day.
Is it possible to merge them into a single dataframe following the next conditions?:
What I am looking for is a dataframe with the lenght of df1 that contains the data Plane.ID, Block.ID and Sector.ID. Something like this (I do not know how to build a table in here, so I uploaded an image with the table):
df_final
I have tried with rbind, left_join, merge, cbind and nothing good came out. I even tried to do this with a loop but not a good idea.
Upvotes: 0
Views: 147
Reputation: 30474
Here is an alternative solution using data.table
:
library(data.table)
setDT(df1)
setDT(df2)
df1[df2, on = .(Block.ID, Hour1 >= Hour_In, Hour1 <= Hour_Out), .(Plane.Id, Block.ID, Sector.ID)]
Output
Plane.Id Block.ID Sector.ID
1: 19924519 090LC BIRDFIS
2: 19924321 090LC BIRDFIS
3: 19924330 001UG UKOVS
4: 19992436 001UG LLLLALL
5: 19924119 002LM EBBUEHS
6: 19924208 001OI LEBLDDN
Upvotes: 0
Reputation: 6446
How about inner joining on the "Block_id" and filtering by the "Hour1" using dplyr
?
df1 =
data.frame(
Plane.Id = c(19924519, 19924321, 19992436, 19924119, 19924208, 19924330),
Block.ID = c("090LC", "090LC", "001UG", "002LM", "001OI", "001UG"),
Hour1 = c(0.02222222, 0.02222222, 15.07222, 15.44444, 6.652778, 3.286111)
)
df2 = data.frame(
Block.ID = c("090LC", "001UG", "001UG", "002LM", "001OI"),
Sector.ID = c("BIRDFIS", "UKOVS", "LLLLALL", "EBBUEHS", "LEBLDDN"),
Hour_In = c(0.000000, 0.000000, 13.000000, 0.000000, 0.000000),
Hour_Out = c(23.50000, 13.000000, 23.50000, 23.50000, 23.50000)
)
dplyr::inner_join(df1, df2, by="Block.ID") %>%
dplyr::filter(Hour1 > Hour_In & Hour1 < Hour_Out)
Upvotes: 0