user555265

Reputation: 493

R merge based on condition other than equality

I have a dataframe that looks something like:

date            minutes_since_midnight   value
2015-01-01      50                       2
2015-01-01      60                       1.5
2015-01-02      45                       3.3
2015-01-03      99                       5.5

and another data frame that looks something like this:

date        minutes_since_midnight   other_value
2015-01-01  55                       12
2015-01-01  80                       33
2015-01-02  45                       88

What I want to do is add another column to the first data frame: a boolean indicating whether the second data frame contains a row with the same date and a minutes_since_midnight that is less than or equal to the minutes_since_midnight in the first data frame. For the example data above we'd get:

date        minutes_since_midnight    value  has_other_value
2015-01-01  50                        2      False
2015-01-01  60                        1.5    True
2015-01-02  45                        3.3    True
2015-01-03  99                        5.5    False

How can I do this?

Hope this makes sense,

Thanks in advance

Upvotes: 8

Views: 6869

Answers (2)

Sam Firke

Reputation: 23034

I would probably join the data.frames along the lines of the other answer, then create the variable and drop unneeded columns. But here's an option using the dplyr package to perform the steps as you describe them:

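library(dplyr)
# For each date in df2, find the earliest minutes_since_midnight, join that
# onto df1 by date, and test whether it is at or before df1's own time.
df1$has_other_value <-
  left_join(df1, df2 %>%
              group_by(date) %>%
              summarise(minMins = min(minutes_since_midnight)),
            by="date")$minMins <= df1$minutes_since_midnight

# Dates with no row in df2 come back as NA from the join; treat them as FALSE
df1$has_other_value[is.na(df1$has_other_value)] <- FALSE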

Result:

        date minutes_since_midnight value has_other_value
1 2015-01-01                     50   2.0           FALSE
2 2015-01-01                     60   1.5            TRUE
3 2015-01-02                     45   3.3            TRUE
4 2015-01-03                     99   5.5           FALSE
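
The reason a per-date minimum is enough: a df2 row with a time at or before a given df1 time exists exactly when the earliest df2 time for that date is at or before it. For anyone who wants to try the same idea without dplyr, here is a rough base R sketch. It is not the code above, just an equivalent under the assumption that the data frames are built from the question's sample data as shown; `earliest`, `idx` and `first_time` are names made up for illustration.

```r
# Sample data copied from the question
df1 <- data.frame(
  date = c("2015-01-01", "2015-01-01", "2015-01-02", "2015-01-03"),
  minutes_since_midnight = c(50, 60, 45, 99),
  value = c(2, 1.5, 3.3, 5.5),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  date = c("2015-01-01", "2015-01-01", "2015-01-02"),
  minutes_since_midnight = c(55, 80, 45),
  other_value = c(12, 33, 88),
  stringsAsFactors = FALSE
)

# Earliest df2 time per date (one row per date)
earliest <- aggregate(minutes_since_midnight ~ date, data = df2, FUN = min)

# Look up that earliest time for each df1 row; dates missing from df2 give NA
idx <- match(df1$date, earliest$date)
first_time <- earliest$minutes_since_midnight[idx]

# TRUE when a df2 row exists for the date at or before the df1 time
df1$has_other_value <- !is.na(first_time) &
  first_time <= df1$minutes_since_midnight

print(df1)
```

Checking `!is.na()` first means there is no separate step to overwrite NA with FALSE afterwards.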

Upvotes: 5

figurine

Reputation: 756

Could you rename the minutes_since_midnight variables to minutes_since_midnight1 and minutes_since_midnight2, merge the two data frames on date, and then create the required has_other_value variable with an if-else statement?
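
A minimal sketch of how that could look in base R, assuming the sample data from the question. The `row_id` and `hit` columns and the `merged` and `has_hit` objects are helpers invented for this example. Because a date can appear several times in the second data frame, the merge produces several rows per original row, so the sketch collapses them back with `any()` rather than a plain if-else.

```r
# Sample data copied from the question
df1 <- data.frame(
  date = c("2015-01-01", "2015-01-01", "2015-01-02", "2015-01-03"),
  minutes_since_midnight = c(50, 60, 45, 99),
  value = c(2, 1.5, 3.3, 5.5),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  date = c("2015-01-01", "2015-01-01", "2015-01-02"),
  minutes_since_midnight = c(55, 80, 45),
  other_value = c(12, 33, 88),
  stringsAsFactors = FALSE
)

# Tag each df1 row so the merged rows can be collapsed back afterwards
df1$row_id <- seq_len(nrow(df1))

# Rename the time columns so they stay distinguishable after the merge
names(df1)[names(df1) == "minutes_since_midnight"] <- "minutes_since_midnight1"
names(df2)[names(df2) == "minutes_since_midnight"] <- "minutes_since_midnight2"

# Left join on date: each df1 row is paired with every df2 row for that date
merged <- merge(df1, df2, by = "date", all.x = TRUE)

# A pairing is a hit when the df2 time is at or before the df1 time;
# rows with no df2 match have NA in minutes_since_midnight2 and count as FALSE
merged$hit <- !is.na(merged$minutes_since_midnight2) &
  merged$minutes_since_midnight2 <= merged$minutes_since_midnight1

# A df1 row gets TRUE if any of its pairings is a hit
has_hit <- tapply(merged$hit, merged$row_id, any)
df1$has_other_value <- as.vector(has_hit[as.character(df1$row_id)])

# Restore the original column name and drop the helper column
names(df1)[names(df1) == "minutes_since_midnight1"] <- "minutes_since_midnight"
df1$row_id <- NULL

print(df1)
```

Note that this renames the column in df2 in place; work on a copy if df2 is needed unchanged later.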

Upvotes: 2
