Reputation: 72
I have two DataFrames: one holding my data and another to compare it against. I want to check whether a value falls within the range defined by two columns of the second DataFrame, for example:
Df_player
+--------+-------+
| Baller | Power |
+--------+-------+
| John | 1.5 |
| Bilbo | 3.7 |
| Frodo | 6 |
+--------+-------+
Df_Check
+--------+--------+--------+
| First | Second | Value |
+--------+--------+--------+
| 1 | 1.5 | Bad- |
| 1.5 | 3 | Bad |
| 3 | 4.2 | Good |
| 4.2 | 6 | Good+ |
+--------+--------+--------+
The expected result would be:
Df_out
+--------+-------+--------+
| Baller | Power | Value |
+--------+-------+--------+
| John | 1.5 | Bad- |
| Bilbo | 3.7 | Good |
| Frodo | 6 | Good+ |
+--------+-------+--------+
Upvotes: 0
Views: 1561
Reputation: 42392
You can join on a range condition. Note that .between is not appropriate here because it is inclusive on both ends, whereas you need a strict inequality on one side so that a boundary value such as 1.5 matches only one row:
// Match each player to the row where First < Power <= Second.
val result = df_player.join(
  df_check,
  df_player("Power") > df_check("First") && df_player("Power") <= df_check("Second"),
  "left"
).select("Baller", "Power", "Value")

result.show
+------+-----+-----+
|Baller|Power|Value|
+------+-----+-----+
| John| 1.5| Bad-|
| Bilbo| 3.7| Good|
| Frodo| 6.0|Good+|
+------+-----+-----+
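To see why the strict inequality matters, here is a minimal sketch that rebuilds the question's data and runs the same join both ways. It assumes Spark is on the classpath and that the code runs as a script or in a REPL (top-level vals); the local session settings, the app name, and the names halfOpen and inclusive are my own choices for illustration, not part of the original code.

```scala
import org.apache.spark.sql.SparkSession

// Local session for experimentation only; master and app name are assumptions.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("range-join-sketch")
  .getOrCreate()

// Data copied from the tables in the question.
val df_player = spark.createDataFrame(Seq(
  ("John", 1.5), ("Bilbo", 3.7), ("Frodo", 6.0)
)).toDF("Baller", "Power")

val df_check = spark.createDataFrame(Seq(
  (1.0, 1.5, "Bad-"),
  (1.5, 3.0, "Bad"),
  (3.0, 4.2, "Good"),
  (4.2, 6.0, "Good+")
)).toDF("First", "Second", "Value")

// Half-open range (First, Second]: a boundary value matches only one row.
val halfOpen = df_player.join(
  df_check,
  df_player("Power") > df_check("First") && df_player("Power") <= df_check("Second"),
  "left"
).select("Baller", "Power", "Value")

// .between means First <= Power <= Second, so it is inclusive on both ends.
val inclusive = df_player.join(
  df_check,
  df_player("Power").between(df_check("First"), df_check("Second")),
  "left"
).select("Baller", "Power", "Value")

halfOpen.show()
inclusive.show()
```

With this data, halfOpen has one row per player, while inclusive has two rows for John, because 1.5 is both the upper bound of the Bad- range and the lower bound of the Bad range. Because the join is a left join, a Power value that falls outside every range (for example 1.0 or lower here) would still appear, with a null Value.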
Upvotes: 4