Luniz

Reputation: 72

Check if a value is between two columns, spark scala

I have two dataframes: one with my data and another to compare against. I want to check whether a value falls in the range defined by two different columns, for example:

Df_player
    +--------+-------+
    | Baller | Power |
    +--------+-------+
    | John   |   1.5 |
    | Bilbo  |   3.7 |
    | Frodo  |   6   |
    +--------+-------+

Df_Check
    +--------+--------+--------+
    | First  | Second | Value  |
    +--------+--------+--------+
    |   1    |   1.5  |  Bad-  |
    |   1.5  |   3    |  Bad   |
    |   3    |   4.2  |  Good  |
    |   4.2  |   6    |  Good+ |
    +--------+--------+--------+

The result would be:

Df_out
    +--------+-------+--------+
    | Baller | Power | Value  |
    +--------+-------+--------+
    | John   |   1.5 |  Bad-  |
    | Bilbo  |   3.7 |  Good  |
    | Frodo  |   6   |  Good+ |
    +--------+-------+--------+

Upvotes: 0

Views: 1561

Answers (1)

mck

Reputation: 42392

You can do a join with a range condition. Note that `.between` is not appropriate here, because it is inclusive on both bounds, while you want a strict inequality in one of the comparisons:

    // half-open match: First < Power <= Second
    val result = df_player.join(
        df_check, 
        df_player("Power") > df_check("First") && df_player("Power") <= df_check("Second"), 
        "left"
    ).select("Baller", "Power", "Value")

result.show
+------+-----+-----+
|Baller|Power|Value|
+------+-----+-----+
|  John|  1.5| Bad-|
| Bilbo|  3.7| Good|
| Frodo|  6.0|Good+|
+------+-----+-----+
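The boundary handling is what makes this work: with `First < Power <= Second`, a value that equals an upper bound (like John's 1.5) matches the lower interval, as the expected output requires. The same half-open lookup can be sketched in plain Scala, independent of Spark (illustrative only; the `Bucket` case class and `lookup` helper are assumptions, not part of the answer):

```scala
// Half-open interval lookup mirroring the join condition: first < power <= second
case class Bucket(first: Double, second: Double, value: String)

val buckets = Seq(
  Bucket(1.0, 1.5, "Bad-"),
  Bucket(1.5, 3.0, "Bad"),
  Bucket(3.0, 4.2, "Good"),
  Bucket(4.2, 6.0, "Good+")
)

def lookup(power: Double): Option[String] =
  buckets.find(b => power > b.first && power <= b.second).map(_.value)

// Boundary value 1.5 falls in the lower interval, matching the expected output
println(lookup(1.5)) // Some(Bad-)
println(lookup(3.7)) // Some(Good)
println(lookup(6.0)) // Some(Good+)
```

Values outside every bucket return `None`, which corresponds to the `null` a left join would produce for an unmatched row.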

Upvotes: 4
