Reputation: 3456
Why isn't the following code working? I am trying to keep only the rows whose hrs value falls within the range [10.0, 100.0]. Both of the following solutions produce the same result. Do I need to
`cast()` or something?
Solution 1:
dff1.select("hrs").filter(col("hrs").geq(lit("10")) &&
col("hrs").leq(lit("100")) ).show(10, truncate = false)
Solution 2:
dff1.select("hrs").filter(col("hrs") >= lit("10") &&
col("hrs") <= lit("100") ).show(10, truncate = false)
Result:
+------------------+
|hrs               |
+------------------+
|239.78444444444443|
|24.459444444444443|
|238.05944444444444|
|45.05138888888889 |
|213.6225          |
|20.04388888888889 |
|201.45333333333335|
|4393.384166666667 |
|260.2611111111111 |
|47.83083333333333 |
+------------------+
Upvotes: 0
Views: 1413
Reputation: 8711
It is better to use a SQL expression string for the filter. The expression is the same as what you would write in a SQL "where" clause: leave integers and floats as they are, and wrap string constants in single quotes.
So your transformation becomes:
dff1.select("hrs").filter(" hrs >= 10 and hrs <= 100 ")
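To try the expression-string filter in isolation, here is a minimal sketch. The local SparkSession, the object name ExprFilterSketch, and the sample values are assumptions made for illustration; only the filter string comes from the answer above.

```scala
import org.apache.spark.sql.SparkSession

object ExprFilterSketch {
  // Returns the hrs values that survive the SQL-style filter.
  def keptHours(): Array[Double] = {
    // Assumption: a throwaway local session, not the asker's cluster setup.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("expr-filter-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for dff1; hrs is a double column here.
    val dff1 = Seq(239.78, 24.46, 45.05, 4393.38, 10.0, 100.0).toDF("hrs")

    // Same expression as in the answer: numeric constants are left unquoted.
    val filtered = dff1.select("hrs").filter("hrs >= 10 and hrs <= 100")

    filtered.as[Double].collect()
  }
}
```

Calling ExprFilterSketch.keptHours() on this sample data should keep only the values between 10 and 100, both bounds included.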
Upvotes: 1
Reputation: 42352
`lit` is not necessary for integers or floats:
dff1.select("hrs").filter(col("hrs") >= 10 && col("hrs") <= 100)
should also work.
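On the `cast()` part of the question: the thread does not show the schema of dff1, so the following is only a sketch under the assumption that hrs might be stored as a string. It prints the schema, casts the column to double explicitly, and then applies the same inclusive range with numeric bounds. The object name CastFilterSketch and the sample values are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object CastFilterSketch {
  // Returns the hrs values within [10, 100] after an explicit cast to double.
  def keptHours(): Array[Double] = {
    // Assumption: a throwaway local session for experimentation.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("cast-filter-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical data: hrs deliberately held as strings.
    val dff1 = Seq("239.78", "24.46", "45.05", "4393.38", "9.5").toDF("hrs")

    // Inspect the column type before deciding whether a cast is needed.
    dff1.printSchema()

    // Cast once, then compare against plain numeric bounds (no lit required).
    val hrs = col("hrs").cast("double")
    val filtered = dff1.select(hrs.as("hrs")).filter(hrs >= 10 && hrs <= 100)

    filtered.as[Double].collect()
  }
}
```

If printSchema() already reports hrs as double, the cast is redundant and the filter from the answer above can be used unchanged. The two comparisons can also be written as hrs.between(10, 100), which is inclusive on both ends.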
Upvotes: 1