Reputation: 1293
Why do the following two lines of code produce different results?
email_response.filter(f"first_response_date > date'2020-11-2'").count()
The above returns 176671203 rows.
email_response.filter(F.col("first_response_date") > F.lit("2020-11-2")).count()
The above returns 52063066 rows.
The logic appears identical, so why do the results differ?
Upvotes: 0
Views: 90
Reputation: 42352
The second line is comparing the column to a string "2020-11-2", not a date. If you add a .cast("date") to the second line, I guess you will get the same answer.
email_response.filter(F.col("first_response_date") > F.lit("2020-11-2").cast("date")).count()
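To see why a string comparison can select fewer rows than a date comparison, here is a small plain-Python sketch. It does not use Spark. It assumes the comparison ends up being done on text (for example, if first_response_date is stored as a string column), and the sample values are made up for illustration.

```python
from datetime import date

# Hypothetical sample values, formatted the way dates usually render as text.
samples = ["2020-11-01", "2020-11-03", "2020-11-15", "2020-11-25", "2020-12-01"]

cutoff_text = "2020-11-2"        # the unpadded literal from the question
cutoff_date = date(2020, 11, 2)  # the same cutoff as a real date

# Text comparison goes character by character, so "2020-11-15" sorts
# before "2020-11-2" because '1' < '2' at the first differing position.
as_strings = [s for s in samples if s > cutoff_text]

# Date comparison parses each value first and compares calendar dates.
as_dates = [s for s in samples if date.fromisoformat(s) > cutoff_date]

print(as_strings)
print(as_dates)
```

In this sketch the text comparison drops 2020-11-03 and 2020-11-15 even though both fall after the cutoff, which is the kind of gap that would make the second count smaller.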
Upvotes: 1