Reputation: 682
I have a Spark DataFrame (df) in Scala with a column of type date:
dt:date
column1:string
I'm trying to filter it to keep only the records where dt is greater than or equal to a date that arrives as a String.
The code below seems to work, but is it the "proper" way of doing it, or should the string be explicitly converted to a Date?
df.where($"dt" >= "2011-01-01")
Upvotes: 1
Views: 368
Reputation: 682
@Achyuth rightly suggested always converting the string to the type of the DataFrame column.
df.where($"dt" >= "2011-01-01")
gives the following query plan:
+- *(1) Filter (isnotnull(dt#) && (cast(dt#as string) >= 2011-01-01))
Here the dt column is cast to a string and compared as text, whereas
df.where($"dt" >= lit(my_date_string).cast("timestamp"))
gives:
Filter (isnotnull(dt#) && (cast(dt#as timestamp) >= 1514764800000000))
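Since dt is a date rather than a timestamp, a variation worth trying is casting the literal to "date", which should let the comparison stay date-to-date instead of casting the column. This is a minimal sketch, not taken from the answer above: the object name DateFilterSketch, the helper onOrAfter, and the sample rows are hypothetical, and it assumes Spark SQL is on the classpath.

```scala
import java.sql.Date
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

object DateFilterSketch {
  // Hypothetical helper: keep rows whose dt is on or after the given yyyy-MM-dd string.
  // The string literal is cast to date, so the dt column itself needs no cast.
  def onOrAfter(df: DataFrame, isoDate: String): DataFrame =
    df.where(col("dt") >= lit(isoDate).cast("date"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("date-filter-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data mirroring the question's schema (dt: date, column1: string).
    val df = Seq(
      (Date.valueOf("2010-12-31"), "a"),
      (Date.valueOf("2011-01-01"), "b"),
      (Date.valueOf("2012-06-15"), "c")
    ).toDF("dt", "column1")

    val filtered = onOrAfter(df, "2011-01-01")
    filtered.explain() // inspect the plan to see which side, if any, is cast
    filtered.show()

    spark.stop()
  }
}
```

Running explain() on your own data is the reliable way to confirm what the optimizer does with the literal in your Spark version.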
Upvotes: 0
Reputation: 4010
It would be good to convert your string to the data type of the DataFrame column.
That way, bad user input is validated up front instead of causing trouble later at the filter operation.
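One way to do that validation before touching the DataFrame is to parse the string with java.time first. This is a rough sketch under the assumption that the input is expected in yyyy-MM-dd form; the object DateInput and the function parseIsoDate are hypothetical names, not part of Spark.

```scala
import java.sql.Date
import java.time.LocalDate
import scala.util.Try

object DateInput {
  // Returns Some(date) for a valid ISO yyyy-MM-dd string, None for anything else
  // (wrong format, impossible dates such as month 13, or null).
  def parseIsoDate(s: String): Option[Date] =
    Try(LocalDate.parse(s)).toOption.map(d => Date.valueOf(d))
}
```

The resulting java.sql.Date can then be passed to the filter as a typed literal, for example df.where($"dt" >= lit(date)), while a None can be reported back to the user before any Spark job runs.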
Upvotes: 3