Dawid

Reputation: 682

DataFrame filter on date

I have a Scala DataFrame (df) with a column of type date

dt:date
column1:string

I'm trying to filter it to keep only the records where dt is greater than or equal to a date that arrives as a String.

The code below seems to work, but is it the "proper" way of doing it, or should the string be explicitly converted to a Date?

df.where($"dt" >= "2011-01-01")

Upvotes: 1

Views: 368

Answers (2)

Dawid

Reputation: 682

@Achyuth rightly suggested always converting the string to the type of the DataFrame's date column.

df.where($"dt" >= "2011-01-01")

Gives the following query plan

+- *(1) Filter (isnotnull(dt#) && (cast(dt#as string) >= 2011-01-01))

Whereas

df.where($"dt" >= lit(my_date_string).cast("timestamp"))

Gives a plan in which the column is no longer cast to a string

 Filter (isnotnull(dt#) && (cast(dt#as timestamp) >= 1514764800000000))
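
As a rough sketch of the same idea, the string can also be turned into a date (rather than a timestamp) so that both sides of the comparison share the column's type. The object name DateFilterSketch, the two helper methods, and the sample rows below are made up for illustration; lit, to_date and explain are standard Spark SQL calls. The exact plan text depends on the Spark version, so it is worth checking with explain() yourself.

```scala
import java.sql.Date
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, to_date}

object DateFilterSketch {
  // Option 1: parse on the driver and pass a date literal to Spark.
  // Date.valueOf throws IllegalArgumentException on malformed input.
  def filterWithDateLiteral(df: DataFrame, dateString: String): DataFrame =
    df.where(col("dt") >= lit(Date.valueOf(dateString)))

  // Option 2: let Spark convert the string literal with to_date.
  def filterWithToDate(df: DataFrame, dateString: String): DataFrame =
    df.where(col("dt") >= to_date(lit(dateString)))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("date-filter-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data matching the schema in the question.
    val df = Seq(
      (Date.valueOf("2010-12-31"), "a"),
      (Date.valueOf("2011-01-01"), "b"),
      (Date.valueOf("2012-06-15"), "c")
    ).toDF("dt", "column1")

    val myDateString = "2011-01-01"

    // Print the physical plans to see which casts Spark applies.
    filterWithDateLiteral(df, myDateString).explain()
    filterWithToDate(df, myDateString).explain()

    spark.stop()
  }
}
```

Option 1 rejects a malformed string on the driver before any Spark job runs. Option 2 leaves the conversion to Spark.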

Upvotes: 0

loneStar

Reputation: 4010

It would be good to convert your string to the data type of the DataFrame column.

That way, if the user sends invalid input, it is validated up front instead of failing at this operation.
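
A minimal sketch of what validating up front might look like, assuming the input is expected in ISO yyyy-MM-dd form. DateInput and parseIsoDate are hypothetical names, not part of Spark; only the Java and Scala standard libraries are used.

```scala
import java.sql.Date
import java.time.LocalDate
import scala.util.Try

object DateInput {
  // Hypothetical helper: strictly parse an ISO yyyy-MM-dd string.
  // Returns Left with a message for bad input, Right with a java.sql.Date otherwise.
  def parseIsoDate(s: String): Either[String, Date] =
    Try(LocalDate.parse(s)).toEither
      .left.map(_ => s"Invalid date '$s', expected yyyy-MM-dd")
      .map(d => Date.valueOf(d))
}
```

On a Right(d) result, the filter could then be written as df.where($"dt" >= lit(d)); on a Left, the caller can report the message before any DataFrame operation runs.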

Upvotes: 3
