Reputation: 741
I am trying to convert the date column in my Spark DataFrame from date to np.datetime64. How can I achieve that?
# this snippet converts the string column to a date format
from pyspark.sql.functions import to_date, col

df1 = df.withColumn("data_date", to_date(col("data_date"), "yyyy-MM-dd"))
Upvotes: 2
Views: 4293
Reputation: 136
Why do you want to do this? Spark does not support the datetime64 data type, and the option of creating a user-defined data type is no longer available. You could create a pandas DataFrame and do the conversion there; Spark itself won't support it.
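If you go the pandas route, a minimal sketch could look like the following. It assumes the question's df1 with its data_date column and that the data is small enough to collect to the driver with toPandas(); the printed checks are only illustrative:

import pandas as pd

pdf = df1.toPandas()                                 # Spark DataFrame -> pandas DataFrame
pdf["data_date"] = pd.to_datetime(pdf["data_date"])  # column now has dtype datetime64[ns]

print(pdf["data_date"].dtype)                # datetime64[ns]
print(type(pdf["data_date"].to_numpy()[0]))  # <class 'numpy.datetime64'>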
Upvotes: 1
Reputation: 306
As you can see in the Spark docs (https://spark.apache.org/docs/latest/sql-reference.html), the only types supported for time variables are TimestampType and DateType. Spark does not know how to handle an np.datetime64 type (think about it: what could Spark know about NumPy? Nothing).
You have already converted your string to a date format that Spark knows. My advice is to keep working with it as a date, which is what Spark understands, and not to worry: there is a whole range of built-in functions to deal with this type. Anything you can do with np.datetime64 in NumPy you can do in Spark; see the sketch below. Take a look at this post for more detail: https://mungingdata.com/apache-spark/dates-times/
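As an illustration (not from the original answer), here is a short sketch of those built-in functions, reusing the question's df1 and its data_date column entirely inside Spark; the derived column names are just examples:

from pyspark.sql.functions import col, year, month, dayofmonth, date_add, datediff, current_date

df2 = (df1
       .withColumn("year", year(col("data_date")))                      # extract date components
       .withColumn("month", month(col("data_date")))
       .withColumn("day", dayofmonth(col("data_date")))
       .withColumn("plus_7_days", date_add(col("data_date"), 7))        # date arithmetic
       .withColumn("age_in_days", datediff(current_date(), col("data_date"))))
df2.show()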
Upvotes: 1