Reputation: 13035
There is a dataframe shown as follows, it has two columns.
df.show()
| Time| MinTime|
|2019-11-19 23:00:...|2019-11-19 23:00:...|
|2019-11-19 23:15:...|2019-11-19 23:00:...|
|2019-11-19 23:30:...|2019-11-19 23:00:...|
root
|-- Time: string (nullable = true)
|-- MinTime: string (nullable = true)
df.show(truncate=False)
| Time| MinTime|
|2019-11-19 23:00:000000|2019-11-19 23:00:000000|
|2019-11-19 23:15:000000|2019-11-19 23:00:000000|
|2019-11-19 23:30:000000|2019-11-19 23:00:000000|
After I use the following line of code to process the above column, the values for column Offset
are all null. Based on the values in Time
and MinTime
, the difference should not be null for all the rows. May I know the reason for this?
df= df.withColumn('Offset',((col('Time').cast('long') - col('MinTime').cast('long'))))
df.show()
| Time| MinTime| Offset|
|2019-11-19 23:00:...|2019-11-19 23:00:...| null|
|2019-11-19 23:15:...|2019-11-19 23:00:...| null|
|2019-11-19 23:30:...|2019-11-19 23:00:...| null|
df.printSchema()
root
|-- Time: string (nullable = true)
|-- MinTime: string (nullable = true)
|-- Offset: long (nullable = true)
df.show(truncate=False)
| Time| MinTime| Offset|
|2019-11-19 23:00:000000|2019-11-19 23:00:000000| null|
|2019-11-19 23:15:000000|2019-11-19 23:00:000000| null|
|2019-11-19 23:30:000000|2019-11-19 23:00:000000| null|
Upvotes: 0
Views: 257
Reputation: 23109
Please check your schema of your df
, if the columns type is String
it has to be converted to timestamp
first
You can use to_timestamp
function to convert the datatype to timestamp
first as
date_format = 'yyyy-MM-dd HH:mm:ss'
df.withColumn('Offset',
(f.to_timestamp('Time', date_format).cast('long') - f.to_timestamp('MinTime').cast('long'))) \
.show(truncate=False)
Result:
+-------------------+-------------------+------+
|Time |MinTime |Offset|
+-------------------+-------------------+------+
|2019-11-19 23:00:00|2019-11-19 23:00:00|0 |
|2019-11-19 23:15:00|2019-11-19 23:00:00|900 |
|2019-11-19 23:30:00|2019-11-19 23:00:00|1800 |
+-------------------+-------------------+------+
Please make sure to use the correct date format.
Upvotes: 1