Reputation: 43
import pyspark.sql.functions as F
from pyspark.sql import SparkSession
from datetime import datetime

spark = SparkSession.builder.getOrCreate()

data = [
    (1, datetime(2017, 3, 12, 3, 19, 58), 'Raising', 2),
    (2, datetime(2017, 3, 12, 3, 21, 30), 'sleeping', 1),
    (3, datetime(2017, 3, 12, 3, 29, 40), 'walking', 3),
    (4, datetime(2017, 3, 12, 3, 31, 23), 'talking', 5),
    (5, datetime(2017, 3, 12, 4, 19, 47), 'eating', 6),
    (6, datetime(2017, 3, 12, 4, 33, 51), 'working', 7),
]
df = spark.createDataFrame(data, ['id', 'testing_time', 'test_name', 'shift'])
df.show()
+---+-------------------+---------+-----+
| id|       testing_time|test_name|shift|
+---+-------------------+---------+-----+
|  1|2017-03-12 03:19:58|  Raising|    2|
|  2|2017-03-12 03:21:30| sleeping|    1|
|  3|2017-03-12 03:29:40|  walking|    3|
|  4|2017-03-12 03:31:23|  talking|    5|
|  5|2017-03-12 04:19:47|   eating|    6|
|  6|2017-03-12 04:33:51|  working|    7|
+---+-------------------+---------+-----+
Now I want to add the shift (in hours) to the testing time. Can anybody help me out with a quick solution?
Upvotes: 1
Views: 5399
Reputation: 2200
You can use something like the below. You need to convert the shift field to seconds, so multiply it by 3600:
>>> df.withColumn("testing_time", (F.unix_timestamp("testing_time") + F.col("shift")*3600).cast('timestamp')).show()
+---+-------------------+---------+-----+
| id| testing_time|test_name|shift|
+---+-------------------+---------+-----+
| 1|2017-03-12 05:19:58| Raising| 2|
| 2|2017-03-12 04:21:30| sleeping| 1|
| 3|2017-03-12 06:29:40| walking| 3|
| 4|2017-03-12 08:31:23| talking| 5|
| 5|2017-03-12 10:19:47| eating| 6|
| 6|2017-03-12 11:33:51| working| 7|
+---+-------------------+---------+-----+
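The arithmetic behind this expression can be verified without Spark: converting a timestamp to seconds since the epoch, adding shift*3600, and converting back is the same as adding a timedelta of shift hours. A minimal plain-Python sketch (using UTC to keep the round trip unambiguous):

```python
from datetime import datetime, timedelta, timezone

base = datetime(2017, 3, 12, 3, 19, 58, tzinfo=timezone.utc)
shift = 2  # hours, as in the first row of the example data

# Spark-style: unix_timestamp(...) + shift*3600, then cast back to a timestamp
shifted = datetime.fromtimestamp(base.timestamp() + shift * 3600, tz=timezone.utc)

# Equivalent to plain timedelta arithmetic
assert shifted == base + timedelta(hours=shift)
print(shifted)  # 2017-03-12 05:19:58+00:00
```

Note that `unix_timestamp` interprets the timestamp in the session time zone (`spark.sql.session.timeZone`), so the round trip through seconds is safe here because the same offset is applied in both directions.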
Upvotes: 6