Reputation: 336
Due to the way Pandas and Athena read datetime formats, I've had to manipulate several columns in order to get the data in a format that works.
Which involves converting a string to a datetime
df["serviceStartTimestamp"] = pd.to_datetime(df["serviceStartTimestamp"])
returning
0 2019-08-07 08:04:43.942000+00:00
1 2019-08-07 08:13:48.397000+00:00
2 2019-08-07 09:02:29.067000+00:00
then running this,
df["serviceStartTimestamp"] = pd.to_datetime(df["serviceStartTimestamp"].astype("str").str.split("+", expand=True)[0])
returning
0 2019-08-07 08:04:43.942
1 2019-08-07 08:13:48.397
2 2019-08-07 09:02:29.067
If the datetime columns include any further digits they get completed messed up in athena, but how I'm currently solving it seems inefficient, so I was wondering if there was a better way?
Upvotes: 1
Views: 783
Reputation: 8631
You need to pass format='%Y-%m-%d %H:%M:%S.%f'
.
pd.to_datetime(df["serviceStartTimestamp"]).dt.strftime('%Y-%m-%d %H:%M:%S.%f')
Output:
0 2019-08-07 08:04:43.942
1 2019-08-07 08:13:48.397
2 2019-08-07 09:02:29.067
Upvotes: 1