wazo

Reputation: 311

Pandas datetime formats

I'm struggling with pandas datetime formats. My dataset is as follows (the date columns have dtype object):

+--------+------------+----------+---------------------------+---------------------+
|        | event_id_x | payback  | event_starts_utc_datetime |      dtScraped      |
+--------+------------+----------+---------------------------+---------------------+
|  80325 | 1004179030 | 0.980840 | 2017-09-13 20:45:03.888   | 2017-09-06 17:06:32 |
| 104592 | 1004179030 | 0.980840 | 2017-09-13 20:45:03.888   | 2017-09-06 19:23:56 |
| 261304 | 1004179030 | 0.980840 | 2017-09-13 20:45:03.888   | 2017-09-07 06:21:47 |
| 657433 | 1004179030 | 0.980840 | 2017-09-13 20:45:03.888   | 2017-09-08 13:06:05 |
| 661013 | 1004179030 | 0.979975 | 2017-09-13 20:45:03.888   | 2017-09-11 09:04:15 |
+--------+------------+----------+---------------------------+---------------------+

I wanted to convert event_starts_utc_datetime and dtScraped to datetime; however, the following raises "ValueError: time data 'event_starts_utc_datetime' doesn't match format specified":

pinny_payback["event_starts_utc_datetime"] = pd.to_datetime(["event_starts_utc_datetime"], format='%Y-%m-%d %H:%M:%S.%f')

Could you please help with this?

Upvotes: 1

Views: 712

Answers (1)

Evan

Reputation: 2151

Here is some code to recreate your sample DataFrame. I converted the table to comma-separated rows so that it can be loaded with pd.read_clipboard.

"""
id,event_id_x,payback,event_starts_utc_datetime,dtScraped
80325,1004179030,0.980840,2017-09-13 20:45:03.888,2017-09-06 17:06:32
104592,1004179030,0.980840,2017-09-13 20:45:03.888,2017-09-06 19:23:56
261304,1004179030,0.980840,2017-09-13 20:45:03.888,2017-09-07 06:21:47
657433,1004179030,0.980840,2017-09-13 20:45:03.888,2017-09-08 13:06:05
661013,1004179030,0.979975,2017-09-13 20:45:03.888,2017-09-11 09:04:15
"""

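import pandas as pd

# Copy the comma-separated rows above to the clipboard first, then load them.
df = pd.read_clipboard(sep=',')

# Pass the column itself (df[...]), not just its name, to pd.to_datetime.
df['event_starts_utc_datetime'] = pd.to_datetime(df['event_starts_utc_datetime'], format='%Y-%m-%d %H:%M:%S.%f')

print(df.dtypes)

df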

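I suspect the error came from leaving the DataFrame out of the pd.to_datetime() call. Passing ["event_starts_utc_datetime"] hands pandas a list containing the column name, so it tries to parse that literal string as a date and fails.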

Output:

id                                    int64
event_id_x                            int64
payback                             float64
event_starts_utc_datetime    datetime64[ns]
dtScraped                            object
dtype: object
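
Note that dtScraped is still object in the output above, since only one column was converted. The following is a minimal sketch of converting both columns, not a definitive solution. It assumes the sample rows are representative, reads them from an in-memory string with io.StringIO and pd.read_csv instead of the clipboard, and uses a format without %f for dtScraped because those values have no fractional seconds. The names csv_text and raised are introduced here only for illustration.

```python
import io

import pandas as pd

# Same sample rows as above, read from a string instead of the clipboard
# (an assumption made so the sketch runs without copy/paste).
csv_text = """id,event_id_x,payback,event_starts_utc_datetime,dtScraped
80325,1004179030,0.980840,2017-09-13 20:45:03.888,2017-09-06 17:06:32
104592,1004179030,0.980840,2017-09-13 20:45:03.888,2017-09-06 19:23:56
261304,1004179030,0.980840,2017-09-13 20:45:03.888,2017-09-07 06:21:47
657433,1004179030,0.980840,2017-09-13 20:45:03.888,2017-09-08 13:06:05
661013,1004179030,0.979975,2017-09-13 20:45:03.888,2017-09-11 09:04:15
"""
df = pd.read_csv(io.StringIO(csv_text))

# A list holding the column *name* makes pandas parse that literal string,
# which reproduces the ValueError from the question.
try:
    pd.to_datetime(["event_starts_utc_datetime"], format="%Y-%m-%d %H:%M:%S.%f")
    raised = False
except ValueError:
    raised = True

# Convert each column with the format that matches its own layout.
df["event_starts_utc_datetime"] = pd.to_datetime(
    df["event_starts_utc_datetime"], format="%Y-%m-%d %H:%M:%S.%f"
)
df["dtScraped"] = pd.to_datetime(df["dtScraped"], format="%Y-%m-%d %H:%M:%S")

print(df.dtypes)
```

After this, both columns should report a datetime64 dtype, so differences such as df["event_starts_utc_datetime"] - df["dtScraped"] can be computed directly.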

Upvotes: 2
