Reputation: 487
Please consider a pandas DataFrame df created with the pd.read_csv() function.
df has 2 columns with the types below:
df.dtypes:
DAT_RUN object
DAT_FORECAST object
Some values:
DAT_RUN DAT_FORECAST
0 2020-08-11 00:00:00.000 2020-08-11 00:00:00.000
1 2020-08-11 00:00:00.000 2020-08-11 01:00:00.000
2 2020-08-11 00:00:00.000 2020-08-11 02:00:00.000
3 2020-08-11 00:00:00.000 2020-08-11 03:00:00.000
4 2020-08-11 00:00:00.000 2020-08-11 04:00:00.000
... ... ...
As you can see, the column values are in date format.
I want to convert these columns to datetime (NOT IN PLACE):
pd.to_datetime(df['DAT_RUN'], format='%Y-%m-%d %H:%M:%S')
The result loses the time information in the values,
whereas
pd.to_datetime(df['DAT_FORECAST'], format='%Y-%m-%d %H:%M:%S')
keeps the time information in the values.
Why?
For example, for a value that I have in the CSV file:
df.loc[df['DAT_RUN'] == "2020-08-10 03:00:00.000", "DAT_RUN"]
returns 0 rows:
Series([], Name: DAT_RUN, dtype: object)
Whereas
df.loc[df['DAT_FORECAST'] == "2021-06-11 06:00:00.000", "DAT_FORECAST"]
returns rows:
What is the difference between the two columns?
Upvotes: 0
Views: 38
Reputation: 862791
If every time in the column is 00:00:00.000,
pandas simply does not display it; the values still contain the time.
df['DAT_RUN'] = pd.to_datetime(df['DAT_RUN'])
df['DAT_FORECAST'] = pd.to_datetime(df['DAT_FORECAST'])
Check it:
print(df['DAT_RUN'].head().tolist())
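A minimal sketch of that check, using a small hypothetical frame in place of the CSV (the sample rows and the explicit format with %f for the milliseconds are assumptions, not taken from the real file):

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV contents
df = pd.DataFrame({
    "DAT_RUN": ["2020-08-11 00:00:00.000", "2020-08-11 00:00:00.000"],
    "DAT_FORECAST": ["2020-08-11 00:00:00.000", "2020-08-11 01:00:00.000"],
})

# %f consumes the trailing milliseconds
fmt = "%Y-%m-%d %H:%M:%S.%f"
df["DAT_RUN"] = pd.to_datetime(df["DAT_RUN"], format=fmt)
df["DAT_FORECAST"] = pd.to_datetime(df["DAT_FORECAST"], format=fmt)

# Every DAT_RUN value is at midnight, so the Series display shows dates only
shown = df["DAT_RUN"].to_string()
print(shown)

# The underlying values are still full Timestamps
values = df["DAT_RUN"].tolist()
print(values[0])  # 2020-08-11 00:00:00

# DAT_FORECAST has a non-midnight time, so its display keeps the time part
print(df["DAT_FORECAST"].to_string())
```

The display difference comes only from how the column is printed; both columns have the same datetime dtype.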
For the second issue, it seems object columns are being compared
instead of datetimes; maybe the converted columns were not assigned back.
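As a rough illustration of the comparison once the converted column has been assigned back (the sample rows below are hypothetical, and comparing against pd.Timestamp is one option rather than the only one):

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV contents
df = pd.DataFrame({
    "DAT_FORECAST": [
        "2020-08-11 00:00:00.000",
        "2020-08-11 01:00:00.000",
        "2020-08-11 02:00:00.000",
    ],
})

# Before conversion the column holds plain strings (dtype object),
# so == is an exact string match, including the ".000" suffix
as_strings = df.loc[df["DAT_FORECAST"] == "2020-08-11 01:00:00", "DAT_FORECAST"]

# Assign the converted column back so later filters see datetimes
df["DAT_FORECAST"] = pd.to_datetime(
    df["DAT_FORECAST"], format="%Y-%m-%d %H:%M:%S.%f"
)

# Compare against a Timestamp instead of a raw string
target = pd.Timestamp("2020-08-11 01:00:00")
as_datetimes = df.loc[df["DAT_FORECAST"] == target, "DAT_FORECAST"]

print(len(as_strings), len(as_datetimes))
```

If a filter on the string column returns no rows, printing df['DAT_RUN'].unique() before converting shows the exact strings that are actually present.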
Upvotes: 1