Reputation: 105
I have searched the community for an answer to this but could not find one.
I have a dataframe in Python 3.5.1 that contains a column of dates stored as strings, imported from a CSV file.
The dataframe looks like this:
TimeStamp TBD TBD Value TBD
0 2016/06/08 17:19:53 NaN NaN 0.062942 NaN
1 2016/06/08 17:19:54 NaN NaN 0.062942 NaN
2 2016/06/08 17:19:54 NaN NaN 0.062942 NaN
What I need is to change the TimeStamp column format to %m/%d/%Y %H:%M:%S:
TimeStamp TBD TBD Value TBD
0 06/08/2016 17:19:53 NaN NaN 0.062942 NaN
So far I have found some solutions that work, but only for a single string and not for a Series.
Any help would be appreciated.
Thanks
Upvotes: 2
Views: 12982
Reputation: 1982
For most common date and datetime formats, the pandas pd.to_datetime
function can parse the strings without an explicit format being provided.
For example:
df.TimeStamp.apply(lambda x: pd.to_datetime(x))
For the example given in the question,
df['TimeStamp'] = pd.to_datetime(df['TimeStamp']).dt.strftime('%m/%d/%Y %H:%M:%S')
gives the requested output format.
Using .apply
is convenient if you have multiple columns to convert, since the same conversion can be applied to each of them in one call.
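To illustrate the multi-column case, here is a minimal sketch. The frame and its column names (Start, End) are made up for the example and are not from the question; it assumes every listed column holds strings that pd.to_datetime can parse.

```python
import pandas as pd

# Hypothetical frame with two string timestamp columns (names are illustrative only).
df = pd.DataFrame({
    'Start': ['2016/06/08 17:19:53', '2016/06/08 17:19:54'],
    'End': ['2016/06/08 17:20:00', '2016/06/08 17:20:05'],
    'Value': [0.062942, 0.062942],
})

date_cols = ['Start', 'End']
# DataFrame.apply passes each selected column to pd.to_datetime as a Series,
# so the parsing itself stays vectorized within each column.
df[date_cols] = df[date_cols].apply(pd.to_datetime)
print(df.dtypes)
```

Note that this differs from calling .apply on a single Series with a lambda, which parses one element at a time and is usually slower than passing the whole Series to pd.to_datetime.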
Of course, providing the parsing format explicitly is necessary in many situations, for example when the strings are ambiguous or in an unusual layout. For a full list of format codes, please see https://docs.python.org/3/library/datetime.html.
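As a small sketch of passing the format explicitly, the snippet below parses the timestamps from the question with a format string and then writes them back out in the requested layout. The variable names are only illustrative.

```python
import pandas as pd

# Sample values copied from the question's TimeStamp column.
raw = pd.Series(['2016/06/08 17:19:53', '2016/06/08 17:19:54'])

# Tell pandas exactly how the input is laid out instead of letting it guess.
parsed = pd.to_datetime(raw, format='%Y/%m/%d %H:%M:%S')

# Convert the datetimes back to strings in month/day/year order.
formatted = parsed.dt.strftime('%m/%d/%Y %H:%M:%S')
print(formatted.tolist())
```

With an explicit format, a string that does not match the layout raises an error rather than being silently parsed in an unintended way.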
Upvotes: 0
Reputation: 879083
If you convert the column of strings to a time series, you could use the dt.strftime method:
import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
df['TimeStamp'] = pd.to_datetime(df['TimeStamp']).dt.strftime('%m/%d/%Y %H:%M:%S')
print(df)
yields
TBD TBD.1 TBD.2 TimeStamp Value
0 NaN NaN NaN 06/08/2016 17:19:53 0.062942
1 NaN NaN NaN 06/08/2016 17:19:54 0.062942
2 NaN NaN NaN 06/08/2016 17:19:54 0.062942
Since you want to convert a column of strings to another (different) column of strings, you could also use the vectorized str.replace method:
import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
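# regex=True is needed on pandas 2.0+, where str.replace treats the pattern as a literal string by default
df['TimeStamp'] = df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)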
print(df)
since
In [32]: df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)
Out[32]:
0 06/08/2016 17:19:53
1 06/08/2016 17:19:54
2 06/08/2016 17:19:54
Name: TimeStamp, dtype: object
This uses regex to rearrange pieces of the string without first parsing the string as a date. This is faster than the first method (mainly because it skips the parsing step), but it also has the disadvantage of not checking that the date strings are valid dates.
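The validation difference can be seen with a deliberately bad input. This is only a sketch: the second value below is an invented invalid date, and errors='coerce' is one possible way to surface it (it turns unparseable values into NaT instead of raising).

```python
import pandas as pd

# The second entry is an invented, invalid date (month 13, day 45).
s = pd.Series(['2016/06/08 17:19:53', '2016/13/45 17:19:54'])

# The regex approach only shuffles digit groups, so the bad date passes through.
rearranged = s.str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)

# Parsing checks the values; errors='coerce' marks failures as NaT.
parsed = pd.to_datetime(s, format='%Y/%m/%d %H:%M:%S', errors='coerce')

print(rearranged.tolist())
print(parsed.isna().tolist())
```

If the CSV may contain malformed timestamps, the parsing route makes them visible, at the cost of the extra parsing work.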
Upvotes: 6