Reputation: 3327
I need help converting into python/pandas date time format. For example, my times are saved like the following line:
2017-01-01 05:30:24.468911+00:00
.....
2017-05-05 01:51:31.351718+00:00
and I want to know the simplest way to convert this into date time format for essentially performing operations with time (like what is the range in days of my dataset to split up my dataset into chunks by time, what's the time difference from one time to another)? I don't mind losing some of the significance for the times if that makes things easier. Thank you so much!
Upvotes: 3
Views: 3178
Reputation: 2544
Since the Pandas tag is there:
df = pd.DataFrame(['2017-01-01 05:30:24.468911+00:00'])
df.columns = ['Datetime']
df['Datetime'] = pd.to_datetime(df['Datetime'], format='%Y-%m-%d %H:%M:%S.%f', utc=True)
print(df.dtypes)
Upvotes: 1
Reputation: 109520
Timestamp
will convert it for you.
>>> pd.Timestamp('2017-01-01 05:30:24.468911+00:00')
Timestamp('2017-01-01 05:30:24.468911+0000', tz='UTC')
Let's say you have a dataframe that includes your timestamp column (let's call it stamp
). You can use apply
on that column together with Timestamp
:
df = pd.DataFrame(
{'stamp': ['2017-01-01 05:30:24.468911+00:00',
'2017-05-05 01:51:31.351718+00:00']})
>>> df
stamp
0 2017-01-01 05:30:24.468911+00:00
1 2017-05-05 01:51:31.351718+00:00
>>> df['stamp'].apply(pd.Timestamp)
0 2017-01-01 05:30:24.468911+00:00
1 2017-05-05 01:51:31.351718+00:00
Name: stamp, dtype: datetime64[ns, UTC]
You could also use Timeseries
:
>>> pd.TimeSeries(df.stamp)
0 2017-01-01 05:30:24.468911+00:00
1 2017-05-05 01:51:31.351718+00:00
Name: stamp, dtype: object
Once you have a Timestamp object, it is pretty efficient to manipulate. You can just difference their values, for example.
You may also want to have a look at this SO answer which discusses timezone unaware values to aware.
Upvotes: 2
Reputation: 1297
Let's say I have two strings 2017-06-06
and 1944-06-06
and I wanted to get the difference (what Python calls a timedelta
) between the two.
First, I'll need to import datetime
. Then I'll need to get both of those strings into datetime
objects:
>>> a = datetime.datetime.strptime('2017-06-06', '%Y-%m-%d')
>>> b = datetime.datetime.strptime('1944-06-06', '%Y-%m-%d')
That will give us two datetime objects that can be used in arithmetic functions that will return a timedelta
object:
>>> c = abs((a-b).days)
This will give us 26663
, and days
is the largest resolution that timedelta
supports: documentation
Upvotes: 2