Reputation: 3779
I have a text file with data with columns like '10:15.3' meaning 10 minutes 15.3 seconds after some canonical event. When I read this with read_csv, I'm getting strings:
>>> df.time.head()
0 08:32.0
1 08:38.0
2 08:39.0
3 08:43.0
4 09:15.0
Name: time, dtype: object
>>> df.time.head()[:1][0]
'08:32.0'
>>>
I feel like I should be able to get seconds easily enough within pandas, either by specifying a conversion in read_csv or (probably better, so I have both) by appending a new column, but I'm not seeing how to do it. I'm pretty sure this is just me being dense.
Can anyone offer a tip to help me get unstuck?
Upvotes: 0
Views: 737
Reputation: 129008
Using pandas 0.15 or later, this is quite easy to do; the full details are in the pandas Time Deltas documentation.
Your format is not a 'regular' format, but you can easily make it one by prepending '00:' as the hours field.
In [20]: pd.Timedelta('00:' + '10:15.3')
Out[20]: Timedelta('0 days 00:10:15.300000')
In [19]: s
Out[19]:
0 08:32.0
1 08:38.0
2 08:39.0
3 08:43.0
4 09:15.0
Name: 1, dtype: object
Parse it to a timedelta.
In [15]: res = pd.to_timedelta('00:' + s)
In [16]: res
Out[16]:
0 00:08:32
1 00:08:38
2 00:08:39
3 00:08:43
4 00:09:15
Name: 1, dtype: timedelta64[ns]
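In 0.15, .dt.seconds gives the seconds field as displayed in the reduced hh:mm:ss form, not the total elapsed seconds. (From pandas 0.16 on, .dt.seconds follows datetime.timedelta semantics instead, and the displayed field is available as .dt.components.seconds.)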
In [17]: res.dt.seconds
Out[17]:
0 32
1 38
2 39
3 43
4 15
dtype: int64
If you need the total seconds represented, do this (this is called frequency conversion):
In [18]: res.astype('timedelta64[s]')
Out[18]:
0 512
1 518
2 519
3 523
4 555
Name: 1, dtype: float64
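For the "append a new column" part of the question, here is a minimal sketch that keeps the original strings and adds both a timedelta column and a float column of total seconds. The sample data and the column names elapsed and seconds are made up for illustration, and it uses .dt.total_seconds(), which I'd expect to behave the same across recent pandas versions, rather than astype:

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the real file.
csv_text = "time\n08:32.0\n08:38.0\n09:15.3\n"
df = pd.read_csv(io.StringIO(csv_text))

# Prepend an hours field so the strings parse as hh:mm:ss.f.
df["elapsed"] = pd.to_timedelta("00:" + df["time"])

# Total elapsed seconds as floats; the original string column is untouched.
df["seconds"] = df["elapsed"].dt.total_seconds()

print(df)
```

Unlike the truncated display above, the fractional part (e.g. the .3 in 09:15.3) is preserved in the seconds column.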
Upvotes: 1
Reputation: 13965
You can use a datetime.time object. Its constructor takes hours, minutes, seconds, and microseconds as integers, so you just need to cast the relevant part of each string to int and pass it to the datetime.time constructor.
So in your case:
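import datetime
import pandas as pd

df = pd.read_csv('your_csv.csv')
# 'mm:ss.f' -> time(0, minutes, seconds, microseconds); assumes a single fractional digit (tenths)
df.time = pd.Series([datetime.time(0, int(val[:2]), int(val[3:5]), int(val[6:]) * 100000)
                     for val in df.time], index=df.index)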
Upvotes: 1
Reputation: 914
Haven't worked with pandas, but I would extract seconds like this:
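from datetime import datetime
# The format is minutes:seconds.fraction, so use %M:%S.%f (%I would read the first field as an hour)
t = datetime.strptime('09:12.14', '%M:%S.%f')
print(t.minute * 60 + t.second + t.microsecond / 1e6)  # total seconds, including the fraction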
Upvotes: 0
Reputation: 355
I can't really help you with pandas internals, but there is an easy pythonic way to get the seconds in numerical format:
timeStr = '08:32.0'
# split(':') gives ['08', '32.0']; take the second item and cast it to float.
seconds = float(timeStr.split(':')[1])
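Note that the split above yields only the seconds field (32.0 here), not the time elapsed since the event. If the total is what you want, a small helper along these lines might do; to_seconds is a hypothetical name, and it assumes every value has exactly the 'mm:ss.f' shape from the question:

```python
def to_seconds(stamp):
    """Convert a 'mm:ss.f' string to total elapsed seconds as a float."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


print(to_seconds("08:32.0"))
print(to_seconds("10:15.3"))
```

Since it takes one string and returns one number, it should also be usable at load time via the converters argument of read_csv, e.g. converters={'time': to_seconds}, though that replaces the original strings rather than keeping both.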
Upvotes: 0