Reputation: 536
I'm trying to read a CSV file with data like this:
data.csv:
time,value
12/31/2015 7:00:00 PM,275.9271
12/31/2015 7:00:02 PM,278.559
12/31/2015 7:00:10 PM,277.5949
12/31/2015 7:00:12 PM,270.0564
My problem is that there is a lot of data, and reading the file while telling the reader to parse the time column is very slow:
df = pd.read_csv('data.csv', parse_dates=['time'])
If, on the other hand, I read the data first and then convert the time column, it is also slow:
df['time'] = pd.to_datetime(df.time)
After looking around the web, the solution seems to be to specify the format of the time, but this fails because the months and hours are missing zero-padding:
pd.to_datetime(df.time,format='%b/%d/%Y %H:%M:%S %p')
ValueError: time data '10/26/2016 7:00:01 PM' does not match format '%b/%d/%Y %H:%M:%S %p' (match)
Is there a way to make a format string to match non-standard data? Or else to speed up one of the auto-recognition methods? Thank you!
Upvotes: 2
Views: 320
Reputation: 3382
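I think the %b should be %m. The hour also needs %I rather than %H, because the times use a 12-hour clock with AM/PM. Despite the "zero-padded" wording in the docs, strptime accepts values without the leading zero when parsing, so the missing padding is not the problem.
%b: Month as locale’s abbreviated name.
%m: Month as a zero-padded decimal number.
%H: Hour (24-hour clock) as a zero-padded decimal number.
%I: Hour (12-hour clock) as a zero-padded decimal number.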
>>> from datetime import datetime
>>> d = datetime.strptime('12/31/2015 7:00:00 PM', '%b/%d/%Y %H:%M:%S %p')
ValueError: time data '12/31/2015 7:00:00 PM' does not match format '%b/%d/%Y %H:%M:%S %p'
>>> d = datetime.strptime('12/31/2015 7:00:00 PM', '%m/%d/%Y %I:%M:%S %p')
>>> d
datetime.datetime(2015, 12, 31, 19, 0)
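To see why the hour code matters too, here is a small sketch using only the standard library. The second timestamp ('1/5/2016 ...') is a made-up sample, not from the question's file; it is only there to show that unpadded months and days parse fine. As far as I know, %p only changes the hour when it is paired with %I, so %H parses without an error but silently drops the PM.

```python
from datetime import datetime

stamp = '12/31/2015 7:00:00 PM'

# %I (12-hour clock) combined with %p gives the correct 24-hour value.
with_12h = datetime.strptime(stamp, '%m/%d/%Y %I:%M:%S %p')

# %H (24-hour clock) parses without error, but the PM marker is ignored.
with_24h = datetime.strptime(stamp, '%m/%d/%Y %H:%M:%S %p')

# Hypothetical sample with an unpadded month and day, to show that
# missing zero-padding is not what breaks the parse.
unpadded = datetime.strptime('1/5/2016 7:00:00 PM', '%m/%d/%Y %I:%M:%S %p')

print(with_12h.hour, with_24h.hour, unpadded.month, unpadded.day)
```

So a format with %H would not raise here, but every afternoon time would come out twelve hours early.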
I find myself going back to this site frequently: http://strftime.org/
Or just the Python docs: https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior
Example
import pandas as pd
df = pd.DataFrame({'time': {0: '12/31/2015 7:00:00 PM',
                            1: '12/31/2015 7:00:02 PM',
                            2: '12/31/2015 7:00:10 PM',
                            3: '12/31/2015 7:00:12 PM'},
                   'value': {0: 275.9271, 1: 278.559, 2: 277.5949, 3: 270.0564}})
df['time'] = pd.to_datetime(df['time'], format='%m/%d/%Y %I:%M:%S %p')
#                  time     value
# 0 2015-12-31 19:00:00  275.9271
# 1 2015-12-31 19:00:02  278.5590
# 2 2015-12-31 19:00:10  277.5949
# 3 2015-12-31 19:00:12  270.0564
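For the CSV case from the question, a minimal sketch would be to read the time column as plain strings and convert it afterwards with the explicit format. The in-memory `csv_text` below stands in for data.csv so the snippet runs on its own; with a real file you would pass the path to `read_csv` instead. I have not timed this on the asker's data, so treat the speed-up as something to measure rather than a guarantee.

```python
import io

import pandas as pd

# Stand-in for data.csv (same rows as in the question).
csv_text = """time,value
12/31/2015 7:00:00 PM,275.9271
12/31/2015 7:00:02 PM,278.559
12/31/2015 7:00:10 PM,277.5949
12/31/2015 7:00:12 PM,270.0564
"""

# Read without parse_dates so the time column stays as strings.
df = pd.read_csv(io.StringIO(csv_text))

# Convert in one vectorized call with an explicit format,
# so pandas does not have to guess the layout of each value.
df['time'] = pd.to_datetime(df['time'], format='%m/%d/%Y %I:%M:%S %p')

print(df.dtypes)
```

Giving the format up front is the usual suggestion for large files, since it skips the format inference step.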
Upvotes: 2