Reputation: 1202
I have a CSV file that has dates in multiple formats, like this:
Date X1 X2
12/6/2017 23:00 928.88 3.19
12/6/2017 23:20 928.86 3.37
12/6/2017 23:40 930.26 3.38
13-06-17 0:00 930.37 3.41
13-06-17 0:20 930.39 3.49
13-06-17 0:40 930.15 3.54
13-06-17 1:00 930.36 3.46
I want to parse the dates, but the format differs between rows.
I tried:
date_formats = ["%d/%m/%Y %H:%M","%d-%m-%Y %H:%M"]
for x in date_formats:
    try:
        dateparse = lambda dates: datetime.strptime(dates, x)
    except ValueError:
        dateparse = lambda dates: datetime.strptime(dates, x)
df2 = read_csv("Copy.csv", parse_dates=True,
               index_col="Time", date_parser=dateparse)
But I am getting format errors:
ValueError: time data '5/6/2017 0:00' does not match format '%d-%m-%Y %H:%M'
Is there any other way to parse different date formats of csv files? Any help would be appreciated.
Upvotes: 3
Views: 2172
Reputation: 394469
The built-in date parser in pandas can already handle this, so just pass parse_dates=[0] (or the column name) to tell read_csv to parse the first column as datetimes. Additionally, you need to pass dayfirst=True so that 12/6/2017 is read as 12 June rather than 6 December:
In[19]:
import pandas as pd
import io
t="""Date,X1,X2
12/6/2017 23:00,28.88,3.19
12/6/2017 23:20,928.86,3.37
12/6/2017 23:40,930.26,3.38
13-06-17 0:00,930.37,3.41
13-06-17 0:20,930.39,3.49
13-06-17 0:40,930.15,3.54
13-06-17 1:00,930.36,3.46"""
df = pd.read_csv(io.StringIO(t), parse_dates=['Date'], dayfirst=True)
df
Out[19]:
                 Date      X1    X2
0 2017-06-12 23:00:00   28.88  3.19
1 2017-06-12 23:20:00  928.86  3.37
2 2017-06-12 23:40:00  930.26  3.38
3 2017-06-13 00:00:00  930.37  3.41
4 2017-06-13 00:20:00  930.39  3.49
5 2017-06-13 00:40:00  930.15  3.54
6 2017-06-13 01:00:00  930.36  3.46
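If automatic inference does not behave the same way on your pandas version, or you would rather state the formats explicitly as the question tried to, one possible approach is to parse the column once per format with errors="coerce" and fill the gaps. This is only a sketch: the two-row sample and the names raw, date_formats and parsed are illustrative. Note that the dash-separated rows use a two-digit year, so the second format needs %y rather than %Y. The loop in the question also never falls through to the second format, because defining a lambda does not raise ValueError; only calling it does.

```python
import io
import pandas as pd

# Illustrative sample with one row in each of the two formats from the question.
raw = """Date,X1,X2
12/6/2017 23:00,928.88,3.19
13-06-17 0:00,930.37,3.41"""

# Read the Date column as plain strings first.
df = pd.read_csv(io.StringIO(raw))

# Candidate formats; %y (two-digit year) is needed for the dash-separated rows.
date_formats = ["%d/%m/%Y %H:%M", "%d-%m-%y %H:%M"]

# Parse with the first format; rows that do not match become NaT.
parsed = pd.to_datetime(df["Date"], format=date_formats[0], errors="coerce")

# Fill the remaining NaT rows using each of the other formats in turn.
for fmt in date_formats[1:]:
    parsed = parsed.fillna(pd.to_datetime(df["Date"], format=fmt, errors="coerce"))

df["Date"] = parsed
print(df)
```

Any row that matches none of the listed formats stays NaT, so checking df["Date"].isna() afterwards shows which values still need attention.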
Upvotes: 3