Reputation: 55
So I need to read a csv file that contains date and numerical value with a format like this: 2017-01-01 00:00:00,-350.7866656
.
Here is my code:
def parser(x):
return datetime.strptime(x, '%Y-%m-%d %H:%M:%S')
df = read_csv('filename', header=None, parse_dates=[0], index_col=0,
squeeze=True, date_parser=parser)
print(df.head())
What I get is an error: ValueError: unconverted data remains: ,-350.7866656
As I understand program treats string as a single column instead of two and tries to parse whole string instead of first column with date. How do I fix this so it reads csv file into two columns?
Upvotes: 2
Views: 416
Reputation: 382
I looked up the pandas.read_csv documentation. The parameter index_col
which you have set to 0
is setting the date column in the file as the index. In my opinion, removing the parameter would yield the desired result,i.e. reading the CSV in two columns.
Cheers!
Upvotes: 0
Reputation: 862406
In my opinion, the problem is you may have some bad date values, which cannot be parsed by a single datetime
format. Instead, add a try-except
statement to a parser
function:
import pandas as pd
import numpy as np
from datetime import datetime, time
temp=u"""2015-01-04 12:04:10,10
2018-01-10 18:40:12,20
-350.7866656,20"""
def parser(x):
try:
return datetime.strptime(x, '%Y-%m-%d %H:%M:%S')
except ValueError:
return np.nan
#after testing replace 'pd.compat.StringIO(temp)' to 'filename.csv'
df = pd.read_csv(pd.compat.StringIO(temp), header=None, parse_dates=[0], index_col=0,
squeeze=True, date_parser=parser)
print (df)
0
2015-01-04 12:04:10 10
2018-01-10 18:40:12 20
NaT 20
Name: 1, dtype: int64
print (df.index)
DatetimeIndex(['2015-01-04 12:04:10', '2018-01-10 18:40:12', 'NaT'], dtype='datetime64[ns]', name=0, freq=None)
Upvotes: 3