Reputation: 799
My csv is as follows (MQM Q.csv):
Date-Time,Value,Grade,Approval,Interpolation Code
31/08/2012 12:15:00,,41,1,1
31/08/2012 12:30:00,,41,1,1
31/08/2012 12:45:00,,41,1,1
31/08/2012 13:00:00,,41,1,1
31/08/2012 13:15:00,,41,1,1
31/08/2012 13:30:00,,41,1,1
31/08/2012 13:45:00,,41,1,1
31/08/2012 14:00:00,,41,1,1
31/08/2012 14:15:00,,41,1,1
The first few rows have no "Value" entries; values only start appearing later in the file.
Here is my code:
import pandas as pd
from StringIO import StringIO
Q = pd.read_csv(StringIO("""/cygdrive/c/temp/MQM Q.csv"""), header=0, usecols=["Date-Time", "Value"], parse_dates=True, dayfirst=True, index_col=0)
I get the following error:
Traceback (most recent call last):
File "daily.py", line 4, in <module>
Q = pd.read_csv(StringIO("""/cygdrive/c/temp/MQM Q.csv"""), header=0, usecols=["Date-Time", "Value"], parse_dates=True, dayfirst=True, index_col=0)
File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 443, in parser_f
return _read(filepath_or_buffer, kwds)
File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 228, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 533, in __init__
self._make_engine(self.engine)
File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 670, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 1067, in __init__
col_indices.append(self.names.index(u))
ValueError: 'Value' is not in list
Upvotes: 8
Views: 47889
Reputation: 40688
The following works for me (I have the CSV file in the same directory as the script, but that should not matter). I am running the following script on my Mac, not Cygwin, but it should work the same way:
import pandas as pd

Q = pd.read_csv("MQM Q.csv",
                header=0,
                parse_dates=True,
                dayfirst=True,
                index_col=0,
                usecols=["Date-Time", "Value"])
print(Q)
Upvotes: 0
Reputation: 393963
This appears to be a bug in the CSV parser. First, this works:
df = pd.read_csv('MQM Q.csv')
This also works:
df = pd.read_csv('MQM Q.csv', usecols=['Value'])
But if I also ask for Date-Time by name in usecols, it fails with the same error message as yours.
I then noticed that the file was UTF-8 encoded, so I converted it to ANSI with Notepad++ and it worked. Saving it as UTF-8 without BOM also worked.
When I converted it back to plain UTF-8 (which presumably adds a BOM), it failed with the same error as before. So I don't think you are imagining this; it looks like a bug.
I am using Python 3.3, pandas 0.14 and NumPy 1.8.1.
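One plausible reading of the behaviour above is that the byte order mark ends up attached to the first header name, so a lookup for "Date-Time" no longer matches. The following is a minimal sketch of that effect using only the standard library; the byte string is a made-up stand-in for the real file, `header_names` is a hypothetical helper, and none of this goes through pandas' own parser:

```python
import csv
import io

# Hypothetical in-memory stand-in for the file: a UTF-8 BOM followed by CSV text.
raw = b"\xef\xbb\xbfDate-Time,Value,Grade\n31/08/2012 12:15:00,,41\n"


def header_names(data, encoding):
    """Decode the bytes with the given codec and return the CSV header row."""
    text = data.decode(encoding)
    return next(csv.reader(io.StringIO(text)))


# Plain "utf-8" keeps the BOM as an invisible character on the first name.
with_bom = header_names(raw, "utf-8")
# "utf-8-sig" strips a leading BOM while decoding.
without_bom = header_names(raw, "utf-8-sig")

print("Date-Time" in with_bom)     # False
print("Date-Time" in without_bom)  # True
print("Value" in with_bom)         # True: only the first column is affected
```

If this is indeed the cause, it would also explain why usecols=['Value'] on its own works. Passing encoding='utf-8-sig' to pd.read_csv may be another option, though whether pandas 0.14 honours it in this case is untested here.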
To get around this do this:
df = pd.read_csv('MQM Q.csv', usecols=[0,1], parse_dates=True, dayfirst=True, index_col=0)
This selects the first two columns by position rather than by name, and sets your index to the Date-Time column, which is correctly converted to a DatetimeIndex:
In [40]:
df.index
Out[40]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-08-31 12:15:00, ..., 2013-11-28 10:45:00]
Length: 43577, Freq: None, Timezone: None
Upvotes: 5
Reputation: 375415
Your code should read as follows (there is no need to wrap the filename in StringIO!):
import pandas as pd
Q = pd.read_csv("/cygdrive/c/temp/MQM Q.csv", header=0, usecols=["Date-Time", "Value"], parse_dates=True, dayfirst=True, index_col=0)
As it stands, pandas treats the path string itself as the contents of the CSV and tries to read that in as a DataFrame:
In [11]: pd.read_csv(StringIO("""/cygdrive/c/temp/MQM Q.csv"""))
Out[11]:
Empty DataFrame
Columns: [/cygdrive/c/temp/MQM Q.csv]
Index: []
which obviously isn't what you want: the only "column" is the path itself, hence the "'Value' is not in list" exception.
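The same point can be seen without pandas. This is a small standard-library sketch (Python 3 syntax, so `io.StringIO` rather than the Python 2 `StringIO` module) showing that StringIO wraps the string it is given and never opens the file that string names:

```python
import csv
import io

path = "/cygdrive/c/temp/MQM Q.csv"

# StringIO turns the string itself into a file-like object;
# the file on disk is never touched.
buffer = io.StringIO(path)

# The "header row" of this one-line pseudo-file is just the path.
header = next(csv.reader(buffer))

print(header)             # ['/cygdrive/c/temp/MQM Q.csv']
print("Value" in header)  # False
```

StringIO is only useful here when the CSV text itself is already in a string; for a file on disk, pass the path directly.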
Upvotes: 0