Reputation: 5376
I am trying to create a dataframe in Pandas from the AB column in my csv file (AB is the 27th column).
I am using this line:
df = pd.read_csv(filename, error_bad_lines = False, usecols = [27])
... which is resulting in this error:
ValueError: Usecols do not match names.
I'm very new to Pandas. Could someone point out what I'm doing wrong?
Upvotes: 2
Views: 307
Reputation: 210832
Here is a small demo:
CSV file (without a header, i.e. there are NO column names):
1,2,3,4,5,6,7,8,9,10
11,12,13,14,15,16,17,18,19,20
We are going to read only the 8th column:
In [1]: fn = r'D:\temp\.data\1.csv'
In [2]: df = pd.read_csv(fn, header=None, usecols=[7], names=['col8'])
In [3]: df
Out[3]:
   col8
0     8
1    18
PS: pay attention to header=None, usecols=[7], names=['col8']
If you don't use the header=None and names parameters, the first row will be used as a header:
In [6]: df = pd.read_csv(fn, usecols=[7])
In [7]: df
Out[7]:
8
0 18
In [8]: df.columns
Out[8]: Index(['8'], dtype='object')
And if we want to read only the last (10th) column:
In [9]: df = pd.read_csv(fn, usecols=[10])
... skipped ...
ValueError: Usecols do not match names.
This fails because pandas counts columns starting from 0, so we have to do it this way:
In [12]: df = pd.read_csv(fn, usecols=[9], names=['col10'])
In [13]: df
Out[13]:
   col10
0     10
1     20
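To tie the zero-based counting back to a spreadsheet-style label such as AB, here is a minimal sketch. It assumes the label follows Excel-style lettering (A, B, ..., Z, AA, AB, ...). The helper name `excel_col_to_index` and the generated 28-column sample data are made up for illustration and are not part of pandas.

```python
import io

import pandas as pd


def excel_col_to_index(label: str) -> int:
    """Convert an Excel-style column label to a zero-based position.

    Hypothetical helper for illustration only: A -> 0, Z -> 25, AA -> 26, AB -> 27.
    """
    upper = label.upper()
    if not upper or not all("A" <= ch <= "Z" for ch in upper):
        raise ValueError(f"not a valid column label: {label!r}")
    index = 0
    for ch in upper:
        # Treat the label as a base-26 number with digits A=1 ... Z=26.
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1  # shift to zero-based counting


# Illustrative data: 2 rows x 28 columns (A..AB), no header row.
rows = [",".join(str(r * 100 + c) for c in range(28)) for r in range(2)]
csv_text = "\n".join(rows) + "\n"

pos = excel_col_to_index("AB")  # 27
df = pd.read_csv(io.StringIO(csv_text), header=None, usecols=[pos], names=["AB"])
print(df)
```

Under this lettering, AB is the 28th column when counting from 1, which is position 27 when counting from 0. A file with fewer than 28 columns has no position 27 to select.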
Upvotes: 2
Reputation: 309
usecols can take column names from the header row of your csv file as well as column numbers. If your file has a header row with a column named AB, you can use usecols=['AB'] rather than usecols=[27]. The error says that the usecols values do not match the column names pandas found.
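For what it's worth, the pandas documentation describes usecols as accepting either integer positions or strings that match column names. A minimal sketch of both forms, assuming a file whose first row is a header (the three-column sample data is made up for illustration):

```python
import io

import pandas as pd

# Illustrative data with a header row; "AB" is at zero-based position 1.
csv_text = "AA,AB,AC\n1,2,3\n4,5,6\n"

# Select by column name (requires a header row containing that name).
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["AB"])

# Select the same column by zero-based position.
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[1])

print(by_name)
print(by_name.equals(by_position))
```

Selecting by name only works when the file really has a header row with that name. For a headerless file, a position combined with header=None is the safer choice.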
Upvotes: -1