Harrison

Reputation: 5376

Create dataframe from specific column

I am trying to create a dataframe in Pandas from the AB column in my CSV file (AB is the 27th column).

I am using this line:

df = pd.read_csv(filename, error_bad_lines = False, usecols = [27])

... which is resulting in this error:

ValueError: Usecols do not match names.

I'm very new to Pandas. Could someone point out what I'm doing wrong?

Upvotes: 2

Views: 307

Answers (2)

MaxU - stand with Ukraine

Reputation: 210832

Here is a small demo:

CSV file (without a header, i.e. there are NO column names):

1,2,3,4,5,6,7,8,9,10
11,12,13,14,15,16,17,18,19,20

We are going to read only the 8th column:

In [1]: fn = r'D:\temp\.data\1.csv'

In [2]: df = pd.read_csv(fn, header=None, usecols=[7], names=['col8'])

In [3]: df
Out[3]:
   col8
0     8
1    18

PS: pay attention to header=None, usecols=[7], names=['col8']

If you don't pass the header=None and names parameters, the first row will be used as the header:

In [6]: df = pd.read_csv(fn, usecols=[7])

In [7]: df
Out[7]:
    8
0  18

In [8]: df.columns
Out[8]: Index(['8'], dtype='object')

And if we try to read only the last (10th) column with usecols=[10]:

In [9]: df = pd.read_csv(fn, usecols=[10])
... skipped ...
ValueError: Usecols do not match names.

This fails because pandas counts columns starting from 0, so the 10th column has index 9 and we have to do it this way:

In [12]: df = pd.read_csv(fn, usecols=[9], names=['col10'])

In [13]: df
Out[13]:
   col10
0     10
1     20
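
For anyone who wants to reproduce the demo without a file on disk, here is a minimal sketch. It assumes the same two-row, headerless CSV as above, fed in through io.StringIO; the helper name read_nth_column is hypothetical and only converts a 1-based column number to the 0-based index that usecols expects.

```python
import io

import pandas as pd

# Same sample data as in the demo above: two rows, ten columns, no header.
CSV_TEXT = "1,2,3,4,5,6,7,8,9,10\n11,12,13,14,15,16,17,18,19,20\n"


def read_nth_column(text, n, name):
    """Read the n-th column (1-based) of a headerless CSV string.

    Hypothetical helper: pandas numbers columns from 0, so the
    n-th column is requested as usecols=[n - 1].
    """
    return pd.read_csv(
        io.StringIO(text),
        header=None,      # the file has no header row
        usecols=[n - 1],  # convert 1-based position to 0-based index
        names=[name],     # label for the single selected column
    )


df = read_nth_column(CSV_TEXT, 10, "col10")
print(df)
```

The same call with n=11 would ask for index 10, which does not exist in a ten-column file, and that is the situation that produces the error shown earlier.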

Upvotes: 2

DENDULURI CHAITANYA

Reputation: 309

usecols can match on the column names in your CSV file, not only on column numbers. If the header row of your file names the column AB, use usecols=['AB'] rather than usecols=[27]. That is the reason for the error stating that usecols do not match names.
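
As a hedged aside: the pandas documentation describes usecols as accepting either integer positions or column-name strings. The sketch below uses a small made-up CSV whose header happens to contain a column named AB, and shows both forms selecting the same column. The helper column_letter_to_index is hypothetical; it is only useful if "AB" is a spreadsheet column letter rather than a header name.

```python
import io

import pandas as pd


def column_letter_to_index(letters):
    """Convert a spreadsheet column label (e.g. 'AB') to a 0-based index."""
    index = 0
    for ch in letters.upper():
        # Treat the label as a base-26 number where A=1 ... Z=26.
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


# Made-up sample: a header row whose third column is literally named "AB".
CSV_WITH_HEADER = "x,y,AB\n1,2,3\n4,5,6\n"

# Select by header name ...
by_name = pd.read_csv(io.StringIO(CSV_WITH_HEADER), usecols=["AB"])
# ... or by 0-based position; both pick the same column here.
by_position = pd.read_csv(io.StringIO(CSV_WITH_HEADER), usecols=[2])

print(by_name.equals(by_position))
print(column_letter_to_index("AB"))
```

If AB is the spreadsheet letter, the 0-based index it maps to is 27, so the file needs at least 28 columns for usecols=[27] to be valid.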

Upvotes: -1
