potatoCatz

Reputation: 107

Python pandas: Why can't I use both index_col and usecols in the same read_csv call? It raises ValueError

If I read in the CSV file using this code:

    df = pd.read_csv('amazon2.csv'
                 , names=["year","state","month","number","date"]
                 , index_col = ['month']
                 , usecols=["year","state","number"]
                 , encoding = "ISO-8859-1")

it raises a ValueError:

    raise ValueError("Index {col} invalid".format(col=col))
    ValueError: Index month invalid

But it does not raise an error if either usecols or index_col is commented out.

Thanks in advance! The data looks like this: amazon2.csv

Upvotes: 2

Views: 3350

Answers (1)

blueear

Reputation: 293

The error occurs because the index column "month" is not included in the usecols list. read_csv only loads the columns listed in usecols, so the column named in index_col must be among them:

    df1 = pd.read_csv("test.csv", index_col="month", usecols=["year", "state", "number", "date", "month"])

Output:

              year  state  number      date
    month
    Janeiro   1998   Acre       0  1998/1/1
    Janeiro1  1998   Acre       1  1998/1/1
    Janeiro1  1999  Acre2       2  1999/1/1
    Janeiro2  2000   Acre       3  2000/1/1
    Janeiro2  2000  Acre1       4  2000/1/1

That said, I agree that the index column should ideally not contain duplicated values.
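For anyone who wants to try this without the original file, here is a minimal sketch. It assumes a headerless CSV laid out like the one in the question; the inline sample rows and the `read_with` helper are made up for illustration and are not taken from amazon2.csv. It reproduces the failure, applies the fix, and then checks whether the resulting index has duplicate labels. The exact error text may differ between pandas versions.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for amazon2.csv (no header row).
CSV_TEXT = """1998,Acre,Janeiro,0,1998-01-01
1998,Acre,Fevereiro,1,1998-01-01
1999,Acre,Janeiro,2,1999-01-01
"""
NAMES = ["year", "state", "month", "number", "date"]


def read_with(usecols):
    """Illustrative helper: read the sample with 'month' as the index."""
    return pd.read_csv(
        io.StringIO(CSV_TEXT),
        names=NAMES,
        index_col="month",
        usecols=usecols,
    )


# 'month' is missing from usecols, so it is never loaded and cannot be the index.
try:
    read_with(["year", "state", "number"])
except ValueError as exc:
    print("ValueError:", exc)

# Including 'month' in usecols makes it available to index_col.
df = read_with(["year", "state", "number", "month"])
print(df)

# pandas allows duplicate index labels; this reports whether any exist.
print("index is unique:", df.index.is_unique)
```

With these sample rows "Janeiro" appears twice, so the index is not unique. pandas accepts that, but a label lookup such as `df.loc["Janeiro"]` then returns several rows instead of one.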

Upvotes: 4
