Reputation: 107
If I read the CSV file using this code:
df = pd.read_csv('amazon2.csv',
                 names=["year", "state", "month", "number", "date"],
                 index_col=['month'],
                 usecols=["year", "state", "number"],
                 encoding="ISO-8859-1")
it raises a ValueError:
raise ValueError("Index {col} invalid".format(col=col))
ValueError: Index month invalid
However, it does not raise an error if either usecols or index_col is commented out.
Thanks in advance!
The data looks like this:
Upvotes: 2
Views: 3350
Reputation: 293
The error occurs because the index column name "month" is not included in the usecols list. The index column is looked up among the columns that usecols keeps, so index_col must name one of them.
df1 = pd.read_csv("test.csv", index_col="month", usecols=["year", "state", "number", "date", "month"])
Output:
          year  state  number      date
month
Janeiro   1998   Acre       0  1998/1/1
Janeiro1  1998   Acre       1  1998/1/1
Janeiro1  1999  Acre2       2  1999/1/1
Janeiro2  2000   Acre       3  2000/1/1
Janeiro2  2000  Acre1       4  2000/1/1
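If you want to check this without the original file, here is a minimal sketch. The inline CSV text is only a stand-in for the real data (its rows are an assumption modelled on the output above), and io.StringIO is used just so the snippet runs without a file on disk. The second call repeats the failing combination and stores the message instead of crashing.

```python
import io

import pandas as pd

# Stand-in for the real CSV file; these rows are illustrative only.
CSV_TEXT = """year,state,month,number,date
1998,Acre,Janeiro,0,1998/1/1
1998,Acre,Janeiro1,1,1998/1/1
1999,Acre2,Janeiro1,2,1999/1/1
"""

# "month" is listed in usecols, so it is still available to become the index.
df_ok = pd.read_csv(
    io.StringIO(CSV_TEXT),
    index_col="month",
    usecols=["year", "state", "number", "month"],
)

# Same call with "month" left out of usecols: the combination from the question.
try:
    pd.read_csv(
        io.StringIO(CSV_TEXT),
        index_col="month",
        usecols=["year", "state", "number"],
    )
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Note that the index column does not show up in df_ok.columns, so adding "month" to usecols does not leave an extra data column behind.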
That said, I agree that the index column should not contain duplicated values.
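To see whether the index really has repeated labels, something along these lines should work. It is a sketch on a small hand-built frame (the values are assumptions that mimic the output above), using Index.is_unique and Index.duplicated. Keeping "month" as an ordinary column via reset_index is just one possible way around the duplicates.

```python
import pandas as pd

# Illustrative frame with a repeated "month" label, similar to the output above.
df = pd.DataFrame(
    {"year": [1998, 1998, 1999], "number": [0, 1, 2]},
    index=pd.Index(["Janeiro", "Janeiro1", "Janeiro1"], name="month"),
)

# True only if every index label occurs exactly once.
has_unique_index = df.index.is_unique

# Labels at positions where an earlier identical label already exists.
repeated = df.index[df.index.duplicated()]

# One option: move "month" back to a regular column and use the default integer index.
df_flat = df.reset_index()
```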
Upvotes: 4