Reputation: 45
I am trying to read a CSV file using pandas, but while using pd.read_csv I get a ValueError: Length mismatch: Expected axis has 7 elements, new values have 5 elements.
Here's the code:
# load train data
data = pd.read_csv('training1.6.csv',error_bad_lines=False , encoding='iso-8859-1',low_memory=False)
data.columns = ['label','id','date','user','text']
data.head(2)
Here's the traceback:
ValueError Traceback (most recent call last)
<ipython-input-5-21e4215846cd> in <module>()
1 data = pd.read_csv('training1.6.csv',error_bad_lines=False , encoding='iso-8859-1')
----> 2 data.columns = ['label','id','date','user','text']
3 data.head(2)
2 frames
pandas/_libs/properties.pyx in pandas._libs.properties.AxisProperty.__set__()
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py in set_axis(self, axis, new_labels)
181 raise ValueError(
182 "Length mismatch: Expected axis has {old} elements, new "
--> 183 "values have {new} elements".format(old=old_len, new=new_len)
184 )
185
ValueError: Length mismatch: Expected axis has 7 elements, new values have 5 elements
I tried setting the dtype and low_memory parameters, but to no avail. Can someone help me out?
Upvotes: 1
Views: 13734
Reputation: 14121
(You didn't get that error while using pd.read_csv(), but in the next command.)
The data dataframe (which you constructed from the .csv file) has 7 columns, but in the command
data.columns = ['label','id','date','user','text']
you provided only 5 column labels.
Add the two missing labels, e.g.:
data.columns = ['label', 'id', 'date', 'user', 'text', 'col_6', 'col_7']
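If you are not sure how many columns the file really has, it may help to check data.shape before assigning labels. The following is only a sketch: the in-memory CSV text is a made-up stand-in for training1.6.csv, and the placeholder names col_6, col_7 follow the suggestion above rather than anything known about the real file.

```python
import io
import pandas as pd

# Hypothetical stand-in for 'training1.6.csv': a header row plus 7 columns.
csv_text = (
    "a,b,c,d,e,f,g\n"
    "0,1001,Mon Apr 06,alice,first tweet,x,y\n"
    "4,1002,Tue Apr 07,bob,second tweet,x,y\n"
)

data = pd.read_csv(io.StringIO(csv_text))

# (rows, columns): the number of labels must equal the column count.
print(data.shape)

labels = ['label', 'id', 'date', 'user', 'text']

if data.shape[1] == len(labels):
    data.columns = labels
else:
    # Keep the known labels and give placeholder names to the extra columns.
    extra = [f'col_{i}' for i in range(len(labels) + 1, data.shape[1] + 1)]
    data.columns = labels + extra

print(list(data.columns))
```

Once the columns are named, you can drop the ones you don't need, for example with data[labels], assuming the first five columns really are the ones you listed.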
Upvotes: 1
Reputation: 1
There may be some missing values in the column that you want to split later, e.g. the text column may contain more kinds of values than before. You'd better go back to the dataframe and check whether the column contains any null or empty values.
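One way to run that check is sketched below. The sample data and the choice of the text column are assumptions for illustration only; substitute your own dataframe and column name.

```python
import io
import pandas as pd

# Hypothetical sample: one row has an empty text field, one has only spaces.
csv_text = (
    "label,id,date,user,text\n"
    "0,1,Mon,alice,hello world\n"
    "4,2,Tue,bob,\n"
    "0,3,Wed,carol,   \n"
)
data = pd.read_csv(io.StringIO(csv_text))

# Number of missing (NaN) values in every column.
print(data.isna().sum())

# Missing values in the 'text' column only.
null_count = data['text'].isna().sum()

# Rows whose text is missing, empty, or whitespace only.
is_blank = data['text'].fillna('').str.strip().eq('')
blank_count = is_blank.sum()

print(null_count, blank_count)
print(data[is_blank])
```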
Upvotes: 0