Reputation: 3020
I have a DataFrame df.
df.columns gives this output:
Index([u'Talk Time\t', u'Hold Time\t', u'Work Time\t', u'Call Type'], dtype='object')
Here, the column name 'Talk Time' has a trailing "\t" character, so if I do the following, I get an error:
df['Talk Time']
Traceback (most recent call last):
File "<ipython-input-78-f2b7b9f43f59>", line 1, in <module>
old['Talk Time']
File "C:\Users\Admin\Anaconda\lib\site-packages\pandas\core\frame.py", line 1780, in __getitem__
return self._getitem_column(key)
File "C:\Users\Admin\Anaconda\lib\site-packages\pandas\core\frame.py", line 1787, in _getitem_column
return self._get_item_cache(key)
File "C:\Users\Admin\Anaconda\lib\site-packages\pandas\core\generic.py", line 1068, in _get_item_cache
values = self._data.get(item)
File "C:\Users\Admin\Anaconda\lib\site-packages\pandas\core\internals.py", line 2849, in get
loc = self.items.get_loc(item)
File "C:\Users\Admin\Anaconda\lib\site-packages\pandas\core\index.py", line 1402, in get_loc
return self._engine.get_loc(_values_from_object(key))
File "pandas\index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas\index.c:3820)
File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:3700)
File "pandas\hashtable.pyx", line 696, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12323)
File "pandas\hashtable.pyx", line 704, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12274)
KeyError: 'Talk Time'
So I modified the columns to remove the tab characters as follows:
for n in range(len(df.columns)):
    df.columns.values[n] = df.columns.values[n].rstrip()
The tab characters are removed, and df.columns now gives the following output:
Index([u'Talk Time', u'Hold Time', u'Work Time', u'Call Type'], dtype='object')
But when I try to access the column again with
df['Talk Time']
I still get the same error. Why is this happening?
Upvotes: 1
Views: 1473
Reputation: 5797
The main issue is that you replaced the entries of df.columns.values, and that part did succeed. However, that only edits the underlying array in place; the lookup pandas uses for df[...] can still hold the original names. So df['Talk Time\t'] would still have worked if you had tried it, but obviously that isn't the result you wanted.
The solution is to assign to df.columns itself instead of modifying df.columns.values:
df.columns = [c.rstrip() for c in df.columns]
This should do what you need.
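To check the reassignment end to end, here is a minimal sketch. The sample rows are made up for illustration (only the column names come from the question), and it assumes pandas is installed:

```python
import pandas as pd

# Hypothetical sample frame reproducing the tab-suffixed headers from the question.
df = pd.DataFrame(
    [[10, 2, 3, "inbound"], [7, 0, 1, "outbound"]],
    columns=["Talk Time\t", "Hold Time\t", "Work Time\t", "Call Type"],
)

# Assign a fresh list of labels so pandas builds a new Index
# instead of mutating the existing one in place.
df.columns = [c.rstrip() for c in df.columns]

# The cleaned label can now be used for lookup.
talk_time = df["Talk Time"]
print(list(df.columns))
print(talk_time.tolist())
```

Because a whole new list is assigned, pandas creates a new Index object, so no stale lookup state from the old labels is carried over.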
Upvotes: 1
Reputation: 21873
I can't reproduce your second error. However, you could do:
df.columns = [i.rstrip() for i in df.columns]
Maybe this will help!
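Two other spellings of the same idea may also be worth trying, sketched here on a made-up one-row frame (the data values are hypothetical). Both rely on standard pandas features, the .str accessor on an Index and DataFrame.rename with a callable, and both produce new labels rather than editing the old ones in place:

```python
import pandas as pd

# Hypothetical sample frame with the tab-suffixed headers from the question.
df = pd.DataFrame(
    [[10, 2, 3, "inbound"]],
    columns=["Talk Time\t", "Hold Time\t", "Work Time\t", "Call Type"],
)

# Option 1: vectorized string method on the Index; returns a new Index.
stripped = df.copy()
stripped.columns = stripped.columns.str.rstrip()

# Option 2: rename with a callable applied to every label; returns a new DataFrame.
renamed = df.rename(columns=str.rstrip)

print(list(stripped.columns))
print(list(renamed.columns))
```

Using strip() instead of rstrip() would also remove leading whitespace, if the headers might have any.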
Upvotes: 0