Reputation: 396
Consider the following piece of code:
>>> import pandas
>>> data = pandas.DataFrame({ 'user': [1, 5, 3, 10], 'week': [1, 1, 3, 4], 'value1': [5, 4, 3, 2], 'value2': [1, 1, 1, 2] })
>>> data = data.pivot_table(index='user', columns='week', fill_value=0)
>>> data['target'] = [True, True, False, True]
>>> data
     value1       value2       target
week      1  3  4      1  3  4
user
1         5  0  0      1  0  0   True
3         0  3  0      0  1  0   True
5         4  0  0      1  0  0  False
10        0  0  2      0  0  2   True
Now if I call this:
>>> 'target' in data.columns
True
It returns True as expected. However, why does this return True as well?
>>> 'target' in data.drop('target', axis=1).columns
True
How can I drop a column from the table so it's no longer in the index and the above statement returns False?
Upvotes: 5
Views: 428
Reputation: 541
I propose @Jeff's comment as a new Answer.
data = data.drop('target', axis=1)
data.columns = data.columns.remove_unused_levels()
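For anyone who wants to check this against the example from the question, here is a minimal sketch. It assumes pandas 0.20 or later, where `MultiIndex.remove_unused_levels()` is available; the variable name `still_there` is only for illustration.

```python
import pandas as pd

# Rebuild the frame from the question
data = pd.DataFrame({
    'user': [1, 5, 3, 10],
    'week': [1, 1, 3, 4],
    'value1': [5, 4, 3, 2],
    'value2': [1, 1, 1, 2],
})
data = data.pivot_table(index='user', columns='week', fill_value=0)
data['target'] = [True, True, False, True]

# Drop the column, then rebuild the column levels so that
# labels no longer used by any column are discarded
data = data.drop('target', axis=1)
data.columns = data.columns.remove_unused_levels()

still_there = 'target' in data.columns
print(still_there)
```

After the levels are rebuilt, the membership test no longer finds the dropped label.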
Upvotes: 0
Reputation: 32105
As of now (pandas 0.19.2), a MultiIndex retains every label that has ever been used in its levels. Dropping a column doesn't remove its label from the levels, so the label is still found when you look it up. See long GH item here.
Thus, you have to work around the issue and make assumptions. If you are sure the labels you're checking are on a specific index level (level 0 in your example), then one way is to do this:
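To see the retained label directly, you can compare what the index stores in `levels` with what the remaining columns actually use. This is only a sketch built on the question's data; the names `stored` and `in_use` are made up for illustration, and the order of the stored labels may differ between pandas versions.

```python
import pandas as pd

data = pd.DataFrame({
    'user': [1, 5, 3, 10],
    'week': [1, 1, 3, 4],
    'value1': [5, 4, 3, 2],
    'value2': [1, 1, 1, 2],
})
data = data.pivot_table(index='user', columns='week', fill_value=0)
data['target'] = [True, True, False, True]

cols = data.drop('target', axis=1).columns

# Labels stored in level 0 of the index (can include unused ones)
stored = list(cols.levels[0])
# Labels that the remaining columns actually reference on level 0
in_use = list(cols.get_level_values(0).unique())

print(stored)
print(in_use)
```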
'target' in data.drop('target', axis=1).columns.get_level_values(0)
Out[145]: False
If it can be on any level, you can use get_values() and look it up in the entire list:
import itertools as it
list(it.chain.from_iterable(data.drop('target', axis=1).columns.get_values()))
Out[150]: ['value1', 1, 'value1', 3, 'value1', 4, 'value2', 1, 'value2', 3, 'value2', 4]
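Note that `get_values()` was deprecated and later removed (it is gone in pandas 1.0). A rough equivalent that should work on both old and new versions is to walk the column tuples via `tolist()`. The helper name `label_in_any_level` is hypothetical, and the sketch assumes the columns are a MultiIndex.

```python
import pandas as pd

def label_in_any_level(columns, label):
    """Hypothetical helper: True if `label` appears on any level
    of a column that still exists (expects MultiIndex columns)."""
    # tolist() yields one tuple per existing column, e.g. ('value1', 1)
    return any(label in tup for tup in columns.tolist())

data = pd.DataFrame({
    'user': [1, 5, 3, 10],
    'week': [1, 1, 3, 4],
    'value1': [5, 4, 3, 2],
    'value2': [1, 1, 1, 2],
})
data = data.pivot_table(index='user', columns='week', fill_value=0)
data['target'] = [True, True, False, True]

remaining = data.drop('target', axis=1).columns
print(label_in_any_level(remaining, 'target'))
print(label_in_any_level(remaining, 'value2'))
```

Because it only looks at columns that are still present, unused labels left behind in the levels don't affect the result.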
Upvotes: 4