I cannot find a way to look up rows by a MultiIndex in Pandas 0.14. Here is some mock data that reproduces the problem.
Code:
from pandas import DataFrame
row1 = ['red', 'ferrari', 'mine']
row2 = ['blue', 'ferrari', 'his']
row3 = ['red', 'lambo', 'his']
row4 = ['yellow', 'porsche', 'his']
row5 = ['yellow', 'lambo', 'his']
all_dat = [row1, row2, row3, row4, row5]
df = DataFrame(all_dat, columns=['Color', 'Make', 'Ownership'])
print df
df = df.set_index(['Color', 'Make'])
print df
print df['red']['lambo']
print df['yellow']['porsche']
Output:
    Color     Make  Ownership
0     red  ferrari       mine
1    blue  ferrari        his
2     red    lambo        his
3  yellow  porsche        his
4  yellow    lambo        his
               Ownership
Color  Make
red    ferrari      mine
blue   ferrari       his
red    lambo         his
yellow porsche       his
       lambo         his
Traceback (most recent call last):
    print df['red']['lambo']
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 1678, in __getitem__
    return self._getitem_column(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 1685, in _getitem_column
    return self._get_item_cache(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 1052, in _get_item_cache
    values = self._data.get(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 2565, in get
    loc = self.items.get_loc(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/index.py", line 1181, in get_loc
    return self._engine.get_loc(_values_from_object(key))
  File "index.pyx", line 129, in pandas.index.IndexEngine.get_loc (pandas/index.c:3354)
  File "index.pyx", line 149, in pandas.index.IndexEngine.get_loc (pandas/index.c:3234)
  File "hashtable.pyx", line 696, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:11148)
  File "hashtable.pyx", line 704, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:11101)
KeyError: 'red'
I have tried lookups using
df[('red', 'lambo')]
and
df['red', 'lambo']
These had similar results (KeyErrors).
So, is there a step I'm missing when setting a MultiIndex? I want to use set_index() because my real data (this is just mock data) goes through many operations before it reaches the point where I redefine the index.
Reputation: 880717
Using df.loc, you can specify the desired labels as a list of tuples:
In [99]: df.loc[[('red','lambo')]]
Out[99]:
            Ownership
Color Make
red   lambo       his
In [106]: df.loc[[('yellow','porsche'), ('red','lambo')]]
Out[106]:
               Ownership
Color  Make
yellow porsche       his
red    lambo         his
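For anyone who wants to run the lookup end to end, here is a minimal, self-contained sketch. It assumes Python 3 and a recent pandas release rather than the Python 2.7 / Pandas 0.14 setup from the question, and the helper name build_cars is made up for illustration. It also shows why df['red'] raised a KeyError: plain [] on a DataFrame looks for a column label, while 'red' is a row label.

```python
import pandas as pd


def build_cars():
    """Rebuild the mock data from the question (hypothetical helper name)."""
    rows = [
        ['red', 'ferrari', 'mine'],
        ['blue', 'ferrari', 'his'],
        ['red', 'lambo', 'his'],
        ['yellow', 'porsche', 'his'],
        ['yellow', 'lambo', 'his'],
    ]
    df = pd.DataFrame(rows, columns=['Color', 'Make', 'Ownership'])
    return df.set_index(['Color', 'Make'])


df = build_cars()

# df['red'] searches the columns (only 'Ownership' is left), hence the KeyError.
try:
    df['red']
    column_lookup_failed = False
except KeyError:
    column_lookup_failed = True

# .loc searches the row index; a list of tuples selects whole rows.
one_row = df.loc[[('red', 'lambo')]]
two_rows = df.loc[[('yellow', 'porsche'), ('red', 'lambo')]]

print(one_row)
print(two_rows)
```

Passing a list of tuples keeps the result a DataFrame with both index levels intact, which is why it is used here instead of chained [] lookups.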
Assignments can be made like this:
In [117]: df.loc[[('red', 'lambo')], 'Ownership'] = 'mine'
In [118]: df
Out[118]:
               Ownership
Color  Make
red    ferrari      mine
blue   ferrari       his
red    lambo        mine
yellow porsche       his
       lambo         his
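The assignment can be tried in isolation with the sketch below. It again assumes Python 3 and a recent pandas release (not the 0.14 version from the question) and rebuilds the mock data inline so that it runs on its own.

```python
import pandas as pd

# Mock data from the question, indexed by (Color, Make).
df = pd.DataFrame(
    [
        ['red', 'ferrari', 'mine'],
        ['blue', 'ferrari', 'his'],
        ['red', 'lambo', 'his'],
        ['yellow', 'porsche', 'his'],
        ['yellow', 'lambo', 'his'],
    ],
    columns=['Color', 'Make', 'Ownership'],
).set_index(['Color', 'Make'])

# Pick the row(s) by index tuple and the column by name, then assign in one step.
df.loc[[('red', 'lambo')], 'Ownership'] = 'mine'

# Read the changed cell back as a one-element Series.
updated = df.loc[[('red', 'lambo')], 'Ownership']
print(updated)
```

Doing the row and column selection in a single .loc call writes to df itself; chaining two separate [] lookups may operate on a temporary copy instead.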
See also: Advanced indexing with hierarchical index