Reputation: 1441
Does anyone know if it is possible to use the DataFrame.loc method to select from a MultiIndex? I have the following DataFrame and would like to access the values in the Dwell column at the indices ('at', 1), ('at', 3), ('at', 5), and so on (non-sequential).
I'd love to be able to do something like data.loc[['at',[1,3,5]], 'Dwell'], similar to the data.loc[[1,3,5], 'Dwell'] syntax for a regular index (which returns a 3-member Series of Dwell values).
My purpose is to select an arbitrary subset of the data, perform some analysis only on that subset, and then update the new values with the results of the analysis. I plan on using the same syntax to set new values for these data, so chaining selectors wouldn't really work in this case.
Here is a slice of the DataFrame I'm working with:
          Char  Dwell  Flight  ND_Offset  Offset
QGram
at     0     a    100     120   0.000000       0
       1     t    180       0   0.108363       5
       2     a    100     120   0.000000       0
       3     t    180       0   0.108363       5
       4     a     20     180   0.000000       0
       5     t     80     120   0.108363       5
       6     a     20     180   0.000000       0
       7     t     80     120   0.108363       5
       8     a     20     180   0.000000       0
       9     t     80     120   0.108363       5
      10     a    120     180   0.000000       0
Upvotes: 82
Views: 123838
Reputation: 6710
Try the cross-section indexing:
In [68]: df.xs('at', level='QGram', drop_level=False).loc[[1,4]]
Out[68]:
          Char  Dwell  Flight  ND_Offset  Offset
QGram
at     1     t    180       0   0.108363       5
       4     a     20     180   0.000000       0
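A minimal sketch of the cross-section approach, assuming a frame rebuilt from the question's sample (only the Char and Dwell columns are reproduced, and the name df is an assumption). Unlike the call above, it lets drop_level default to True, so the list passed to .loc refers unambiguously to the remaining inner level. The pandas documentation notes that xs cannot be used to set values, so this suits reading a subset rather than the update step the question asks about.

```python
import pandas as pd

# Rebuild part of the question's sample; the second index level is unnamed.
index = pd.MultiIndex.from_product([["at"], range(11)], names=["QGram", None])
df = pd.DataFrame(
    {
        "Char": list("atatatatata"),
        "Dwell": [100, 180, 100, 180, 20, 80, 20, 80, 20, 80, 120],
    },
    index=index,
)

# Take the 'at' cross-section (the QGram level is dropped), then pick
# non-sequential labels from the remaining inner level.
picked = df.xs("at", level="QGram").loc[[1, 4]]
print(picked)
```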
Upvotes: 20
Reputation: 196
The loc method is your best friend with a MultiIndex. However, you must understand how loc works on one. To address a single row, pass a value for every index level:
df.loc['indexValue1', 'indexValue2', 'indexValue3']
As you may imagine, this is a pain when you don't know all the other values. In that case, use slice(None) (the explicit form of ':') for the levels you want to leave unconstrained, wrap the row key in a tuple, and give the column indexer explicitly:
df.loc[(slice(None), 'value1', 'value2', slice(None)), :]
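As a rough illustration of leaving one level unconstrained, here is a sketch on a hypothetical two-level frame (the 'th' rows and the level name Pos are invented for the example, not taken from the question). It shows both slice(None) and pd.IndexSlice, which permits the ':' shorthand inside the row key.

```python
import pandas as pd

# Hypothetical frame: the 'th' rows and the 'Pos' level name are made up.
index = pd.MultiIndex.from_product(
    [["at", "th"], range(4)], names=["QGram", "Pos"]
)
df = pd.DataFrame({"Dwell": [100, 180, 100, 180, 20, 80, 20, 80]}, index=index)

# slice(None) leaves the QGram level unconstrained; the list filters Pos.
by_slice = df.loc[(slice(None), [1, 3]), "Dwell"]

# pd.IndexSlice allows the ':' shorthand inside the row key.
idx = pd.IndexSlice
by_indexslice = df.loc[idx[:, [1, 3]], "Dwell"]

print(by_slice)
```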
Hope this helps!
Upvotes: 6
Reputation: 1945
In general, MultiIndex keys take the form of tuples. For example:
In [6]: df.loc[('at', 1),'Dwell']
Out[6]: 180
In your case, you would have to pass a list of tuples. For example, the following works as you would expect:
In [7]: df.loc[ [('at', 1),('at', 3),('at', 5)], 'Dwell']
Out[7]:
          Dwell
QGram
at     1    180
at     3    180
at     5     80
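A list of tuples is also useful when the wanted rows do not form a neat "one outer value, several inner values" pattern. The sketch below assumes a hypothetical frame with a second QGram value ('th') and a named inner level (Pos), neither of which appears in the question.

```python
import pandas as pd

# Hypothetical frame; the 'th' rows are invented for illustration.
index = pd.MultiIndex.from_product(
    [["at", "th"], range(3)], names=["QGram", "Pos"]
)
df = pd.DataFrame({"Dwell": [100, 180, 100, 20, 80, 20]}, index=index)

# A single tuple addresses one row and yields a scalar.
one = df.loc[("at", 1), "Dwell"]

# A list of tuples addresses arbitrary rows and yields a Series.
keys = [("at", 1), ("th", 0), ("th", 2)]
several = df.loc[keys, "Dwell"]
print(several)
```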
Upvotes: 4
Reputation: 52246
If you are on pandas 0.14 or later, you can simply pass a tuple to .loc, as below:
df.loc[('at', [1,3,4]), 'Dwell']
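Because the question also wants to write results back, here is a sketch of the full round trip with the same tuple indexer. It assumes a frame rebuilt from the question's Dwell column, and the "analysis" step (doubling the values) is a placeholder, not anything from the original post.

```python
import pandas as pd

# Rebuild the Dwell column from the question's sample.
index = pd.MultiIndex.from_product([["at"], range(11)], names=["QGram", None])
df = pd.DataFrame(
    {"Dwell": [100, 180, 100, 180, 20, 80, 20, 80, 20, 80, 120]},
    index=index,
)

# One outer label, several non-sequential inner labels.
rows = ("at", [1, 3, 5])

# Read the subset as a Series.
subset = df.loc[rows, "Dwell"]

# Placeholder "analysis": double each value (purely illustrative).
result = subset.to_numpy() * 2

# Write the results back with the same indexer, in a single .loc call.
df.loc[rows, "Dwell"] = result
print(df.loc[rows, "Dwell"])
```

Using one .loc call for both the read and the write avoids chained selectors, which is the constraint the question mentions.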
Upvotes: 80