Reputation: 1
I have a MultiIndexed pandas DataFrame. I would like to find the maximum value of one of the (numerical, integer) index levels.
That is, the index runs from 1 to 5844. I want to be able to find the scalar value 5844.
I realize that I could just set the scalar variable as I know the values which the index takes, but I'd like to be able to find the maximum value in the case when I don't know it.
Upvotes: 0
Views: 600
Reputation: 4248
A possible solution is to use the .max() method on the index. On a MultiIndex it returns a single tuple with one value per level, which may or may not be what you want. Also of note, .max() compares lexicographically: it finds the highest value in the first level, then the highest value in the next level among the entries that share that first value, and so on down the hierarchy.
>>> import pandas as pd
>>> tuples = [('bar', 1),
...           ('bar', 10),
...           ('baz', 11),
...           ('baz', 14),
...           ('foo', 15),
...           ('foo', 16),
...           ('qux', 17),
...           ('qux', 5844)]
>>> index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
>>> index.max()
('qux', 5844)
In this case, qux was lexicographically highest in the first level, and of the second-level values in the qux group (17 and 5844), 5844 was the highest.
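A minimal sketch of why the tuple returned by .max() is not always the per-level maximum. The data here is hypothetical (not from the question): the largest integer is placed under 'bar' rather than 'qux', and the level is read with get_level_values.

```python
import pandas as pd

# Hypothetical data: the largest integer sits under 'bar', not under 'qux'.
tuples = [('bar', 5844), ('baz', 11), ('qux', 17)]
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])

# Compares whole tuples, so 'qux' wins on the first level.
tuple_max = index.max()

# Looks only at the values of the 'second' level.
level_max = index.get_level_values('second').max()

print(tuple_max)  # ('qux', 17)
print(level_max)  # 5844
```

If the goal is the largest integer regardless of the first level, the per-level lookup is the one to use.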
If you need to fine-tune your approach, you can select a specific level of the MultiIndex in the following way. In this case, since the integers are in the level at position 1, we can use this approach:
>>> index.levels[1].max()
5844
If your integers are in a different level, you simply change the index in the levels bracket.
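One caveat worth checking before relying on .levels: pandas keeps all defined level values on a MultiIndex even after rows are filtered out, so .levels[i].max() can report a value that no longer appears in the data. The sketch below uses a hypothetical four-row frame and compares it with get_level_values and remove_unused_levels.

```python
import pandas as pd

# Hypothetical frame for illustration.
index = pd.MultiIndex.from_tuples(
    [('bar', 1), ('bar', 10), ('qux', 17), ('qux', 5844)],
    names=['first', 'second'],
)
df = pd.DataFrame({'value': range(4)}, index=index)

# Keep the first three rows; the row labelled ('qux', 5844) is gone.
subset = df.iloc[:3]

# .levels still remembers 5844 even though no row uses it.
stale_max = subset.index.levels[1].max()

# get_level_values reflects only the labels actually present.
actual_max = subset.index.get_level_values(1).max()

# Alternatively, drop the unused level values first.
cleaned_max = subset.index.remove_unused_levels().levels[1].max()

print(stale_max, actual_max, cleaned_max)  # 5844 17 17
```

On an unfiltered frame the three results agree, so .levels[1].max() is fine there.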
Upvotes: 1
Reputation: 149185
You could convert the MultiIndex to a frame and then get the max of the column at position i:
scalar = df.index.to_frame().iloc[:, i].max()
But the simplest way is probably to get the max of the appropriate level:
scalar = df.index.levels[i].max()
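As a rough illustration of the to_frame route: the resulting frame has one column per index level, named after the level when the level has a name, so the column can be picked by name or by position. The frame and the level names 'group' and 'step' below are made up for the example.

```python
import pandas as pd

# Hypothetical frame with named index levels.
index = pd.MultiIndex.from_product(
    [['a', 'b'], [1, 2, 5844]], names=['group', 'step']
)
df = pd.DataFrame({'value': range(6)}, index=index)

# One column per index level; index=False gives a plain RangeIndex.
index_frame = df.index.to_frame(index=False)

by_name = index_frame['step'].max()         # select the level by its name
by_position = index_frame.iloc[:, 1].max()  # select the level by position

print(by_name, by_position)  # 5844 5844
```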
Upvotes: 0
Reputation: 534
If you know that the index values run from 1 to 5844 with none missing or repeated, df.shape[0] works.
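A small sketch of when the row count matches the maximum and when it does not, using two hypothetical frames: one whose second level runs 1 to 4 with no gaps, and one with gaps.

```python
import pandas as pd

# Hypothetical frame: second level runs 1..4, each value exactly once.
dense = pd.DataFrame(
    {'value': range(4)},
    index=pd.MultiIndex.from_tuples(
        [('a', 1), ('a', 2), ('b', 3), ('b', 4)], names=['group', 'step']
    ),
)

# Hypothetical frame: second level has gaps.
gappy = pd.DataFrame(
    {'value': range(3)},
    index=pd.MultiIndex.from_tuples(
        [('a', 1), ('a', 10), ('b', 5844)], names=['group', 'step']
    ),
)

dense_rows = dense.shape[0]
dense_max = dense.index.get_level_values('step').max()

gappy_rows = gappy.shape[0]
gappy_max = gappy.index.get_level_values('step').max()

print(dense_rows, dense_max)  # 4 4
print(gappy_rows, gappy_max)  # 3 5844
```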
Upvotes: 0