pandas get integer indices where multiindex changes

Question

I have a very large DataFrame with a MultiIndex. I need to pass one column to C to run an operation quickly, and for that operation I need to know where the MultiIndex changes value. Because the DataFrame is large, I don't want to iterate over the rows or the index in Python. A small example:

import numpy as np
import pandas as pd
a = np.array([['bar', 'one', 0, 0],
       ['bar', 'two', 1, 2],
       ['bar', 'one', 2, 4],
       ['bar', 'two', 3, 6],
       ['foo', 'one', 4, 8],
       ['foo', 'two', 5, 10],
       ['bar', 'one', 6, 12],
       ['bar', 'two', 7, 14]], dtype=object)
df = pd.DataFrame(a, columns=['ix0', 'ix1', 'cd0', 'cd1'])
df.sort_values(['ix0', 'ix1'], inplace=True)
df.set_index(['ix0', 'ix1'], inplace=True)

The dataframe looks like this:

In [7]: df
Out[7]: 
        cd0 cd1
ix0 ix1        
bar one   0   0
    one   2   4
    one   6  12
    two   1   2
    two   3   6
    two   7  14
foo one   4   8
    two   5  10

Now I want an array or list of the integer positions at which the MultiIndex value changes, i.e., the row position where (bar, one) changes to (bar, two), where (bar, two) changes to (foo, one), and so on.

To be able to build the hierarchical output, it seems that this data must exist in the index. Is there a way to get to it?

For the example above, the output I'm looking for would be [0, 3, 6, 7], i.e., the starting position of each run of identical index values.

Thanks
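
One possible approach, offered as a sketch rather than a definitive answer: a MultiIndex stores each level as an array of integer codes (exposed as `index.codes` in pandas 0.24 and later), so consecutive rows can be compared with vectorized NumPy operations instead of a Python loop. The helper name `group_start_positions` is made up for this illustration, and the sketch assumes the index is already sorted so that equal keys are adjacent, as in the example.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
a = np.array([['bar', 'one', 0, 0],
              ['bar', 'two', 1, 2],
              ['bar', 'one', 2, 4],
              ['bar', 'two', 3, 6],
              ['foo', 'one', 4, 8],
              ['foo', 'two', 5, 10],
              ['bar', 'one', 6, 12],
              ['bar', 'two', 7, 14]], dtype=object)
df = pd.DataFrame(a, columns=['ix0', 'ix1', 'cd0', 'cd1'])
df = df.sort_values(['ix0', 'ix1']).set_index(['ix0', 'ix1'])


def group_start_positions(index):
    """Return the integer positions where a MultiIndex value differs
    from the previous row (hypothetical helper, not a pandas API)."""
    # One integer code array per level, stacked to shape (n_rows, n_levels).
    codes = np.column_stack([np.asarray(c) for c in index.codes])
    # The first row always starts a run; any other row starts one
    # when at least one level code differs from the row before it.
    changed = np.ones(len(codes), dtype=bool)
    changed[1:] = (codes[1:] != codes[:-1]).any(axis=1)
    return np.flatnonzero(changed)


positions = group_start_positions(df.index)
print(positions.tolist())  # [0, 3, 6, 7]
```

The result is a plain NumPy integer array, so it can be handed to C alongside the column data. If the index is not sorted, the same code still reports every position where adjacent rows differ, which may or may not be what is wanted.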
