Evaluate Row/Index Range of Identical Column Values in Pandas Dataframe

Question

Before going any further, there are a lot of answers relating to the question I am about to ask. As far as I can tell, none directly resolve my issue in a coherent manner.

Here is an example of my data:

Index     API     Measurement
    0     5       1000   
    1     5       2000
    2     5       30.1
    3     10      12
    4     12      0
    5     1       3123.1
    6     1       101.1
    7     100     3
    8     23      0
    9     23      123212.1

Here is what I want to do. I want to - as simple and efficiently as possible - figure out the index RANGE of repeated API values.

Desired output:

API Index_RANGE
5   (0,2)
10  3
12  4
1   (5,6)
100 7
23  (8,9)

How do I go about accomplishing this?

ALollz · Accepted Answer

If you require tuples or a single number, I'd write your own function, and pass It to .agg. Form groups finding where 'API' changes:

def Index_RANGE(x):
    minx = x.index.min()
    maxx = x.index.max()
    if minx != maxx:
        return (minx, maxx)
    else:
        return minx

(df.groupby((df.API != df.API.shift(1)).cumsum()).API.agg(['first', Index_RANGE])
    .set_index('first').rename_axis('API'))

Output

    Index_RANGE
API            
5        (0, 2)
10            3
12            4
1        (5, 6)
100           7
23       (8, 9)

Evaluate Row/Index Range of Identical Column Values in Pandas Dataframe

Answers (2)

Output

Related Questions