quant

Reputation: 4492

How to get the min and the max index where a pandas column has the same value

I have the following pandas dataframe

import pandas as pd
foo = pd.DataFrame({'step': [1,2,3,4,5,6,7,8], 'val': [1,1,1,0,0,1,0,1]})

I would like to get the first and last step of each sequence of consecutive 1s in the val column. Explanation: the 1s form three runs, at steps 1 to 3, at step 6 alone, and at step 8 alone.

So the output is the list [1,3,6,6,8,8]

Any ideas how to do that?

Upvotes: 1

Views: 57

Answers (2)

G.G

Reputation: 765

# label each run of 1s: diff() equals 1 exactly where val rises from 0 to 1
foo.assign(col1=foo.val.diff().eq(1).cumsum()).query("val==1")\
    .groupby('col1').step.agg(['first','last'])\
    .stack().tolist()

output: [1, 3, 6, 6, 8, 8]
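
To see why the diff/cumsum labelling works, it may help to print the intermediate values. The following is a minimal sketch, assuming val only ever contains 0 and 1 (so a rise of exactly 1 marks the start of a run); the names run_id, labels and runs are illustrative and not part of the answer above.

```python
import pandas as pd

foo = pd.DataFrame({'step': [1, 2, 3, 4, 5, 6, 7, 8],
                    'val': [1, 1, 1, 0, 0, 1, 0, 1]})

# diff() is +1 wherever val goes 0 -> 1; cumsum() turns those rises into run ids.
# The first row has no predecessor (diff is NaN), so the first run keeps id 0.
run_id = foo['val'].diff().eq(1).cumsum()
labels = run_id.tolist()

# keep only the steps where val is 1, then collect the steps of each run
runs = (foo.loc[foo['val'].eq(1), 'step']
           .groupby(run_id)
           .apply(list)
           .tolist())

print(labels)
print(runs)
```

The rows where val is 0 share a label with the preceding run, which is harmless here because they are filtered out before grouping.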

Upvotes: 0

mozway

Reputation: 261860

IIUC, you can use a groupby aggregation, flatten using numpy and convert to list:

# compute groups of consecutive numbers
group = foo['val'].ne(foo['val'].shift()).cumsum()

out = (foo
 .loc[foo['val'].eq(1), 'step']         # keep step only where val is 1
 .groupby(group).agg(['first', 'last']) # get first and last
 .to_numpy().ravel().tolist()           # reshape
)

output: [1, 3, 6, 6, 8, 8]
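
If the same lookup is needed for other values or columns, the approach above could be wrapped in a small helper. This is only a sketch: the function name run_bounds and its parameters (value, key, col) are hypothetical and chosen for illustration.

```python
import pandas as pd

def run_bounds(df, value=1, key='val', col='step'):
    """Return [first, last, first, last, ...] of `col` for each run where `key` equals `value`."""
    # a new group starts whenever the value differs from the previous row
    group = df[key].ne(df[key].shift()).cumsum()
    return (df.loc[df[key].eq(value), col]      # keep col only where key == value
              .groupby(group)                   # grouper is aligned on the index
              .agg(['first', 'last'])           # first and last step of each run
              .to_numpy().ravel().tolist())     # flatten row by row

foo = pd.DataFrame({'step': [1, 2, 3, 4, 5, 6, 7, 8],
                    'val': [1, 1, 1, 0, 0, 1, 0, 1]})

result = run_bounds(foo)           # runs of 1s
zeros = run_bounds(foo, value=0)   # runs of 0s
print(result, zeros)
```

Because the groups are built from changes in value rather than from rises of 1, the same helper also works for runs of 0s or for columns with more than two distinct values.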

Upvotes: 2
