Jake Walther

Reputation: 89

Iterate over a specific column up to a certain value in pandas

A = pd.DataFrame({"type":['a','b','c', 'd','e'], "cost basis":[50, 40, 30, 20, 10], "value":[5, 25, 40, 10, 20]})

I want to iterate over the "value" column in descending order, up to a certain value or running sum, say 50. If the next number would exceed that limit, the iteration should stop there.

Upvotes: 0

Views: 611

Answers (2)

Tarik

Reputation: 11209

You can achieve this with two functions, cumsum and argmax:

import numpy as np
import pandas as pd
A = pd.DataFrame({"type":['a','b','c', 'd','e'], "cost basis":[50, 40, 30, 20, 10], "value":[5, 25, 40, 10, 20]})

# Cumulative sum of the "value" column
acumsum = np.cumsum(A.value.values)

# Position of the first element where the cumulative sum exceeds 50.
# Note: np.argmax returns 0 if the sum never exceeds 50.
idx = np.argmax(acumsum > 50)

print(idx)
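
To go from that position to the rows that stay within the limit, one option is to slice with iloc. This is a minimal sketch, not part of the original answer: the names limit, exceeds and within are illustrative, and the 50 is taken from the question. It also guards against the case where the cumulative sum never crosses the limit, since np.argmax returns 0 for an all-False array.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame({"type": ['a', 'b', 'c', 'd', 'e'],
                  "cost basis": [50, 40, 30, 20, 10],
                  "value": [5, 25, 40, 10, 20]})

limit = 50  # assumed threshold, taken from the question

# Running total of the "value" column and where it exceeds the limit
acumsum = np.cumsum(A["value"].to_numpy())
exceeds = acumsum > limit

# If the limit is never exceeded, keep every row instead of trusting argmax's 0
idx = int(np.argmax(exceeds)) if exceeds.any() else len(A)

# Rows before the first position where the running total exceeds the limit
within = A.iloc[:idx]
print(within)
```

Here within holds the rows whose running total is still at or below the limit, in the original row order.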

Upvotes: 0

Anurag Dabas

Reputation: 24314

I'm not sure this is what you want, but if I understand correctly:

Try via cumsum():

out=A.loc[A['value'].cumsum().le(50)]

OR

If you want descending order, use sort_values() + cumsum(), applying the mask to the sorted frame so that the mask and the rows line up:

out=A.sort_values('value',ascending=False).loc[lambda d: d['value'].cumsum().le(50)]
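
If an explicit loop is closer to what the question means by "iterate", the same idea can be sketched as below. This is only an illustration under my reading of the question (stop as soon as adding the next value would push the running sum past 50); the names limit, total and picked are hypothetical.

```python
import pandas as pd

A = pd.DataFrame({"type": ['a', 'b', 'c', 'd', 'e'],
                  "cost basis": [50, 40, 30, 20, 10],
                  "value": [5, 25, 40, 10, 20]})

limit = 50   # assumed threshold, taken from the question
total = 0    # running sum of the values accepted so far
picked = []  # index labels of the accepted rows

# Walk the "value" column from largest to smallest
for label, v in A["value"].sort_values(ascending=False).items():
    if total + v > limit:
        break  # the next value would exceed the limit, so stop here
    total += v
    picked.append(label)

out = A.loc[picked]
print(out)
```

For non-negative values this selects the same rows as the cumsum().le(50) mask on the sorted frame; the loop is slower on large frames but makes the stopping rule explicit.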

Upvotes: 1
