Reputation: 4928
Here is MRE:
df = pd.DataFrame({
"price":[44,21,33,44,55,66,454, 22],
"store":["A", "A", "B", "A","A", "A", "A", "A"]
})
which gives:
price store
0 44 A
1 21 A
2 33 B
3 44 A
4 55 A
5 66 A
6 454 A
7 22 A
Whenever store = "A" I want to add up the price until another store= "B" is reached. Then again whenever I get store="A" I sum prices and so on...
So in this case I would have 2 cycles of store = "A" and I would get
cycle 1: 44 + 21
cycle 2: 44 + 55 + 66 + 454 + 22.
It would be better to get index of cycle 1 ([0,1] and cycle 2[3,4,5,6,7] but only getting the sum will suffice.
What I've tried:
start_value = "A"
d = {}
lst = []
for ind, row in df.iterrows():
if row["store"] != start_value:
d[ind] = lst
print(d)
lst = []
else:
lst.append(row["price"])
d["last"] = lst
This outputs what I've desired {"2":[44, 21], "last":[44, 33, 66, 454, 22]}
. But any alternative ways or faults in my method?
Upvotes: 0
Views: 48
Reputation: 71580
Try this:
print({k:v for k, v in df.groupby(df['store'].ne(df['store'].shift()).cumsum())['price'].agg(list).to_dict().items() if len(v) > 1})
Output:
{1: [44, 21], 3: [44, 33, 66, 454, 22]}
Upvotes: 0
Reputation: 29742
IIUC, using pandas.Groupby
with cumsum
:
df['sum'] = df.groupby(df['store'].ne(df['store'].shift()).cumsum()).cumsum()
print(df)
Output:
price store sum
0 44 A 44
1 21 A 65
2 33 B 33
3 44 A 44
4 55 A 99
5 66 A 165
6 454 A 619
7 22 A 641
Upvotes: 2