LearningPath
LearningPath

Reputation: 93

Iterate over dataframe using itertuples() with a condition on column values

I'm trying to select a range in a dataframe everytime x <= 20 and create a new dataframe with a colum which is the sum of the selected rows (negative value include) of that range and keep the last date as index. Thks for some hints and if I'm on the right way This is the output I want and below my approach which is not working.

Date         x             Date         sum
2019-01-01   100           2019-01-05   343
2019-01-02   120   --->    2019-01-10   804
2019-01-03    80           2019-01-15   650
2019-01-04    48           2019-01-20   428
2019-01-05     5           ...
2019-01-06   110           ...
2019-01-07   420
2019-01-08   140
2019-01-09   126
2019-01-10     8
2019-01-11    50
2019-01-12   160   
2019-01-13   280
2019-01-14   148
2019-01-15    12
2019-01-16   190
2019-01-17   120
2019-01-18    80
2019-01-19    48
2019-01-20   -10
...
...

#######
for date in df.index.to_series().dt.date.unique():
   for row in df.itertuples():
      for i in row:
         if i <= 20:
           new_df = pd.DataFrame(columns=df.keys())
           new_df.index.name = 'Date'
           new_df ['sum'] = df.sum(axis = 0)
         continue:
#

Upvotes: 0

Views: 840

Answers (1)

splash58
splash58

Reputation: 26153

Why not use groupby?

df.groupby(df['x'].shift().le(20).cumsum()) \
  .agg(Date=('Date','last'), sum = ('x','sum')).set_index('Date')

            sum
Date           
2019-01-05  353
2019-01-10  804
2019-01-15  650
2019-01-20  428

Upvotes: 2

Related Questions