Reputation: 69
I am trying to create a new list using data from a pandas Dataframe. The Dataframe in question has a column of Dates as well as a column for Units Sold as seen below:
Peep = Xsku[['new_date', 'cum_sum']]
Peep.head(15)
Out[159]:
new_date cum_sum
18 2011-01-17 214
1173 2011-01-24 343
2328 2011-01-31 407 #Save Entry in List
3483 2011-02-07 71
4638 2011-02-14 159
5793 2011-02-21 294
6948 2011-02-28 425 #Save Entry in List
8103 2011-03-07 113
9258 2011-03-14 249
10413 2011-03-21 347
11568 2011-03-28 463 #Save Entry in List
12723 2011-04-04 99
13878 2011-04-11 186
15033 2011-04-18 291
16188 2011-04-25 385
I am trying to make a new list, where the list contains the maximum 'cum_sum' before the number is reset (i.e. becomes smaller). For example, in the first four entries above, the cum_sum reaches 407 and then goes back down to 71. I am thus trying to save the number 407 as well as the corresponding 'new_date' (2011-01-31 in this example) and do this for every entry.
My final List will thus have all the maximum 'cum_sum' values before it is reset. For example it will look like as follows: (First Three Expected Values)
MyList
Out[]:
new_date cum_sum
2011-01-31 407
2011-02-28 425
2011-03-28 463
...
I have been trying to do something as a for loop, but continually run into problems:
MyList= [] ##My Empty List
for i in range(len(Peep['new_date'])):
if Peep.iloc[i,1] > Peep.iloc[i + 1,1]:
MyList.append(Peep.iloc[i,1])
Can anyone help me in this regard?
Upvotes: 0
Views: 65
Reputation: 77027
Use .diff
and filter like
In [17]: df[df['cum_sum'].diff(-1).ge(0)]
Out[17]:
new_date cum_sum
2 2011-01-31 407
6 2011-02-28 425
10 2011-03-28 463
Upvotes: 2