Reputation: 2376
I have a dataframe with the following columns:
['NAME','SNAPDATE','RANK']
The dataframe can contains several occurrences of the same NAME at different dates with each time a different rank. I would like to find the product which rank improved (meaning the value declined)
I tried this:
diffs = df.RANK.diff()
df = df.assign(
change=np.where(
diffs > 0, 'increase', np.where(
diffs < 0, 'decline', '------')))
The problem it is treating the dataframe as a whole and I need to find the trend for a single NAME.
Any idea.
Thanks
Upvotes: 0
Views: 338
Reputation: 562
I think you can use sort_values
together with groupby
method:
df.sort_values(by=['NAME', 'SNAPDATE']).groupby(by='NAME').RANK.diff()
E.g.
df = pd.DataFrame({
'NAME': ['Name1', 'Name1', 'Name2', 'Name2', 'Name1', 'Name2'],
'SNAPDATE': ['2018-04-01', '2018-04-02', '2018-04-03', '2018-04-01', '2018-04-03', '2018-04-02'],
'RANK': [10, 11, 12, 10, 9 ,8]
})
df.sort_values(by=['NAME', 'SNAPDATE']).groupby(by='NAME').RANK.diff()
Output:
0 NaN
1 1.0
2 1.0
3 NaN
4 -1.0
5 -1.0
Name: RANK, dtype: float64
Upvotes: 1