Reputation: 353
I have the following df in pandas:
person year A B
AA 1998 5
AA 1999 10
AA 2000 15
XB 2010 100
CY 1980 3
CY 1981 9
CY 1982 36
CY 1983 72
MJ 2017 120
MJ 2018 240
I'd like to iterate over each person in the df, in order by year, and compare their column A values by, for example, percentage difference and place that value in column B. For example, 1999 AA - the value is 200% bigger than the 1998 AA column A value. The 2000 AA value would be 150% and so on.
The first year that a person appears the column B value would be zero - for example, AA 1998 - the column B value would be zero as there is no previous year to compare it to.
I am wondering how to initiate an iteration to specifically look at the person, and then the year, and compare that year's column A value to the prior year's value. This would repeat until reaching a new person.
Upvotes: 1
Views: 52
Reputation: 2579
dfshift = df.groupby('person')['A'].transform(lambda x: x.shift())
df['B'] = (df['A']/dfshift)*100
df['B'].fillna(0, inplace = True)
person year A B
0 AA 1998 5 0.0
1 AA 1999 10 200.0
2 AA 2000 15 150.0
3 XB 2010 100 0.0
4 CY 1980 3 0.0
5 CY 1981 9 300.0
6 CY 1982 36 400.0
7 CY 1983 72 200.0
8 MJ 2017 120 0.0
9 MJ 2018 240 200.0
This should do what you mean. Just do a minor transform on A to get the values you want to compare, then do the operation you want to do. Be warned though that your year values will have to be correctly sorted for this to work.
Upvotes: 3