Reputation: 409
I have the following problem :
A dataframe named df1 like this :
Id PVF PM_year Year
0 A6489 75 25 2018
1 A175 56 54 2018
2 A2856 34 65 2018
3 A6489 35 150 2019
4 A175 45 700 2019
5 A2856 55 120 2019
6 A6489 205 100 2020
7 A2856 35 445 2020
I want to create a new column named PM_previous_year which is equal for each combination (ID+Year) to the value of PM_year of the same Id and the previous year...
Example :
For the line indexed 3, the Id is 'A6489' and the year is 2019. So the value of the new column "PM_previous_year" should be the value of the line where Id is the same ('A6489') and year is equal to 2018 (2019-1). in this simple example it corresponds to the line indexed 0 and the value expected for the new column on line indexed 3 is then 25.
Finally, the targeted DataFrame df2 for this short example looks like this :
Id PVF PM_year Year PM_previous_year
0 A6489 75 25 2018 NaN
1 A175 56 54 2018 NaN
2 A2856 34 65 2018 NaN
3 A6489 35 150 2019 25.0
4 A175 45 700 2019 54.0
5 A2856 55 120 2019 65.0
6 A6489 205 100 2020 150.0
7 A2856 35 445 2020 120.0
I havn't found any obvious solution yet. Maybe there is a way in reshaping the df, but I'm not very familiar with that. If somebody have any idea, I would be very grateful. Thks
Upvotes: 0
Views: 19
Reputation: 863611
If possible simplify solution and shifting PM_year
per Id
use:
df['PM_previous_year'] = df.groupby('Id')['PM_year'].shift()
print (df)
Id PVF PM_year Year PM_previous_year
0 A6489 75 25 2018 NaN
1 A175 56 54 2018 NaN
2 A2856 34 65 2018 NaN
3 A6489 35 150 2019 25.0
4 A175 45 700 2019 54.0
5 A2856 55 120 2019 65.0
6 A6489 205 100 2020 150.0
7 A2856 35 445 2020 120.0
Or:
s = df.pivot('Year','Id','PM_year').shift().unstack().rename('PM_previous_year')
df = df.join(s, on=['Id','Year'])
print (df)
Id PVF PM_year Year PM_previous_year
0 A6489 75 25 2018 NaN
1 A175 56 54 2018 NaN
2 A2856 34 65 2018 NaN
3 A6489 35 150 2019 25.0
4 A175 45 700 2019 54.0
5 A2856 55 120 2019 65.0
6 A6489 205 100 2020 150.0
7 A2856 35 445 2020 120.0
Upvotes: 3