Reputation: 313
I have a problem that sounds easy, however, I could not find a solution. I would like to shift values according to the first year of the release. I mean the first column represents the years of the release and the columns are years when the device is broken (values are numbers of broken devices).
See example:
# TESTING of the shifting
df = pd.DataFrame({'Delivery Year' : [1976,1977,1978,1979], "Freq" : [120,100,80,60],
"1976" : [10,float('nan'),float('nan'),float('nan')],
"1977" : [5,3,float('nan'),float('nan')],
"1978" : [10,float('nan'),8,float('nan')],
"1979" : [13,10,5,14]
})
df
Desired Output:
# DESIRED
df = pd.DataFrame({'Delivery Year' : [1976,1977,1978,1979], "Freq" : [120,100,80,60],
"1. Year" : [10,3,8,14],
"2. Year" : [5,float('nan'),5,float('nan')],
"3. Year" : [10,10,float('nan'),float('nan')],
"4. Year" : [13,float('nan'),float('nan'),float('nan')]
})
df
Upvotes: 2
Views: 174
Reputation: 862861
Use custom function with compare columnsnames without Delivery Year, Freq
with Delivery Year
and shifting by this value:
def f(x):
shifted = np.argmin((x.index.astype(int)< x.name[0]))
return x.shift(-shifted)
df = df.set_index(['Delivery Year', 'Freq']).apply(f, axis=1)
df.columns = [f'{i + 1}.Year' for i in range(len(df.columns))]
df = df.reset_index()
print (df)
Delivery Year Freq 1.Year 2.Year 3.Year 4.Year
0 1976 120 10.0 5.0 10.0 13.0
1 1977 100 3.0 NaN 10.0 NaN
2 1978 80 8.0 5.0 NaN NaN
3 1979 60 14.0 NaN NaN NaN
Upvotes: 1