Reputation: 939
I have a df which looks like this:
df = pd.DataFrame({'Date':['2019-09-23','2019-09-24','2019-09-25','2019-09-26','2019-09-27','2019-09-28','2019-09-29'],
'Sep':[1,10,5,'NaN','NaN','NaN','NaN'],
'Dec':[2,8,4,7,9,1,5]})
I'm trying to create a new column called 'First_Contract':
Desired output:
df2= pd.DataFrame({'Date':['2019-09-23','2019-09-24','2019-09-25','2019-09-26','2019-09-27','2019-09-28','2019-09-29'],
'Sep':[1,10,5,'NaN','NaN','NaN','NaN'],
'Dec':[2,8,4,7,9,1,5],
'First_Contract':[1,8,4,7,9,1,5]})
How can I achieve this?
Upvotes: 0
Views: 688
Reputation: 323306
Let us do it step by step
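# Turn the 'NaN' strings into real missing values (assign back instead of inplace on df.Sep)
df['Sep'] = df['Sep'].replace('NaN', np.nan)
# Start from Dec in every row
df['FC'] = df['Dec']
# Row two positions before the last valid Sep value
ids = df.Sep.last_valid_index() - 2
df.loc[ids, 'FC'] = df.Sep[ids]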
df
Out[126]:
         Date   Sep  Dec  First_Contract   FC
0  2019-09-23   1.0    2               1  1.0
1  2019-09-24  10.0    8               8  8.0
2  2019-09-25   5.0    4               4  4.0
3  2019-09-26   NaN    7               7  7.0
4  2019-09-27   NaN    9               9  9.0
5  2019-09-28   NaN    1               1  1.0
6  2019-09-29   NaN    5               5  5.0
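The steps above overwrite a single row, which is enough for this sample. If Sep should win for every row up to that cutoff, a boolean mask might be a safer variant. This is only a sketch: the rule "use Sep up to two rows before its last valid value, Dec afterwards" is an assumption inferred from the desired output, and it relies on the default integer index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Date': ['2019-09-23', '2019-09-24', '2019-09-25', '2019-09-26',
             '2019-09-27', '2019-09-28', '2019-09-29'],
    'Sep': [1, 10, 5, np.nan, np.nan, np.nan, np.nan],
    'Dec': [2, 8, 4, 7, 9, 1, 5],
})

# Assumed rule: Sep applies up to and including this label, Dec afterwards.
cutoff = df['Sep'].last_valid_index() - 2

# True for the rows that should take their value from Sep.
use_sep = df.index <= cutoff

# Keep Sep where the mask is True, otherwise fall back to Dec.
df['FC'] = df['Sep'].where(use_sep, df['Dec']).astype(int)
print(df)
```

With a longer Sep history the mask would cover several leading rows instead of just the first one.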
Upvotes: 1
Reputation: 16147
You can use np.where to take Sep at the row three positions before the first null index, and fill the rest with Dec:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Date':['2019-09-23','2019-09-24','2019-09-25','2019-09-26','2019-09-27','2019-09-28','2019-09-29'],
'Sep':[1,10,5,np.nan,np.nan,np.nan,np.nan],
'Dec':[2,8,4,7,9,1,5]})
df['First_Contract'] = np.where(df.index==df.Sep.isnull().idxmax()-3, df.Sep, df.Dec)
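To make the one-liner easier to follow, here is the same idea with the intermediate values spelled out. It is a sketch on the sample data only; the variable names `is_null`, `first_null`, and `target` are mine, not part of the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Date': ['2019-09-23', '2019-09-24', '2019-09-25', '2019-09-26',
             '2019-09-27', '2019-09-28', '2019-09-29'],
    'Sep': [1, 10, 5, np.nan, np.nan, np.nan, np.nan],
    'Dec': [2, 8, 4, 7, 9, 1, 5],
})

is_null = df['Sep'].isnull()

# idxmax() on a boolean Series returns the label of the first True value.
# Caveat: if Sep had no nulls at all it would return the first label,
# so check is_null.any() first when that case is possible.
first_null = is_null.idxmax()   # 3 for this data
target = first_null - 3         # 0 for this data

# Sep at the target row, Dec everywhere else.
# The result is a float array because Sep contains NaN.
df['First_Contract'] = np.where(df.index == target, df['Sep'], df['Dec'])
print(df)
```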
Upvotes: 1
Reputation: 3639
You can use pd.concat and last_valid_index to create your column:
df['First_contract'] = pd.concat((
df.Sep.iloc[:df.Sep.last_valid_index() - 1], df.Dec.iloc[df.Sep.last_valid_index() - 1:]
)).astype(int)
Complete code (I replaced the strings 'NaN' with np.nan in the Sep column; this is not needed if they are already NaN):
import pandas as pd
import numpy as np
df = pd.DataFrame({'Date':['2019-09-23','2019-09-24','2019-09-25','2019-09-26','2019-09-27','2019-09-28','2019-09-29'],
'Sep':[1,10,5, 'NaN','NaN','NaN','NaN'],
'Dec':[2,8,4,7,9,1,5]})
df['Sep'] = df['Sep'].replace('NaN', np.nan)
df['First_contract'] = pd.concat((
df.Sep.iloc[:df.Sep.last_valid_index() - 1], df.Dec.iloc[df.Sep.last_valid_index() - 1:]
)).astype(int)
Output:
         Date   Sep  Dec  First_contract
0  2019-09-23   1.0    2               1
1  2019-09-24  10.0    8               8
2  2019-09-25   5.0    4               4
3  2019-09-26   NaN    7               7
4  2019-09-27   NaN    9               9
5  2019-09-28   NaN    1               1
6  2019-09-29   NaN    5               5
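One thing to watch: last_valid_index returns an index label, while iloc expects an integer position, so the code above depends on the default 0..n-1 index. If the frame were indexed by Date instead, something like the sketch below might work. The helper name `first_contract` and its `front`, `back`, and `offset` parameters are hypothetical, and it assumes a unique index.

```python
import numpy as np
import pandas as pd

def first_contract(df, front='Sep', back='Dec', offset=1):
    """Hypothetical helper: front values before the switch row, back values from it on."""
    last_label = df[front].last_valid_index()
    # Convert the label to an integer position so iloc works with any unique index.
    # max(..., 0) keeps the slice from going negative when front ends very early.
    switch = max(df.index.get_loc(last_label) - offset, 0)
    return pd.concat((df[front].iloc[:switch], df[back].iloc[switch:])).astype(int)

df = pd.DataFrame({
    'Date': ['2019-09-23', '2019-09-24', '2019-09-25', '2019-09-26',
             '2019-09-27', '2019-09-28', '2019-09-29'],
    'Sep': [1, 10, 5, np.nan, np.nan, np.nan, np.nan],
    'Dec': [2, 8, 4, 7, 9, 1, 5],
}).set_index('Date')

df['First_contract'] = first_contract(df)
print(df)
```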
Upvotes: 1