Reputation: 585
Imagine a dataset like the following:
df = pd.DataFrame({'Contacts 6M':[4,7,20,5,6,0,1,19], 'Contacts 3M':[2,3,9,np.nan,np.nan,0,np.nan,9]})
As you can imagine, the column 'Contacts 6M' counts the contacts in the last 6 months, while 'Contacts 3M' counts the contacts in the last 3 months. So 'Contacts 3M' holds part of the information in the other column.
I impute the missing values with the method forward fill:
df.ffill(axis = 1, inplace=True)
My question: how do I divide the imputed values by 2 and round them (no floats, please) while iterating over the dataset?
Upvotes: 0
Views: 137
Reputation: 415
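This can be done easily as follows:
mask = df['Contacts 3M'].isna()  # run this on the original data, before any forward fill
df.loc[mask, 'Contacts 3M'] = df.loc[mask, 'Contacts 6M'] / 2
df['Contacts 3M'] = df['Contacts 3M'].astype('int')  # truncates toward zero, e.g. 2.5 -> 2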
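As a variation on the same idea, the missing values can be filled directly with half of the 6-month count, with no separate row selection. This is only a minimal sketch on the sample data from the question; it assumes floor division is an acceptable way to get whole numbers.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Contacts 6M': [4, 7, 20, 5, 6, 0, 1, 19],
    'Contacts 3M': [2, 3, 9, np.nan, np.nan, 0, np.nan, 9],
})

# Fill each missing 3-month count with half of the 6-month count
# (floor division), then cast to int now that no NaN remains.
df['Contacts 3M'] = df['Contacts 3M'].fillna(df['Contacts 6M'] // 2).astype(int)
print(df)
```

fillna aligns the replacement Series on the index, so only the rows that were NaN receive the halved value.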
Upvotes: 1
Reputation: 1784
You could keep track of the indices where you had np.nan and use them later for any arithmetic you want:
import pandas as pd
import numpy as np
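df = pd.DataFrame({'Contacts 6M': [4, 7, 20, 5, 6, 0, 1, 19], 'Contacts 3M': [2, 3, 9, np.nan, np.nan, 0, np.nan, 9]})  # no dtype=np.int: that alias was removed from NumPy, and an int column cannot hold NaN
mask = df['Contacts 3M'].isna()  # remember which rows are missing before filling
df = df.ffill(axis=1)  # for some reason, inplace=True was throwing 'NotImplementedError'
df.loc[mask, 'Contacts 3M'] //= 2  # halve only the imputed values; .loc avoids chained assignment
df = df.astype(int)  # the forward fill leaves floats, so cast back to integers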
print(df)
Output
   Contacts 6M  Contacts 3M
0            4            2
1            7            3
2           20            9
3            5            2
4            6            3
5            0            0
6            1            0
7           19            9
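Note that floor division always rounds down, which may or may not be the "rounding" the question has in mind. The sketch below compares three options on the three imputed values from the sample data (5, 6 and 1); the variable names are purely illustrative.

```python
import numpy as np
import pandas as pd

half = pd.Series([5, 6, 1]) / 2  # 2.5, 3.0, 0.5

# Always round down (same effect as // 2 for non-negative counts).
floor_result = np.floor(half).astype(int)

# Series.round sends ties to the nearest even integer.
half_even = half.round().astype(int)

# Add 0.5 and floor, so ties go up.
half_up = np.floor(half + 0.5).astype(int)

print(floor_result.tolist(), half_even.tolist(), half_up.tolist())
```

On these values only the last variant turns 2.5 into 3 and 0.5 into 1, so the choice matters whenever the 6-month count is odd.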
Upvotes: 1