Reputation: 723
I'm trying to recode data on loan statuses so that every observation is either 'Default' or 'Fully Paid'. Specifically, I'd like to recode every status other than 'Fully Paid' as 'Default'.
Here are my values:
df.loan_status.unique()
array(['Fully Paid', 'Charged Off', 'Default', 'Late (31-120 days)',
       'In Grace Period', 'Late (16-30 days)',
       'Does not meet the credit policy. Status:Fully Paid',
       'Does not meet the credit policy. Status:Charged Off', 'Issued'], dtype=object)
I tried the following code but all observations got recoded as 'Default':
statuses = df['loan_status'].unique()
for status in statuses:
    if status != 'Fully Paid':
        df['loan_status'] = 'Default'
Any advice on how to do this would be greatly appreciated!
Upvotes: 2
Views: 601
Reputation: 294258
I like this approach.
Option 1 (Andras Deak / MaxU)
df.loc[df.loan_status.ne('Fully Paid'), 'loan_status'] = 'Default'
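For context on why the loop in the question recoded everything: `df['loan_status'] = 'Default'` assigns to the whole column with no row filter, so the first status in the loop that isn't 'Fully Paid' overwrites every row. Here is a minimal sketch of the boolean-mask version on a made-up five-row frame (the sample data is an assumption, not the asker's data):

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df.
df = pd.DataFrame({'loan_status': ['Fully Paid', 'Charged Off', 'Issued',
                                   'Fully Paid', 'Late (16-30 days)']})

# Boolean mask: True for every row that is not exactly 'Fully Paid'.
not_paid = df.loan_status.ne('Fully Paid')

# Assign only to the masked rows of that one column.
df.loc[not_paid, 'loan_status'] = 'Default'

result = df.loan_status.tolist()
print(result)
```

Note that the comparison is an exact match, so a value like 'Does not meet the credit policy. Status:Fully Paid' would also become 'Default'. If that isn't what you want, the mask would need adjusting.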
Option 2
pd.Series.where
ls = df.loan_status
df.update(ls.where(ls.eq('Fully Paid'), 'Default'))
Option 3
pd.Series.mask
ls = df.loan_status
df.update(ls.mask(ls.ne('Fully Paid')).fillna('Default'))
Option 4
numpy.where
import numpy as np
ls = df.loan_status.values
paid, dflt = 'Fully Paid', 'Default'
df.loc[:, 'loan_status'] = np.where(ls == paid, paid, dflt)
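If you want to convince yourself the four options are interchangeable, a quick check along these lines should do. The sample statuses and the helper names (`make_df`, `option1` to `option4`) are made up for illustration, and option 4 uses `to_numpy()` in place of `.values`:

```python
import numpy as np
import pandas as pd

# Hypothetical sample statuses, not the asker's data.
SAMPLE = ['Fully Paid', 'Charged Off', 'Default', 'In Grace Period', 'Fully Paid']
PAID, DFLT = 'Fully Paid', 'Default'

def make_df():
    # Fresh frame per option so the in-place edits don't interfere.
    return pd.DataFrame({'loan_status': SAMPLE})

def option1(df):
    df.loc[df.loan_status.ne(PAID), 'loan_status'] = DFLT
    return df

def option2(df):
    ls = df.loan_status
    df.update(ls.where(ls.eq(PAID), DFLT))
    return df

def option3(df):
    ls = df.loan_status
    df.update(ls.mask(ls.ne(PAID)).fillna(DFLT))
    return df

def option4(df):
    ls = df.loan_status.to_numpy()
    df.loc[:, 'loan_status'] = np.where(ls == PAID, PAID, DFLT)
    return df

results = [f(make_df()).loan_status.tolist()
           for f in (option1, option2, option3, option4)]
all_agree = all(r == results[0] for r in results)
print(results[0], all_agree)
```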
Upvotes: 1