Reputation: 101
A loop in Python is taking a lot of time to produce a result. The DataFrame contains around 100k records.
How can the run time be reduced?
df['loan_agr'] = df['loan_agr'].astype(int)
for i in range(len(df)):
    if df.loc[i,'order_mt']== df.loc[i,'enr_mt']:
        df['new_N_Loan'] = 1
        df['exist_N_Loan'] = 0
        df['new_V_Loan'] = df['loan_agr']
        df['exist_V_Loan'] = 0
    else:
        df['new_N_Loan'] = 0
        df['exist_N_Loan'] = 1
        df['new_V_Loan'] = 0
        df['exist_V_Loan'] = df['loan_agr']
Upvotes: 1
Views: 617
Reputation: 15662
You can use loc and set the new values in a vectorized way. This approach is much faster than iterating because the operations are performed on entire columns at once rather than on individual values. Check out this article for more on speed optimization in pandas.
For example:
mask = df['order_mt'] == df['enr_mt']
df.loc[mask, ['new_N_Loan', 'exist_N_Loan', 'exist_V_Loan']] = [1, 0, 0]
df.loc[mask, ['new_V_Loan']] = df['loan_agr']
df.loc[~mask, ['new_N_Loan', 'exist_N_Loan', 'new_V_Loan']] = [0, 1, 0]
df.loc[~mask, ['exist_V_Loan']] = df['loan_agr']
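As an alternative to the loc assignments above, the same four columns can be derived directly from the mask. This is only a sketch of a different technique (numpy.where plus a boolean-to-int cast), not the code from the question; the sample values are made up for illustration and only the column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "order_mt": [1, 2, 3, 4],
    "enr_mt": [1, 2, 30, 40],
    "loan_agr": [100, 200, 300, 400],
})

# True where the loan counts as "new", False where it counts as "existing".
mask = df["order_mt"] == df["enr_mt"]

# The count columns are the mask (or its negation) cast to 0/1.
df["new_N_Loan"] = mask.astype(int)
df["exist_N_Loan"] = (~mask).astype(int)

# The value columns take loan_agr on one side of the mask and 0 on the other.
df["new_V_Loan"] = np.where(mask, df["loan_agr"], 0)
df["exist_V_Loan"] = np.where(mask, 0, df["loan_agr"])

print(df)
```

Because each column is computed in one expression, the target columns do not need to exist beforehand.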
Edit:
If the ~ (bitwise not) operator is not supported in your version of pandas, you can make a new mask for the "else" condition, similar to the first condition.
For example:
mask = df['order_mt'] == df['enr_mt']
else_mask = df['order_mt'] != df['enr_mt']
Then use else_mask for the second set of definitions instead of ~mask.
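Put together, the else_mask variant might look like the sketch below. The sample values are hypothetical, the target columns are pre-created with 0 as an assumption so that the loc assignments only overwrite existing cells, and the value columns are assigned through a single column label rather than a one-element list.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "order_mt": [1, 2, 3, 4],
    "enr_mt": [1, 2, 30, 40],
    "loan_agr": [100, 200, 300, 400],
})

# Assumption: pre-create the target columns so .loc only overwrites values.
for col in ["new_N_Loan", "exist_N_Loan", "new_V_Loan", "exist_V_Loan"]:
    df[col] = 0

mask = df["order_mt"] == df["enr_mt"]
else_mask = df["order_mt"] != df["enr_mt"]

# Rows where the two columns match: a new loan.
df.loc[mask, ["new_N_Loan", "exist_N_Loan", "exist_V_Loan"]] = [1, 0, 0]
df.loc[mask, "new_V_Loan"] = df.loc[mask, "loan_agr"]

# Rows where they differ: an existing loan.
df.loc[else_mask, ["new_N_Loan", "exist_N_Loan", "new_V_Loan"]] = [0, 1, 0]
df.loc[else_mask, "exist_V_Loan"] = df.loc[else_mask, "loan_agr"]

print(df)
```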
Sample:
Input:
   order_mt  enr_mt  new_N_Loan  exist_N_Loan  exist_V_Loan  new_V_Loan  loan_agr
0         1       1        None          None          None        None       100
1         2       2        None          None          None        None       200
2         3      30        None          None          None        None       300
3         4      40        None          None          None        None       400
Output:
   order_mt  enr_mt  new_N_Loan  exist_N_Loan  exist_V_Loan  new_V_Loan  loan_agr
0         1       1           1             0             0         100       100
1         2       2           1             0             0         200       200
2         3      30           0             1           300           0       300
3         4      40           0             1           400           0       400
Upvotes: 5
Reputation: 109
Instead of range(len(...)), you could replace the len call with a fixed value.
Upvotes: 0