Reputation: 1838
I have a pandas DataFrame, df:
import pandas as pd
import numpy as np
import math
df = pd.DataFrame({'A':[1,2,2,4,np.nan],'B':[1,2,3,4,5]})
and a dict, mask:
mask = {1:32,2:64,3:100,4:200}
I want my end result to be a DataFrame like this:
A    B  C
1    1  32
2    2  64
2    3  96
4    4  400
NaN  5  NaN
Right now I am doing this, which seems inefficient:
for idx, row in df.iterrows():
    if not math.isnan(row['A']):
        if row['A'] != 1:
            df.loc[idx, 'C'] = row['B'] * mask[row['A'] - 1]
        else:
            df.loc[idx, 'C'] = row['B'] * mask[row['A']]
Is there an easy way to vectorize this?
Upvotes: 2
Views: 774
Reputation: 294498
This should work:
df['C'] = df.B * (df.A - (df.A != 1)).map(mask)
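The one-liner packs three steps into a single expression, so it may help to see them unrolled. The sketch below follows the same idea on the question's data. The intermediate names shift, keys and factors are introduced here for illustration only, and the explicit astype(int) is an added assumption that makes the boolean-to-integer step visible; the original expression relies on True counting as 1.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 4, np.nan], 'B': [1, 2, 3, 4, 5]})
mask = {1: 32, 2: 64, 3: 100, 4: 200}

# 1 wherever A is not 1 (this includes NaN), 0 where A is exactly 1.
shift = (df.A != 1).astype(int)

# Every key except 1 moves down by one; NaN - 1 is still NaN.
keys = df.A - shift

# Look each key up in the dict; values with no entry (here NaN) become NaN.
factors = keys.map(mask)

df['C'] = df.B * factors
print(df)
```

Because the subtraction and the lookup each run over the whole column at once, no Python-level loop over rows is needed.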
10,000 rows
# Initialize each run with
df = pd.DataFrame({'A':[1,2,2,4,np.nan],'B':[1,2,3,4,5]})
df = pd.concat([df for _ in range(2000)])
100,000 rows
# Initialize each run with
df = pd.DataFrame({'A':[1,2,2,4,np.nan],'B':[1,2,3,4,5]})
df = pd.concat([df for _ in range(20000)])
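For anyone who wants to rerun the comparison, here is a rough timing harness built around the two setups above. It is only a sketch: the helper names make_df, vectorized and with_apply are made up for this example, ignore_index=True is an added assumption to keep the index unique, and no timings are quoted because they depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

mask = {1: 32, 2: 64, 3: 100, 4: 200}


def make_df(repeats):
    # Same 5-row frame as above, stacked `repeats` times.
    base = pd.DataFrame({'A': [1, 2, 2, 4, np.nan], 'B': [1, 2, 3, 4, 5]})
    return pd.concat([base for _ in range(repeats)], ignore_index=True)


def vectorized(df):
    # Column-wise arithmetic plus Series.map.
    return df.B * (df.A - (df.A != 1)).map(mask)


def with_apply(df):
    # Row-wise apply with dict.get, for comparison.
    picked = df.apply(
        lambda r: mask.get(r.A) if r.A == 1 else mask.get(r.A - 1), axis=1
    )
    return picked * df.B


if __name__ == "__main__":
    for repeats in (2000, 20000):  # 10,000 and 100,000 rows
        df = make_df(repeats)
        for fn in (vectorized, with_apply):
            # Best of three single runs, in seconds.
            best = min(timeit.repeat(lambda: fn(df), number=1, repeat=3))
            print(f"{len(df):>7} rows  {fn.__name__:<10}  {best:.4f} s")
```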
Upvotes: 3
Reputation: 215067
Here is an option using apply and the dictionary's get method, which returns None if the key is not in the dictionary:
df['C'] = df.apply(lambda r: mask.get(r.A) if r.A == 1 else mask.get(r.A - 1), axis=1) * df.B
df
#      A  B    C
# 0    1  1   32
# 1    2  2   64
# 2    2  3   96
# 3    4  4  400
# 4  NaN  5  NaN
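To see why get matters for the NaN row, here is a small standalone check using only the dict from the question. The variable names nan and raised are just for this illustration. Plain indexing would raise on the missing key, while get quietly returns None, which is why the last row of C comes out as NaN in the output above.

```python
mask = {1: 32, 2: 64, 3: 100, 4: 200}
nan = float('nan')

# 1.0 == 1 and both hash alike, so a float key still finds the int entry.
print(mask.get(2.0 - 1))   # 32

# NaN - 1 is NaN, which is not a key, so get returns None instead of raising.
print(mask.get(nan - 1))   # None

# Square-bracket lookup on the same key raises KeyError.
try:
    mask[nan - 1]
    raised = False
except KeyError:
    raised = True
print(raised)              # True
```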
Upvotes: 3