Reputation: 4036
I have a df:
dfs = """
contract Valindex0 RB Valindex1
2 A00118 51 0 50
3 A00118 42 1 47
4 A00118 44 1 47
"""
df = pd.read_csv(StringIO(dfs.strip()), sep='\s+')
df:
contract Valindex0 RB Valindex1
2 A00118 51 0 50
3 A00118 42 1 47
4 A00118 44 1 47
I want to add a new column to each row df['Valindex'],
this column value is either
df['Valindex0']
or
df['Valindex1']
it depends on the df['RB']:
if df['RB']==0:
df['Valindex'] = df['Valindex0']
elif df['RB']==1:
df['Valindex'] = df['Valindex1']
Now I'm using apply lambda,but it is very slow:
df['Valindex'] = df.apply(
lambda df: df["Valindex" + str(df["RB"])], axis=1)
The output should looks like:
contract Valindex0 RB Valindex1 Valindex
2 A00118 51 0 50 51
3 A00118 42 1 47 47
4 A00118 44 1 47 47
Any faster way?
Upvotes: 3
Views: 1104
Reputation: 249
Running a bunch of methods and timing each method on 1000 rows, it seems Method 3 as suggested by Anurag works best. The timing varies each run, but the ranking doesn't change.
#Method 1 (original) - 0.0168s
df['Valindex'] = df.apply(lambda df: df["Valindex" + str(df["RB"])], axis=1)
#Method 2 - 0.0015s
df['Valindex'] = (df['RB'] == 0 ) * df['Valindex0'] + df['RB'] * df['Valindex1']
#Method 3 - 0.0009s
df["Valindex"] = np.where(df["RB"].eq(0), df["Valindex0"], df["Valindex1"])
#Method 4 - 0.0014s
df["Valindex"] = df["Valindex0"].where(df["RB"].eq(0), df["Valindex1"])
#Method 5 - 0.0022s
df['Valindex'] = df['Valindex0']
df.loc[df.RB==1,'Valindex'] = df['Valindex1']
Upvotes: 2
Reputation: 323276
Let us jus try assign
df['Valindex'] = df['Valindex0']
df.loc[df.RB==1,'Valindex'] = df['Valindex1']
Upvotes: 2
Reputation: 24314
use np.where()
:
df["Valindex"] = np.where(df["RB"].eq(0), df["Valindex0"], df["Valindex1"])
OR
use np.select()
for multiple cases and conditions:
conditions = [df["RB"].eq(0), df["RB"].eq(1)]
labels = [df["Valindex0"], df["Valindex1"]]
df["Valindex"] = np.select(conditions, labels)
output of df
:
contract Valindex0 RB Valindex1 Valindex
2 A00118 51 0 50 51
3 A00118 42 1 47 47
4 A00118 44 1 47 47
Upvotes: 7
Reputation: 20669
You can try Series.where
.
df["Valindex"] = df["Valindex0"].where(df["RB"].eq(0), df["Valindex1"])
contract Valindex0 RB Valindex1 Valindex
2 A00118 51 0 50 51
3 A00118 42 1 47 47
4 A00118 44 1 47 47
Upvotes: 4