Chethan

Reputation: 611

Perform True/False operation on a column based on the condition present in another column in pandas

I have a dataframe:

df_in = pd.DataFrame([[1,"A",32,">30"],[2,"B",12,"<10"],[3,"C",45,">=45"]],columns=['id', 'input', 'val', 'cond'])

I want to evaluate each value in the "val" column against the condition in the "cond" column and store the True/False result in an "Output" column.

Expected Output:

df_out = pd.DataFrame([[1,"A",32,">30",True],[2,"B",12,"<10",False],[3,"C",45,">=45",True]],columns=['id', 'input', 'val', 'cond',"Output"])

How can I do this?

Upvotes: 0

Views: 377

Answers (2)

Anurag Dabas

Reputation: 24304

You can try:

df_in['output']=pd.eval(df_in['val'].astype(str)+df_in['cond'])

OR

If you need better performance, use the method below. Also see this thread, but I think eval is safe to use in your case:

df_in['output']=list(map(lambda x:eval(x),(df_in['val'].astype(str)+df_in['cond']).tolist()))

OR

An even faster variant:

from numpy.core import defchararray

df_in['output']=list(map(lambda x:eval(x),defchararray.add(df_in['val'].values.astype(str),df_in['cond'].values)))

output of df_in:

    id  input   val     cond    output
0   1   A       32      >30     True
1   2   B       12      <10     False
2   3   C       45      >=45    True

Time comparison, using %%timeit -n 1000: (image of the timing results not available)
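The approaches above all build a Python expression from the column contents and evaluate it. If the "cond" column might ever hold untrusted text, one possible alternative is to parse the condition and look the comparison up in a table. This is only a minimal sketch: the names OPS, COND_RE and check are made up for illustration, and it assumes every condition is a single comparison symbol followed by a plain number.

```python
import operator
import re

import pandas as pd

# Hypothetical lookup table: comparison symbol -> function from the operator module.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}

# Two-character symbols are listed first so ">=" is not read as ">".
# Assumption: the threshold is a plain integer or decimal number.
COND_RE = re.compile(r"\s*(>=|<=|==|!=|>|<)\s*(-?\d+(?:\.\d+)?)\s*")


def check(val, cond):
    """Compare val against a condition string such as '>30' and return a bool."""
    match = COND_RE.fullmatch(cond)
    if match is None:
        raise ValueError(f"unsupported condition: {cond!r}")
    symbol, threshold = match.groups()
    return bool(OPS[symbol](val, float(threshold)))


df_in = pd.DataFrame(
    [[1, "A", 32, ">30"], [2, "B", 12, "<10"], [3, "C", 45, ">=45"]],
    columns=["id", "input", "val", "cond"],
)
df_in["output"] = [check(v, c) for v, c in zip(df_in["val"], df_in["cond"])]
```

Nothing is executed as code here, so a malformed or unexpected condition raises a ValueError instead of being run.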

Upvotes: 2

Pygirl

Reputation: 13349

Using numexpr:

import numexpr
# evaluate() returns a 0-d NumPy array; .item() unwraps it to a plain Python bool
df_in['output'] = df_in.apply(lambda x: numexpr.evaluate(f"{x['val']}{x['cond']}").item(), axis=1)

   id  input   val     cond    output
0   1   A       32      >30     True
1   2   B       12      <10     False
2   3   C       45      >=45    True

Time Comparison: using %%timeit -n 1000

using apply and numexpr:

865 µs ± 140 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

using pd.eval:

2.5 ms ± 363 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
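Both timings above still evaluate one expression per row. On larger frames it may be worth splitting "cond" into a symbol and a number once and then comparing whole arrays. The following is a rough sketch of that idea, not something benchmarked here. The OPS table and the regular expression are assumptions made for illustration, and rows whose condition does not match the pattern are simply left as False.

```python
import operator

import numpy as np
import pandas as pd

df_in = pd.DataFrame(
    [[1, "A", 32, ">30"], [2, "B", 12, "<10"], [3, "C", 45, ">=45"]],
    columns=["id", "input", "val", "cond"],
)

# Hypothetical lookup table: comparison symbol -> function from the operator module.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}

# Column 0 holds the comparison symbol, column 1 the numeric threshold.
parts = df_in["cond"].str.extract(r"^\s*(>=|<=|==|!=|>|<)\s*(-?\d+(?:\.\d+)?)\s*$")
symbols = parts[0].to_numpy()
thresholds = parts[1].astype(float).to_numpy()
values = df_in["val"].to_numpy()

# One array comparison per distinct symbol instead of one evaluation per row.
result = np.zeros(len(df_in), dtype=bool)
for symbol, compare in OPS.items():
    mask = symbols == symbol
    result[mask] = compare(values[mask], thresholds[mask])

df_in["output"] = result
```

The loop runs once per supported symbol, so its cost does not grow with the number of rows.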

Upvotes: 1
