Brayan Muñoz

Reputation: 139

How to eval series of expressions in Pandas

I have a dataframe with a column full of expressions, e.g. pd.Series(["A + B", "B + C", ...]). The column is very long, so I would like to find a vectorized way to evaluate every expression against a local dictionary such as {"A": 5, "B": -2.1, ... }.

I have tried

pd.eval(df["expressions"], local_dict=dict_all_values)

but it does not seem to work: it returns 101 values, whereas there are 21656 expressions.

Upvotes: 0

Views: 167

Answers (3)

PaulS

Reputation: 25383

This is not vectorized (I suspect that is not possible in the context of this question), but perhaps it suffices:

# s is the series; d the dictionary
s.map(lambda expr: eval(expr, {}, d))

Or even simpler, as @BeRT2me suggests in a comment below (many thanks):

# s is the series; d the dictionary
s.map(lambda expr: eval(expr, d))
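
The two calls above return the same values but treat the dictionary differently. A minimal sketch, assuming small hypothetical sample data: when the dictionary is the second argument it becomes the globals mapping, and Python's eval inserts a "__builtins__" key into it; when it is the third argument (locals), it is left untouched.

```python
# Hypothetical sample data, not taken from the question
d = {"A": 5, "B": -2.1, "C": 10}
exprs = ["A + B", "B + C"]

# Dictionary passed as locals: the empty globals dict receives
# "__builtins__", so d_locals itself stays unchanged.
d_locals = dict(d)
out_locals = [eval(e, {}, d_locals) for e in exprs]

# Dictionary passed as globals: eval adds "__builtins__" to it.
d_globals = dict(d)
out_globals = [eval(e, d_globals) for e in exprs]

print(out_locals == out_globals)       # same results either way
print("__builtins__" in d_locals)      # False
print("__builtins__" in d_globals)     # True
```

If the dictionary is reused elsewhere (for example serialized or iterated over), the three-argument form or a copy may be the safer choice.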

Upvotes: 1

BeRT2me

Reputation: 13242

The expr parameter of pd.eval is expected to be a str.

If it is not a string, pd.eval converts it to one by running pd.io.formats.printing.pprint_thing on it (call chain: pd.eval --> _convert_expression --> pprint_thing).

If you run pprint_thing on a pd.Series with more than 100 values, you get a str containing the first 100 values followed by a literal ... in place of the rest, so the ... ends up as the 101st value.


One method you could try is to properly convert the series to a string yourself:

df['out'] = pd.eval(
    ",".join(df["expressions"]),
    local_dict=dict_all_values,
)

That being said, in all of my testing plain eval was orders of magnitude faster than anything involving pd.eval:

df['out'] = df["expressions"].apply(eval, args=(dict_all_values,))

Upvotes: 2

mozway

Reputation: 261924

You can map pandas.eval over the column, passing your dictionary as local_dict:

import pandas as pd

df = pd.DataFrame({'expressions': ['A + B', 'B + C']})
dict_all_values = {'A': 5, 'B': -2.1, 'C': 10}

df['out'] = df['expressions'].map(lambda x: pd.eval(x, local_dict=dict_all_values))

Note that your approach seems to work with pandas 2.2.2, but only for up to about 100 values. This might be a bug:

df['out'] = pd.eval(df['expressions'], local_dict=dict_all_values)

Thus a compromise could be to apply the transform per group of 99 values:

import numpy as np

# integer division of the row position assigns rows to consecutive chunks of 99
df['out'] = (df.groupby(np.arange(len(df))//99)['expressions']
               .transform(lambda x: pd.eval(x, local_dict=dict_all_values))
            )

In my tests, this is about 3 times faster than a simple map.

Output:

  expressions  out
0       A + B  2.9
1       B + C  7.9
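
The grouping key in the chunked version is just the row position integer-divided by the chunk size, so every run of 99 consecutive rows shares one label and each call to pd.eval stays below the roughly 100-value limit. A plain-Python sketch of that labelling, using a hypothetical helper chunk_labels and the row count from the question:

```python
from collections import Counter


def chunk_labels(n_rows, size=99):
    # Row i is assigned to chunk i // size,
    # mirroring np.arange(n_rows) // size.
    return [i // size for i in range(n_rows)]


labels = chunk_labels(21656)
sizes = Counter(labels)

n_chunks = len(sizes)           # number of groups pd.eval is called on
largest = max(sizes.values())   # no group exceeds the chunk size
print(n_chunks, largest)
```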

Upvotes: 0
