Reputation: 139
I have a dataframe with a column full of expressions: pd.Series(["A + B", "B + C", ...]).
It's really long, so I would like to find a vectorized way to eval every expression with a local dictionary {"A": 5, "B": -2.1, ... }.
I have tried
pd.eval(df["expressions"], local_dict=dict_all_values)
but it does not seem to work: it returns only 101 values, while there are 21656 expressions.
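For reference, a minimal setup along these lines (sizes and values are illustrative, not my real data):
import pandas as pd

# Toy version of the frame: many expression strings and a dict of variable values
df = pd.DataFrame({"expressions": ["A + B", "B + C", "A + C"] * 8000})
dict_all_values = {"A": 5, "B": -2.1, "C": 10}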
Upvotes: 0
Views: 167
Reputation: 25383
That is not vectorized (I suspect that is not possible in the context of this question), but perhaps it suffices:
# s is the series; d the dictionary
s.map(lambda expr: eval(expr, {}, d))
Or even simpler, as @BeRT2me suggests in a comment below (many thanks):
# s is the series; d the dictionary
s.map(lambda expr: eval(expr, d))
Upvotes: 1
Reputation: 13242
The expr parameter of pd.eval is expected to be a str.
If it is not a string, pd.eval runs pd.io.formats.printing.pprint_thing on it to convert it to a string (see _convert_expression, which calls pprint_thing).
If you run pprint_thing on a pd.Series, you get a str containing the first 100 values, with ... standing in for the remaining values if there are more than 100.
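A quick sketch of that truncation (the 150-element Series here is just an illustration):
import pandas as pd
from pandas.io.formats.printing import pprint_thing

s = pd.Series(["A + B"] * 150)
as_text = pprint_thing(s)   # this is the string that pd.eval ends up parsing
print("..." in as_text)     # True: values past the first 100 are collapsed into "..."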
One method you could try is to properly convert the series to a string yourself:
df['out'] = pd.eval(
    ",".join(df["expressions"]),
    local_dict=dict_all_values,
)
That being said, in all of my testing plain eval was orders of magnitude faster than anything involving pd.eval:
df['out'] = df["expressions"].apply(eval, args=(dict_all_values,))
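A rough way to check this on your own data (assuming df and dict_all_values as above; numbers will vary by machine):
import timeit

t_plain = timeit.timeit(
    lambda: df["expressions"].apply(eval, args=(dict_all_values,)), number=3)
t_pdeval = timeit.timeit(
    lambda: df["expressions"].map(lambda e: pd.eval(e, local_dict=dict_all_values)),
    number=3)
print(f"plain eval: {t_plain:.3f}s, pd.eval: {t_pdeval:.3f}s")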
Upvotes: 2
Reputation: 261924
You can map pandas.eval, passing your dictionary as local_dict:
df = pd.DataFrame({'expressions': ['A + B', 'B + C']})
dict_all_values = {'A': 5, 'B': -2.1, 'C': 10}
df['out'] = df['expressions'].map(lambda x: pd.eval(x, local_dict=dict_all_values))
Note that your approach seems to work with pandas 2.2.2, but only up to about 100 values; this might be a bug:
df['out'] = pd.eval(df['expressions'], local_dict=dict_all_values)
Thus a compromise could be to apply the transform per group of 99 values:
df['out'] = (df.groupby(np.arange(len(df)) // 99)['expressions']
               .transform(lambda x: pd.eval(x, local_dict=dict_all_values))
             )
In my hands this is about 3 times faster than a simple map.
Output:
  expressions  out
0       A + B  2.9
1       B + C  7.9
Upvotes: 0