Reputation: 319
I am writing a Qt-based application that shows tabular data. The app uses pandas DataFrames to store the information. The user should be able to filter the DataFrames, e.g.:
df = pandas.DataFrame({
'elevation': [10, 20, 15, 12, 100, 150, 200, 200],
'name': ['tree', 'tree', 'house', 'tree', 'house']
})
df[(df['elevation'] > 10) & (df['elevation'] < 200)]
df[(df['elevation'] > 10) & (df['elevation'] < 200) & (df['name'] == 'tree')]
How could I construct such filter functions from a text input? I tried to use SymPy to convert the function from the text input and to lambdify it later.
from sympy import symbols, sympify, lambdify
x = symbols("x")
expr = sympify("(x > 10) & (x < 200)")
f = lambdify(x, expr, "numpy")
f(df)
If I pass the DataFrame as input, I get the error "The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()". If I use df.any(), I get True or False back, but not an index series. For simple expressions such as x > 10 it works as intended. Any suggestions?
Upvotes: 0
Views: 544
Reputation: 353429
In this situation you might be able to leverage query:
>>> df.query("(elevation > 10) & (elevation < 200)")
   elevation   name
1         20   tree
2         15  house
3         12   tree
4        100  house
5        150   tree
>>> df.query("(elevation > 10) & (elevation < 200) & (name == 'tree')")
   elevation  name
1         20  tree
3         12  tree
5        150  tree
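For a GUI, the text from the input field can be passed straight to query, with errors turned into something the interface can display. The following is only a sketch: the helper name apply_text_filter and the sample data are made up for illustration, and the broad except clause is a simplification.

```python
import pandas as pd


def apply_text_filter(df, text):
    """Return the rows of df matching the query string (hypothetical helper).

    A blank string means "no filter", so df is returned unchanged.
    """
    text = text.strip()
    if not text:
        return df
    try:
        return df.query(text)
    except Exception as exc:
        # Syntax errors, unknown column names, etc. all end up here so the
        # caller (e.g. the Qt widget) has a single error type to report.
        raise ValueError(f"invalid filter {text!r}: {exc}") from exc


# Illustrative data, not the data from the question
df = pd.DataFrame({
    "elevation": [10, 20, 15, 12],
    "name": ["tree", "tree", "house", "tree"],
})
result = apply_text_filter(df, "(elevation > 10) & (name == 'tree')")
```

Returning the frame unchanged for an empty string lets the same code path handle a cleared filter box.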
query can't handle everything, but it can handle stuff this simple. If you need something more sophisticated, you could use exec or eval to construct functions on the fly; there are obvious security hazards there, but you'd have the same issues using sympy (it uses eval too).
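If what you want back is a boolean Series rather than the filtered frame, DataFrame.eval (the pandas method, not Python's built-in eval) accepts the same expression syntax as query. A small sketch with made-up sample data:

```python
import pandas as pd

# Illustrative data, not the data from the question
df = pd.DataFrame({
    "elevation": [10, 20, 15, 200],
    "name": ["tree", "tree", "house", "tree"],
})

# A boolean expression evaluates to one True/False value per row
mask = df.eval("(elevation > 10) & (elevation < 200)")

# The mask can be used to index the frame or to pull out matching labels
matching_index = df.index[mask]
filtered = df[mask]
```

The mask can then be combined with other masks or handed to a view or proxy model that hides the non-matching rows.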
Alternatively, you could simply implement your own parser for what you need.
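One possible shape for such a parser, assuming the filter language only needs "column, comparison operator, constant" terms joined by & and |, is to let Python's ast module do the parsing and then walk the tree with a whitelist, so that nothing is ever executed. The names build_mask and _walk are hypothetical, and the set of accepted operators is an arbitrary choice for this sketch.

```python
import ast
import operator

import pandas as pd

# Comparison operators this sketch accepts
_COMPARE = {
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
}


def build_mask(df, text):
    """Translate a restricted filter expression into a boolean Series."""
    tree = ast.parse(text, mode="eval")  # raises SyntaxError on malformed input
    return _walk(tree.body, df)


def _walk(node, df):
    # (a) & (b) or (a) | (b): combine the masks of both sides
    if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.BitAnd, ast.BitOr)):
        left, right = _walk(node.left, df), _walk(node.right, df)
        return left & right if isinstance(node.op, ast.BitAnd) else left | right
    # column <op> constant, e.g. elevation > 10 or name == 'tree'
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and isinstance(node.left, ast.Name)
            and isinstance(node.comparators[0], ast.Constant)
            and type(node.ops[0]) in _COMPARE):
        column = node.left.id
        if column not in df.columns:
            raise ValueError(f"unknown column: {column}")
        return _COMPARE[type(node.ops[0])](df[column], node.comparators[0].value)
    # Everything else (function calls, attribute access, ...) is rejected
    raise ValueError(f"unsupported expression: {ast.dump(node)}")


# Illustrative data, not the data from the question
df = pd.DataFrame({
    "elevation": [10, 20, 15, 200],
    "name": ["tree", "tree", "house", "tree"],
})
mask = build_mask(df, "(elevation > 10) & (elevation < 200) & (name == 'tree')")
```

Because only whitelisted node types are translated, input such as a function call is rejected with a ValueError instead of being run. As written, the sketch does not handle negative literals, chained comparisons, or column names that are not valid Python identifiers.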
Upvotes: 2