Reputation: 15207
Suppose I have a DataFrame like so,
import pandas as pd

df = pd.DataFrame([['x', 1, 2], ['x', 1, 3], ['y', 2, 2]],
                  columns=['a', 'b', 'c'])
To select all rows where c == 2 and a == 'x', I could do something like,
df[(df['a'] == 'x') & (df['c'] == 2)]
Or I could iteratively refine by making temporary variables,
df1 = df[df['a'] == 'x']
df2 = df1[df1['c'] == 2]
Is there a way to iteratively refine the rows in a single chained expression, something like this?
(
    df
    .refine(lambda row: row['a'] == 'x')  # this method doesn't exist
    .refine(lambda row: row['c'] == 2)
)
Upvotes: 3
Views: 1680
Reputation: 8516
If the number of terms isn't known until runtime, you can build the query string as below. I am not saying this is a beautiful way to achieve the goal, but I can't see an alternative with pandas 0.14.1:
df = pd.DataFrame([['x', 1, 2], ['x', 1, 3], ['y', 2, 2]],
                  columns=['a', 'b', 'c'])
conditions = {'a': 'x', 'c': 2}

def esc(term):
    # Quote string values so they are valid literals in the query expression.
    if isinstance(term, str):
        return '"%s"' % term
    return str(term)

# Build one "column == value" clause per condition and join them with "and".
q_parts = ["%s == %s" % (k, esc(v)) for k, v in conditions.items()]
q = ' and '.join(q_parts)
print(df.query(q))
Of course, the esc function or the wider snippet would need to be extended to handle other kinds of conditions, such as logical NOT or membership tests (is x in (x, y, z)), and so on.
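For comparison, the same runtime-driven filter can be sketched without building a query string, by AND-ing one boolean mask per condition. This is only a sketch under the same assumption as above (every condition is a simple equality test); the names masks, mask and result are made up for the example.

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame([['x', 1, 2], ['x', 1, 3], ['y', 2, 2]],
                  columns=['a', 'b', 'c'])

# Conditions known only at runtime: column name -> required value.
conditions = {'a': 'x', 'c': 2}

# One boolean Series per condition.
masks = [df[col] == value for col, value in conditions.items()]

# AND all masks together; the all-True starting mask means an empty
# conditions dict selects every row instead of raising an error.
all_true = pd.Series(True, index=df.index)
mask = reduce(operator.and_, masks, all_true)

result = df[mask]
print(result)
```

Since no expression string is built here, string values need no quoting or escaping. Other operators would still have to be added by hand, for example by swapping == for Series.isin or negating a mask with ~.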
Upvotes: 0
Reputation: 25662
While this isn't a solution for now, in pandas version 0.13 you'll be able to do
df.query('a == "x"').query('c == 2')
to achieve what you want.
You'll also be able to do
df['a == "x"']['c == 2']
and
df['a == "x" and c == 2']
What's wrong with
df[(df.a == 'x') & (df.c == 2)]
until 0.13?
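As a sketch of the chained style the question asks for: later pandas releases (0.18.1 onward, as far as I know) accept a callable inside .loc, which gives something close to the hypothetical .refine. Note that the callable receives the whole intermediate DataFrame, not a single row, so this assumes vectorised comparisons rather than row-by-row tests.

```python
import pandas as pd

df = pd.DataFrame([['x', 1, 2], ['x', 1, 3], ['y', 2, 2]],
                  columns=['a', 'b', 'c'])

# Each callable is passed the frame produced by the previous step and
# must return a boolean mask aligned with that frame.
refined = (
    df
    .loc[lambda d: d['a'] == 'x']
    .loc[lambda d: d['c'] == 2]
)
print(refined)
```

Each step filters the output of the step before it, so no temporary variables such as df1 and df2 are needed.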
Upvotes: 1