Reputation: 21260
data = {'year': ['11:23:19', '11:23:19', '11:24:19', '11:25:19', '11:25:19', '11:23:19', '11:23:19', '11:23:19', '11:23:19', '11:23:19'],
'store_number': ['1944', '1945', '1946', '1948', '1948', '1949', '1947', '1948', '1949', '1947'],
'retailer_name': ['Walmart', 'Walmart', 'CRV', 'CRV', 'CRV', 'Walmart', 'Walmart', 'CRV', 'CRV', 'CRV'],
'amount': [5, 5, 8, 6, 1, 5, 10, 6, 12, 11],
'id': [10, 10, 11, 11, 11, 10, 10, 11, 11, 10]}
stores = pd.DataFrame(data, columns=['retailer_name', 'store_number', 'year', 'amount', 'id'])
stores.set_index(['retailer_name', 'store_number', 'year'], inplace=True)
stores_grouped = stores.groupby(level=[0, 1, 2])
That looks like:
amount id
retailer_name store_number year
Walmart 1944 11:23:19 5 10
1945 11:23:19 5 10
CRV 1946 11:24:19 8 11
1948 11:25:19 6 11
11:25:19 1 11
Walmart 1949 11:23:19 5 10
1947 11:23:19 10 10
CRV 1948 11:23:19 6 11
1949 11:23:19 12 11
1947 11:23:19 11 10
I manage to filter on:
stores_grouped.filter(lambda x: (len(x) == 1))
But when I want to filter on two conditions:
That my group has length one and id column is equals 10. Any idea ho do so ?
Upvotes: 5
Views: 1063
Reputation: 294218
Thinking outside the box, use drop_duplicates
with keep=False
:
df.drop_duplicates(subset=['retailer_name', 'store_number', 'year'], keep=False) \
.query('id == 10')
Upvotes: 1
Reputation: 862441
You can use:
print (stores_grouped.filter(lambda x: (len(x) == 1) & (x.id == 10).all()))
amount id
retailer_name store_number year
Walmart 1944 11:23:19 5 10
1945 11:23:19 5 10
1949 11:23:19 5 10
1947 11:23:19 10 10
CRV 1947 11:23:19 11 10
Upvotes: 3
Reputation: 210832
i'd do it this way:
In [348]: stores_grouped.filter(lambda x: (len(x) == 1)).query('id == 10')
Out[348]:
amount id
retailer_name store_number year
Walmart 1944 11:23:19 5 10
1945 11:23:19 5 10
1949 11:23:19 5 10
1947 11:23:19 10 10
CRV 1947 11:23:19 11 10
Upvotes: 1
Reputation: 393903
Actually as filter
expects a scalar bool
you can just add the condition in the lambda
like a normal if
style statement:
In [180]:
stores_grouped.filter(lambda x: (len(x) == 1 and x['id'] == 10))
Out[180]:
amount id
retailer_name store_number year
Walmart 1944 11:23:19 5 10
1945 11:23:19 5 10
1949 11:23:19 5 10
1947 11:23:19 10 10
CRV 1947 11:23:19 11 10
Upvotes: 3