Reputation: 401
I have DataFrame named 'concated'. It has columns: 'amount' - with a sums of transactions, 'mcccode_trtype' with a merchant type. I need to count only negative amounts of transactions by merchant code and count the mean of these transactions. And I need to filter merchants with more than 10 transactions.
So, I wrote this code:
res=concated[concated.amount<0].groupby('mcccode_trtype')['amount'].agg(['count', 'mean'])
The problem: How can I filter this by 'count' column in one line? I guess I should use 'filter' + lambda function, but finally failed with syntax.
Please, help.
Sample data:
from numpy import nan
concated=pd.DataFrame(
{0: {'amount': -2245.92,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '0 10:23:26',
'tr_description': 'DESCR2',
'tr_type': 1030},
1: {'amount': -5614.79,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '6 07:08:31',
'tr_description': 'DESCR2',
'tr_type': 1030},
2: {'amount': -1122.96,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '8 07:06:10',
'tr_description': 'DESCR2',
'tr_type': 1030},
3: {'amount': -2245.92,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '11 08:49:03',
'tr_description': 'DESCR2',
'tr_type': 1030},
4: {'amount': -2245.92,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '11 14:12:08',
'tr_description': 'DESCR2',
'tr_type': 1030},
5: {'amount': -2245.92,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '11 14:15:30',
'tr_description': 'DESCR2',
'tr_type': 1030},
6: {'amount': -2245.92,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '13 11:17:34',
'tr_description': 'DESCR2',
'tr_type': 1030},
7: {'amount': -2245.92,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '18 07:39:05',
'tr_description': 'DESCR2',
'tr_type': 1030},
8: {'amount': -449.18,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '19 13:24:35',
'tr_description': 'DESCR2',
'tr_type': 1030},
9: {'amount': -1122.96,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '19 13:25:31',
'tr_description': 'DESCR2',
'tr_type': 1030}}).transpose()
concated
Upvotes: 2
Views: 42
Reputation: 430
You can use .loc
in combination with a lambda function as follows:
res = concated[concated.amount<0].groupby('mcccode_trtype')['amount'].agg(['count', 'mean']).loc[lambda x: x["count"] > 10]
Upvotes: 1
Reputation: 863246
Use DataFrame.query
for filter by column count
:
#if necessary
concated.amount = concated.amount.astype(float)
res=concated[concated.amount<0].groupby('mcccode_trtype')['amount'].agg(['count', 'mean']).query('count == 10')
Upvotes: 0