Reputation: 7458
I have the following df:
type id date code
exact 9720 2017-10-01 515
exact 9720 2017-10-01 515
fuzzy 8242 2017-11-01 122
fuzzy 8242 2017-11-01 122
I was trying
import numpy as np
import pandas as pd

exact_rows = df['type'] != 'fuzzy'
grouped = df.loc[~exact_rows].groupby('id').apply(
    lambda g: g.sort_values('date', ascending=True))
a = np.where(grouped['code'].transform('nunique') == 1, 20, 0)
but I got an error,
ValueError: transforms cannot produce aggregated results
I am wondering how to resolve the issue.
Upvotes: 6
Views: 6028
Reputation: 59274
IIUC, you have to use transform on a groupby object, so just regroup on the existing index:
grouped.groupby(grouped.index)['code'].transform('nunique')
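For example, plugging that regrouped transform back into the np.where call from the question (a minimal sketch, assuming grouped is the apply result built there):
# 'grouped' here is the DataFrame returned by groupby('id').apply(...)
a = np.where(grouped.groupby(grouped.index)['code'].transform('nunique') == 1, 20, 0)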
Upvotes: 4
Reputation: 863301
Problem is that groupby.apply returns a DataFrame, not a DataFrameGroupBy object:
grouped = df.loc[~exact_rows].groupby('id').apply(
    lambda g: g.sort_values('date', ascending=True))
print (grouped)
         type    id       date  code
id
8242 2  fuzzy  8242 2017-11-01   122
     3  fuzzy  8242 2017-11-01   122
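You can confirm the type directly (a quick check, not part of the original answer):
print (type(grouped))
# <class 'pandas.core.frame.DataFrame'>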
So the solution for sorting values per group is to use DataFrame.sort_values by both columns before groupby('id'):
exact_rows = df['type'] != 'fuzzy'
grouped = df.loc[~exact_rows].sort_values(['id','date'], ascending=True).groupby('id')
a = np.where(grouped['code'].transform('nunique') == 1, 20, 0)
print (a)
[20 20]
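For reference, a self-contained sketch that reproduces this on the sample data from the question (the DataFrame construction below is assumed from the values shown there):
import numpy as np
import pandas as pd

# sample data from the question
df = pd.DataFrame({
    'type': ['exact', 'exact', 'fuzzy', 'fuzzy'],
    'id': [9720, 9720, 8242, 8242],
    'date': ['2017-10-01', '2017-10-01', '2017-11-01', '2017-11-01'],
    'code': [515, 515, 122, 122],
})

# keep the fuzzy rows, sort by id and date, then group by id
exact_rows = df['type'] != 'fuzzy'
grouped = df.loc[~exact_rows].sort_values(['id','date'], ascending=True).groupby('id')

# 20 where all codes in a group are identical, 0 otherwise
a = np.where(grouped['code'].transform('nunique') == 1, 20, 0)
print (a)
# [20 20]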
Upvotes: 3