daiyue

Reputation: 7458

pandas ValueError: transforms cannot produce aggregated results

I have the following df:

type      id      date         code
exact    9720    2017-10-01    515
exact    9720    2017-10-01    515
fuzzy    8242    2017-11-01    122
fuzzy    8242    2017-11-01    122

I was trying

exact_rows = df['type'] != 'fuzzy'
grouped = df.loc[~exact_rows].groupby('id').apply(
            lambda g: g.sort_values('date', ascending=True))

a = np.where(grouped['code'].transform('nunique') == 1, 20, 0)

but I got an error:

ValueError: transforms cannot produce aggregated results

I am wondering how to resolve the issue.

Upvotes: 6

Views: 6028

Answers (2)

rafaelc

Reputation: 59274

IIUC, transform has to be called on a groupby object, and grouped here is a plain DataFrame (the result of apply), so just regroup on its existing index:

grouped.groupby(grouped.index)['code'].transform('nunique')
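
To make the distinction concrete, here is a minimal sketch of how the same 'nunique' call behaves on a plain Series versus a groupby object. The frame is rebuilt from the question's sample data; the names plain_series_failed and per_row are only illustrative, and the exact error text may differ between pandas versions.

```python
import pandas as pd

# Sample frame rebuilt from the question's data.
df = pd.DataFrame({
    'type': ['exact', 'exact', 'fuzzy', 'fuzzy'],
    'id': [9720, 9720, 8242, 8242],
    'date': ['2017-10-01', '2017-10-01', '2017-11-01', '2017-11-01'],
    'code': [515, 515, 122, 122],
})

# On a plain Series, 'nunique' reduces everything to one number,
# so transform refuses it with a ValueError.
try:
    df['code'].transform('nunique')
    plain_series_failed = False
except ValueError:
    plain_series_failed = True

# On a groupby object, the per-group count is broadcast back to every
# row, so the result has the same length and index as the input.
per_row = df.groupby('id')['code'].transform('nunique')

print(plain_series_failed)
print(per_row.tolist())
```

The groupby version keeps one value per original row, which is what np.where needs in order to build a row-wise flag.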

Upvotes: 4

jezrael

Reputation: 863301

The problem is that groupby.apply returns a DataFrame, not a DataFrameGroupBy object:

grouped = df.loc[~exact_rows].groupby('id').apply(
            lambda g: g.sort_values('date', ascending=True))

print (grouped)
         type    id        date  code
id                                   
8242 2  fuzzy  8242  2017-11-01   122
     3  fuzzy  8242  2017-11-01   122

So the solution for sorting values per group is to call DataFrame.sort_values on two columns before groupby('id'):

exact_rows = df['type'] != 'fuzzy'
grouped = df.loc[~exact_rows].sort_values(['id','date'], ascending=True).groupby('id')

a = np.where(grouped['code'].transform('nunique') == 1, 20, 0)
print (a)
[20 20]
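
For anyone who wants to run this end to end, below is a self-contained sketch of the same approach with the imports and the question's sample data included. The names fuzzy_rows, n_codes and score are my own, and wrapping the np.where result in a Series is an optional extra so the flags stay aligned with the original row labels.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question's data.
df = pd.DataFrame({
    'type': ['exact', 'exact', 'fuzzy', 'fuzzy'],
    'id': [9720, 9720, 8242, 8242],
    'date': ['2017-10-01', '2017-10-01', '2017-11-01', '2017-11-01'],
    'code': [515, 515, 122, 122],
})

# Same selection as df.loc[~exact_rows] in the question: keep the fuzzy rows.
fuzzy_rows = df.loc[df['type'] == 'fuzzy']

# Sort first, then group, so `grouped` is still a DataFrameGroupBy.
grouped = fuzzy_rows.sort_values(['id', 'date'], ascending=True).groupby('id')

# Number of distinct codes per id, repeated on every row of that id.
n_codes = grouped['code'].transform('nunique')

# 20 where the id has exactly one distinct code, otherwise 0.
a = np.where(n_codes == 1, 20, 0)

# Optional: keep the original row labels so the flags can be assigned back.
score = pd.Series(a, index=n_codes.index)
print(score)
```

Because score carries the row labels of the fuzzy subset, something like df.loc[score.index, 'score'] = score would write the flags back without relying on row order.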

Upvotes: 3
