Reputation: 559

How to transform the result of a Pandas `GROUPBY` function to the original dataframe

Suppose I have a Pandas DataFrame with 6 columns and a custom function that counts the distinct elements in 2 or 3 of those columns and returns a boolean. When I create a groupby object from the original dataframe and apply the custom function with `df.groupby('col1').apply(myfunc)`, the result is a Series whose length equals the number of categories in `col1`. How do I expand this output to match the length of the original dataframe? I tried `transform`, but could not get it to work with the custom function `myfunc`.

EDIT:

Here is some example code:

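import pandas as pd

A = pd.DataFrame({'X': ['a', 'b', 'c', 'a', 'c'],
                  'Y': ['at', 'bt', 'ct', 'at', 'ct'],
                  'Z': ['q', 'q', 'r', 'r', 's']})
print(A)

def myfunc(df):
    # True when the group has at least 2 distinct Z values and fewer than 2 distinct Y values
    return (df['Z'].nunique() >= 2) and (df['Y'].nunique() < 2)

print(A.groupby('X').apply(myfunc))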

I would like to expand this output into a new column `Result`, so that each row gets the value computed for its group. For example, every row where column `X` is `a` should have `Result` equal to True.

Upvotes: 2

Answers (2)

Zealseeker

Reputation: 823

My solution uses a loop, so it may not be the best one, but I think it works well.

The core idea is to iterate over the sub-dataframes (`gdf`) with `for i, gdf in gp`, add the result column (`c` in my example) to each sub-dataframe, and finally concatenate all the sub-dataframes into one.

Here is an example:

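import pandas as pd

df = pd.DataFrame({'a': [1, 2, 1, 2], 'b': ['a', 'b', 'c', 'd']})
gp = df.groupby('a')   # group the rows by column 'a'
s = gp['a'].sum()      # one aggregated value per group, indexed by the group key

# build a copy of each sub-dataframe with the group's value in a new column 'c'
adf = []
for i, gdf in gp:
    tdf = gdf.copy()
    tdf['c'] = s.loc[i]
    adf.append(tdf)

result = pd.concat(adf)   # stitch the sub-dataframes back together
print(result)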

from:

    a   b
0   1   a
1   2   b
2   1   c
3   2   d

to:

    a   b   c
0   1   a   2
2   1   c   2
1   2   b   4
3   2   d   4
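
For a built-in aggregation such as the sum used here, the loop can probably be avoided. The following is only a minimal sketch, assuming the same example data and keeping the column name `c` from above. `transform` returns a result aligned with the original index, so the rows stay in their original order rather than being regrouped.

```python
import pandas as pd

# Same example data as in the loop version above.
df = pd.DataFrame({'a': [1, 2, 1, 2], 'b': ['a', 'b', 'c', 'd']})

# transform('sum') computes the per-group sum of column 'a' and
# broadcasts it back to every row of the corresponding group.
df['c'] = df.groupby('a')['a'].transform('sum')

print(df)
```

This only covers functions that work on a single column; the loop remains an option when the per-group value comes from an arbitrary function of the whole sub-dataframe.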

Upvotes: 0

Zito Relova

Reputation: 1041

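The result of `A.groupby('X').apply(myfunc)` is a Series indexed by the unique values of `X`, so you can map it back onto column `X` of the original dataframe: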

A['Result'] = A['X'].map(A.groupby('X').apply(myfunc))

The result would look like this:

    X   Y   Z   Result
0   a   at  q   True
1   b   bt  q   False
2   c   ct  r   True
3   a   at  r   True
4   c   ct  s   True
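
Since the question mentions `transform`, here is a sketch of one way to get the same column with it. It assumes the condition in `myfunc` can be split into per-column pieces: each column is transformed separately with the built-in `'nunique'` aggregation, and the boolean results are then combined. The variable name `g` is just an illustrative choice.

```python
import pandas as pd

A = pd.DataFrame({'X': ['a', 'b', 'c', 'a', 'c'],
                  'Y': ['at', 'bt', 'ct', 'at', 'ct'],
                  'Z': ['q', 'q', 'r', 'r', 's']})

g = A.groupby('X')

# Each transform returns a Series aligned with A's index, holding the
# number of distinct values in that row's group.
z_distinct = g['Z'].transform('nunique')
y_distinct = g['Y'].transform('nunique')

# Same condition as myfunc, evaluated row-wise on the broadcast counts.
A['Result'] = (z_distinct >= 2) & (y_distinct < 2)

print(A)
```

This avoids `apply` entirely, but it only works when the group-level test can be written in terms of per-column aggregates; for a function that cannot be decomposed this way, the `map` approach above is the more general one.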

Upvotes: 1
