guyguyguy12345

Reputation: 571

Pass grouping index value as argument to the function applied in `groupby`

How can I pass the grouping index value as an additional argument alongside the group's subdataframe?

This crude example just applies a univariate function:

import numpy as np
import pandas as pd
df = pd.DataFrame(data=np.random.randint(0, 10, size=(3, 3)), index=['a', 'b', 'a'])
t = df.groupby(df.index).apply(lambda x: ''.join(str(x)))

    0   1   2
a   8   6   7
b   6   2   4
a   8   2   4

The function I actually want to apply also takes, as an argument, the index value on which the dataframe was grouped:

def f(g, indx):
    return ''.join(str(g)) + '___' + str(indx)

The output should be:

  0
a '8  6  7  8  2  4___a'
b '6  2  4___b'

I understand that this example is trivial, but the point is to pass the grouping index value as an argument alongside the grouped subdataframe. The solution I see is to iterate over the groupby object, but I am not sure that is a good solution performance-wise.
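For reference, the loop-based approach could look roughly like the sketch below. It assumes fixed data in place of the random frame, and the body of `f` (flattening the group's values into one string) is only a guess at the formatting shown in the desired output:

```python
import pandas as pd

# Fixed data standing in for the random frame above (an assumption)
df = pd.DataFrame([[8, 6, 7], [6, 2, 4], [8, 2, 4]], index=['a', 'b', 'a'])

def f(g, indx):
    # Join the group's values row by row, then append the group key
    flat = ' '.join(str(v) for v in g.to_numpy().ravel())
    return flat + '___' + str(indx)

# Iterating over a groupby object yields (key, subdataframe) pairs
looped = pd.Series({key: f(group, key) for key, group in df.groupby(df.index)})
print(looped)
```

This works, but it builds the result by hand instead of going through `apply`.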

Mathematica has a MapIndexed function that does the job, but without prior grouping. It seems this question was asked before.

Upvotes: 0

Views: 372

Answers (1)

sitting_duck

Reputation: 3720

Inside `apply`, you can get the group's index value via the `.name` attribute of each group. So you can do something like:

df.groupby(df.index).apply(lambda x: ''.join(str(x.values)) + '___' + str(x.name))

The output is not exactly what you want, but I figured I'd get this info to you quickly. I assume you can clean it up from here.

Output (older version of your data):

a    [[8 4 6]\n [6 8 9]]___a
b              [[1 3 2]]___b
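If you want to keep a two-argument function like the `f` in your question, a minimal sketch would be to forward `.name` from the lambda. The fixed data and the flattening inside `f` are my assumptions, chosen to get close to the output you described:

```python
import pandas as pd

# Fixed data standing in for the random frame in the question (an assumption)
df = pd.DataFrame([[8, 6, 7], [6, 2, 4], [8, 2, 4]], index=['a', 'b', 'a'])

def f(g, indx):
    # Join the group's values row by row, then append the group key
    flat = ' '.join(str(v) for v in g.to_numpy().ravel())
    return flat + '___' + str(indx)

# apply() passes each subdataframe with .name set to its group key,
# so the lambda can hand that key to f as the second argument
result = df.groupby(df.index).apply(lambda g: f(g, g.name))
print(result)
```

Since `apply` also calls the Python function once per group, I wouldn't expect a large performance difference from an explicit loop over the groups, though I haven't measured it.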

Upvotes: 1
