konichiwa

Reputation: 551

How to apply multiple custom functions on multiple columns in grouped DataFrame in pandas?

I have a pandas DataFrame that I want to group by p_id. The goal is to get a DataFrame like the one shown under 'Output I'm looking for'. I've tried a few things, but I am struggling to apply two custom aggregation functions, one per column.

How can I solve this problem?

Input

| p_id | x_id | x_name |
|------|------|--------|
| 1    | 4    | Text   |
| 2    | 4    | Text   |
| 2    | 5    | Text2  |
| 2    | 6    | Text3  |
| 3    | 4    | Text   |
| 3    | 7    | Text4  |

Output I'm looking for

| p_id | x_ids   | x_names                |
|------|---------|------------------------|
| 1    | [4]     | Text                   |
| 2    | [4,5,6] | Text\|\|Text2\|\|Text3 |
| 3    | [4,7]   | Text\|\|Text4          |

Upvotes: 1

Views: 87

Answers (1)

Quang Hoang

Reputation: 150735

You can certainly do:

df.groupby('p_id').agg({'x_id': list, 'x_name': '||'.join})

Or, with named aggregation, which also lets you name the output columns:

df.groupby('p_id').agg(x_ids=('x_id', list),
                       x_names=('x_name', '||'.join))
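For completeness, here is a minimal end-to-end sketch of the named-aggregation variant, run against the sample data from the question. The DataFrame construction and the trailing `reset_index()` are my additions (assumptions, not part of the snippets above); `reset_index()` is only there so that `p_id` comes back as a regular column, as in the desired output.

```python
import pandas as pd

# Sample data reconstructed from the question's input table (assumption).
df = pd.DataFrame({
    "p_id": [1, 2, 2, 2, 3, 3],
    "x_id": [4, 4, 5, 6, 4, 7],
    "x_name": ["Text", "Text", "Text2", "Text3", "Text", "Text4"],
})

# One aggregation per column: collect x_id into a list,
# and join x_name with '||' as the separator.
result = (
    df.groupby("p_id")
      .agg(x_ids=("x_id", list), x_names=("x_name", "||".join))
      .reset_index()  # turn the p_id index back into a column
)

print(result)
```

Any callable that takes the group's Series and returns a single value should work in place of `list` or `'||'.join`, so your own custom functions can be dropped in the same way.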

Upvotes: 1
