Bikram Baruah
Bikram Baruah

Reputation: 91

Transposing column values into rows and grouping them based on value of another column

I have a dataframe that looks like this:

Value     group_id
23        1
28        1
32        1 
....
35        12
23        12
42        12

There are 6 unique group_id in total, each with few hundred elements. I want to convert this into a dataframe that has the value for a single 'group_id' in one single row. So I want my dataframe to look like this:

value_1 value_2 value_3 group_id
23       28      32      1
....
35       23      42      12

I tried using pandas.melt and pandas.groupby but failed to get any results.

Upvotes: 2

Views: 73

Answers (2)

asongtoruin
asongtoruin

Reputation: 10359

Use groupby with cumcount to fill in your 'value_' records, and then use .pivot_table on the result:

df['val_grp'] = 'value_' + (df.groupby('group_id').cumcount() + 1).astype(str)

pvt = df.pivot_table(index='group_id', columns='val_grp', values='Value')

print(pvt)

Prints:

val_grp   value_1  value_2  value_3
group_id                           
1              23       28       32
12             35       23       42

As mentioned in the comments by andrew_reece, this can be done in a single line by doing:

pvt = df.assign(idx=(df.groupby("group_id").cumcount()+1).astype(str).str.replace("^(.)", "value_\\1", regex=True)).pivot(index="group_id", columns="idx", values="Value")

However, it should be noted that this requires Pandas 23.0 or higher, as this is when the regex parameter for str.replace was introduced.

Upvotes: 2

piRSquared
piRSquared

Reputation: 294338

You are missing the the position of the 'Value' within each group. We can create it with groupby.cumcount

df.set_index(
    ['group_id', df.groupby('group_id').cumcount() + 1]
).Value.unstack().add_prefix('Value_').reset_index()

   group_id  Value_1  Value_2  Value_3
0         1       23       28       32
1        12       35       23       42

Upvotes: 3

Related Questions