Matthi9000

Reputation: 1237

Add number to grouped pandas values according to their occurrence in the group

I would like to number repeated rooms that share the same ID. My dataframe has two columns ('ID' and 'Room'), and I want to append a number to each Room value according to its order of occurrence in the 'Room' column within a single ID. The original df and the desired df are shown below.

Example: ID 34 has 3 bedrooms, so I want the first to become Bedroom_1, the second Bedroom_2 and the third Bedroom_3.

original df:

ID     Room
34     Livingroom
34     Bedroom
34     Kitchen
34     Bedroom
34     Bedroom
34     Storage
50     Kitchen
50     Kitchen
89     Livingroom
89     Bedroom
89     Bedroom
98     Livingroom

Desired df:

ID     Room
34     Livingroom_1
34     Bedroom_1
34     Kitchen_1
34     Bedroom_2
34     Bedroom_3
34     Storage_1
50     Kitchen_1
50     Kitchen_2
89     Livingroom_1
89     Bedroom_1
89     Bedroom_2
98     Livingroom_1

Tried code:

import pandas as pd
import numpy as np

data = pd.DataFrame({"ID": [34,34,34,34,34,34,50,50,89,89,89,98],
                     "Room": ['Livingroom','Bedroom','Kitchen','Bedroom','Bedroom','Storage','Kitchen','Kitchen','Livingroom','Bedroom','Bedroom','Livingroom']})

df = pd.DataFrame(columns=['ID'])
for i in range(df['Room'].nunique()):
    df_new = (df[df['Room'] == ])
    df_new.columns = ['ID', 'Room' + str(i)]
    df_result = df_result.merge(df_new, on='ID', how='outer')

Upvotes: 0

Views: 480

Answers (3)

njriasan

Reputation: 71

Here is some code that does that. It breaks down into three steps.

  1. Perform a groupby followed by apply to run a custom function on each group. This lets you generate new names for each (ID, Room) pair.
  2. Reduce the MultiIndex to the ID level only. Because we group on two columns, the index of the result is a hierarchical index over both. We drop the Room level because the new names replace it.
  3. Explode each entry. For simplicity, the function returns an array per group; the subsequent explode gives each element of that array its own row.

import numpy as np

def f(rooms_col):
    # Number the entries of one (ID, Room) group: name_1, name_2, ...
    arr = np.empty(len(rooms_col), dtype=object)
    for i, name in enumerate(rooms_col):
        arr[i] = name + f"_{i + 1}"
    return arr

# assuming data is the data from above
tmp_df = data.groupby(["ID", "Room"])["Room"].apply(f)
# Drop the old room name
tmp_df.index = tmp_df.index.droplevel(1)
# Explode the results array -> 1 row per entry
df = tmp_df.explode()
print(df)

Here is the output (note that the rows come back sorted by ID and Room rather than in the original row order):

ID
34       Bedroom_1
34       Bedroom_2
34       Bedroom_3
34       Kitchen_1
34    Livingroom_1
34       Storage_1
50       Kitchen_1
50       Kitchen_2
89       Bedroom_1
89       Bedroom_2
89    Livingroom_1
98    Livingroom_1
Name: Room, dtype: object
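
If the original row order matters, one possible variation (a sketch, not part of the answer above) is to use transform instead of apply, since transform aligns its result with the original index. The helper name number_rooms is made up for illustration, and data is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Same data as in the question.
data = pd.DataFrame({
    "ID": [34, 34, 34, 34, 34, 34, 50, 50, 89, 89, 89, 98],
    "Room": ["Livingroom", "Bedroom", "Kitchen", "Bedroom", "Bedroom", "Storage",
             "Kitchen", "Kitchen", "Livingroom", "Bedroom", "Bedroom", "Livingroom"],
})


def number_rooms(rooms_col):
    # rooms_col holds the Room values of one (ID, Room) group.
    # Returning a Series with the group's own index lets transform
    # put every value back on its original row.
    return pd.Series(
        [f"{name}_{i + 1}" for i, name in enumerate(rooms_col)],
        index=rooms_col.index,
    )


numbered = data.assign(
    Room=data.groupby(["ID", "Room"])["Room"].transform(number_rooms)
)
print(numbered)
```

With this approach no droplevel or explode step is needed, because the result is already one value per original row.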

Upvotes: 1

wwnde

Reputation: 26676

Let's try concatenating Room with the cumcount of the df grouped by ID and Room, as follows:

df = df.assign(Room=df.Room + "_" + (df.groupby(['ID', 'Room']).cumcount() + 1).astype(str))

    ID          Room
0   34  Livingroom_1
1   34     Bedroom_1
2   34     Kitchen_1
3   34     Bedroom_2
4   34     Bedroom_3
5   34     Storage_1
6   50     Kitchen_1
7   50     Kitchen_2
8   89  Livingroom_1
9   89     Bedroom_1
10  89     Bedroom_2
11  98  Livingroom_1
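
For reference, here is a self-contained version of the same idea that rebuilds the question's data as df and splits the one-liner into two steps. The intermediate names occurrence and result are only illustrative.

```python
import pandas as pd

# Same data as in the question, stored as df.
df = pd.DataFrame({
    "ID": [34, 34, 34, 34, 34, 34, 50, 50, 89, 89, 89, 98],
    "Room": ["Livingroom", "Bedroom", "Kitchen", "Bedroom", "Bedroom", "Storage",
             "Kitchen", "Kitchen", "Livingroom", "Bedroom", "Bedroom", "Livingroom"],
})

# cumcount numbers the rows of each (ID, Room) group starting at 0,
# in order of appearance, so add 1 to start the numbering at 1.
occurrence = df.groupby(["ID", "Room"]).cumcount() + 1

# Append the counter to the room name; the original row order is kept.
result = df.assign(Room=df["Room"] + "_" + occurrence.astype(str))
print(result)
```

Because cumcount returns a Series aligned with the original index, no sorting or re-indexing is needed afterwards.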

Upvotes: 1

r.burak

Reputation: 544

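import inflect
p = inflect.engine()

# Group by both ID and Room so the count restarts for every ID.
# Note: p.ordinal yields ordinals, so the suffixes are "_1st", "_2nd", ...
df['Room'] += df.groupby(['ID', 'Room']).cumcount().add(1).map(p.ordinal).radd('_')
print(df)

Based on https://stackoverflow.com/a/59951701/3756587.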
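
Whichever way the suffix is formatted, the grouping keys decide where the numbering restarts. Below is a small sketch using plain cumcount (without inflect) on the question's data; the names by_room and by_id_room are illustrative only.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "ID": [34, 34, 34, 34, 34, 34, 50, 50, 89, 89, 89, 98],
    "Room": ["Livingroom", "Bedroom", "Kitchen", "Bedroom", "Bedroom", "Storage",
             "Kitchen", "Kitchen", "Livingroom", "Bedroom", "Bedroom", "Livingroom"],
})

# Grouping by Room alone counts occurrences across the whole column,
# so the counter keeps growing from one ID to the next.
by_room = df.groupby("Room").cumcount() + 1

# Grouping by ID and Room restarts the counter for every ID,
# which is what the desired output in the question asks for.
by_id_room = df.groupby(["ID", "Room"]).cumcount() + 1

print(pd.DataFrame({"ID": df["ID"], "Room": df["Room"],
                    "by_room": by_room, "by_id_room": by_id_room}))
```

For example, the first Livingroom of ID 89 gets 2 when counting by Room alone, but 1 when counting per ID and Room.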

Upvotes: 1
