Reputation: 1237
I would like to number repeated rooms within the same ID. My dataframe has two columns ('ID' and 'Room'). For each ID, I want to append a number to each Room value according to its order of occurrence in the 'Room' column. The original df and the desired df are shown below.
Example: ID 34 has 3 bedrooms, so I want the first to become Bedroom_1, the second Bedroom_2 and the third Bedroom_3.
original df:
ID Room
34 Livingroom
34 Bedroom
34 Kitchen
34 Bedroom
34 Bedroom
34 Storage
50 Kitchen
50 Kitchen
89 Livingroom
89 Bedroom
89 Bedroom
98 Livingroom
Desired df:
ID Room
34 Livingroom_1
34 Bedroom_1
34 Kitchen_1
34 Bedroom_2
34 Bedroom_3
34 Storage_1
50 Kitchen_1
50 Kitchen_2
89 Livingroom_1
89 Bedroom_1
89 Bedroom_2
98 Livingroom_1
Tried code:
import pandas as pd
import numpy as np
data = pd.DataFrame({"ID": [34,34,34,34,34,34,50,50,89,89,89,98],
                     "Room": ['Livingroom','Bedroom','Kitchen','Bedroom','Bedroom','Storage','Kitchen','Kitchen','Livingroom','Bedroom','Bedroom','Livingroom']})
df = pd.DataFrame(columns=['ID'])
for i in range(df['Room'].nunique()):
    df_new = (df[df['Room'] == ])
    df_new.columns = ['ID', 'Room' + str(i)]
    df_result = df_result.merge(df_new, on='ID', how='outer')
Upvotes: 0
Views: 480
Reputation: 71
Here is some code that can do that for you. I'm basically breaking it down into three steps.
def f(rooms_col):
    # Build one numbered name per occurrence within the group
    arr = np.empty(len(rooms_col), dtype=object)
    for i, name in enumerate(rooms_col):
        arr[i] = name + f"_{i + 1}"
    return arr
# assuming data is the data from above
tmp_df = data.groupby(["ID", "Room"])["Room"].apply(f)
# Drop the old room name
tmp_df.index = tmp_df.index.droplevel(1)
# Explode the results array -> 1 row per entry
df = tmp_df.explode()
print(df)
Here is your output:
ID
34 Bedroom_1
34 Bedroom_2
34 Bedroom_3
34 Kitchen_1
34 Livingroom_1
34 Storage_1
50 Kitchen_1
50 Kitchen_2
89 Bedroom_1
89 Bedroom_2
89 Livingroom_1
98 Livingroom_1
Name: Room, dtype: object
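Note that the output above is a Series sorted by ID and room name, not a dataframe in the original row order. If the original order matters, one possible variant (a sketch only, not part of the answer above) is to pass the same kind of helper to groupby(...).transform, which aligns the result back to the original index. The helper name number_rooms and the variable result are made up for this illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "ID": [34, 34, 34, 34, 34, 34, 50, 50, 89, 89, 89, 98],
    "Room": ["Livingroom", "Bedroom", "Kitchen", "Bedroom", "Bedroom", "Storage",
             "Kitchen", "Kitchen", "Livingroom", "Bedroom", "Bedroom", "Livingroom"],
})

def number_rooms(rooms_col):
    # Hypothetical helper: same idea as f above, one numbered name per occurrence
    arr = np.empty(len(rooms_col), dtype=object)
    for i, name in enumerate(rooms_col):
        arr[i] = name + f"_{i + 1}"
    return arr

# transform returns a Series aligned with the original index,
# so the rows stay in their original order
result = data.copy()
result["Room"] = data.groupby(["ID", "Room"])["Room"].transform(number_rooms)
print(result)
```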
Upvotes: 1
Reputation: 26676
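Let's concatenate Room with the cumcount of the df grouped by ID and Room, as follows. cumcount starts at 0, hence the +1. Here df is the dataframe from the question (called data there).
df = df.assign(Room=df.Room + "_" + (df.groupby(['ID', 'Room']).cumcount() + 1).astype(str))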
ID Room
0 34 Livingroom_1
1 34 Bedroom_1
2 34 Kitchen_1
3 34 Bedroom_2
4 34 Bedroom_3
5 34 Storage_1
6 50 Kitchen_1
7 50 Kitchen_2
8 89 Livingroom_1
9 89 Bedroom_1
10 89 Bedroom_2
11 98 Livingroom_1
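For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same cumcount approach, built on the sample data from the question. The variable names counts and result are only for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    "ID": [34, 34, 34, 34, 34, 34, 50, 50, 89, 89, 89, 98],
    "Room": ["Livingroom", "Bedroom", "Kitchen", "Bedroom", "Bedroom", "Storage",
             "Kitchen", "Kitchen", "Livingroom", "Bedroom", "Bedroom", "Livingroom"],
})

# cumcount numbers the rows inside each (ID, Room) group starting at 0,
# so add 1 to start the numbering at 1
counts = data.groupby(["ID", "Room"]).cumcount() + 1

# assign returns a new dataframe; the original data is left unchanged
result = data.assign(Room=data["Room"] + "_" + counts.astype(str))
print(result)
```

Because cumcount keeps the original index, the rows stay in their original order and no sorting or merging is needed.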
Upvotes: 1
Reputation: 544
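import inflect
p = inflect.engine()
# Group by both ID and Room so the count restarts for every ID.
# p.ordinal turns 1 into "1st", so the suffixes look like "_1st", "_2nd", ...
df['Room'] += df.groupby(['ID', 'Room']).cumcount().add(1).map(p.ordinal).radd('_')
print(df)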
Adapted from this answer: https://stackoverflow.com/a/59951701/3756587
Upvotes: 1