Reputation: 3667
I have a dataframe
City hits
A 10
B 1
C 22
D 122
E 1
F 165
I want to make a dataframe with a single column called Hit_ID, whose rows are
A_1
A_2
etc.
A_10
B_1
C_1
C_2
etc.
D_124
D_125
E_1
I tried
df['Hit_ID'] = df[['City','hits']].apply(lambda x:'_'.join(x.astype(str)),axis=1)
but this just concatenates the columns. How can I modify the apply function to get the desired op?
Upvotes: 2
Views: 941
Reputation: 863166
You need:
#repeat index values by column hits
df = df.loc[df.index.repeat(df['hits'])].reset_index(drop=True)
#create counts
df['hits'] = df.groupby('City').cumcount() + 1
#concatenate columns
df['Hit_ID'] = df['City'] + '_' + df['hits'].astype(str)
print (df.head(20))
City hits Hit_ID
0 A 1 A_1
1 A 2 A_2
2 A 3 A_3
3 A 4 A_4
4 A 5 A_5
5 A 6 A_6
6 A 7 A_7
7 A 8 A_8
8 A 9 A_9
9 A 10 A_10
10 B 1 B_1
11 C 1 C_1
12 C 2 C_2
13 C 3 C_3
14 C 4 C_4
15 C 5 C_5
16 C 6 C_6
17 C 7 C_7
18 C 8 C_8
19 C 9 C_9
Upvotes: 4
Reputation: 11391
Try this:
df['hits'] = df.hits.transform(lambda x: range(1, x))
s = df.apply(lambda x: pd.Series(x['hits']),axis=1).stack().reset_index(level=1, drop=True)
s.name = 'hits'
df = pd.concat([df.City, s], axis=1)
df['hit_id'] = df.City + '_' + df.hits.astype(str)
df.hits
from a column of integers to a column of lists, ranging from 1 to the original value. df.City
column. _
.Upvotes: 0