Reputation: 4408
I have a dataset, df, where I would like to merge two column values into one and then add a consecutive numeric value at the end based on a specific column.
Data
id date
aa q122
aa q122
aa q122
aa q222
bb q122
bb q122
aa q222
bb q222
cc q122
cc q222
Desired
id date type
aa q122 aa_q122_1
aa q122 aa_q122_2
aa q122 aa_q122_3
aa q222 aa_q222_1
bb q122 bb_q122_1
bb q122 bb_q122_2
aa q222 aa_q222_2
bb q222 bb_q222_1
cc q122 cc_q122_1
cc q222 cc_q222_1
Doing
The consecutive numbers on the end of the values reset back to 1 when there is a new date value. ex.
aa q122 will become aa_q122_1,
however, once the date changes from q122 to q222, now the count aa_q122_1
df["type"] = df["id"] + df["date"]
but not sure how to add consecutive values at the end. I am researching this now. Any suggestion is appreciated.
Upvotes: 1
Views: 763
Reputation: 24324
try via groupby()
and cumcount()
:
df['type']=df['id']+'_'+df['date']+'_'+(df.groupby(['id','date']).cumcount()+1).astype(str)
output of df
:
id date type
0 aa q122 aa_q122_1
1 aa q122 aa_q122_2
2 aa q122 aa_q122_3
3 aa q222 aa_q222_1
4 bb q122 bb_q122_1
5 bb q122 bb_q122_2
6 aa q222 aa_q222_2
7 bb q222 bb_q222_1
8 cc q122 cc_q122_1
9 cc q222 cc_q222_1
Upvotes: 2