Lynn
Lynn

Reputation: 4408

Concatenate two columns values and create consecutive unique ID based on specific column in Python

I have a dataset, df, where I would like to merge two column values into one and then add a consecutive numeric value at the end based on a specific column.

Data

id  date
aa  q122
aa  q122
aa  q122
aa  q222
bb  q122
bb  q122
aa  q222
bb  q222
cc  q122
cc  q222

Desired

id  date    type
aa  q122    aa_q122_1
aa  q122    aa_q122_2
aa  q122    aa_q122_3
aa  q222    aa_q222_1
bb  q122    bb_q122_1
bb  q122    bb_q122_2
aa  q222    aa_q222_2
bb  q222    bb_q222_1
cc  q122    cc_q122_1
cc  q222    cc_q222_1

Doing

The consecutive numbers on the end of the values reset back to 1 when there is a new date value. ex.

aa   q122   will become aa_q122_1,

however, once the date changes from q122 to q222, now the count aa_q122_1



df["type"] = df["id"] + df["date"]

but not sure how to add consecutive values at the end. I am researching this now. Any suggestion is appreciated.

Upvotes: 1

Views: 763

Answers (1)

Anurag Dabas
Anurag Dabas

Reputation: 24324

try via groupby() and cumcount():

df['type']=df['id']+'_'+df['date']+'_'+(df.groupby(['id','date']).cumcount()+1).astype(str)

output of df:

    id  date    type
0   aa  q122    aa_q122_1
1   aa  q122    aa_q122_2
2   aa  q122    aa_q122_3
3   aa  q222    aa_q222_1
4   bb  q122    bb_q122_1
5   bb  q122    bb_q122_2
6   aa  q222    aa_q222_2
7   bb  q222    bb_q222_1
8   cc  q122    cc_q122_1
9   cc  q222    cc_q222_1

Upvotes: 2

Related Questions