Patthebug
Patthebug

Reputation: 4787

Increment a column value based on a combination of columns

I have a dataset that looks like this:

OwnerID    GroupID    AssignmentID   ... <few more columns>  [Need this column]
1           10         100                                    1
1           10         100                                    1
1           10         200                                    2
1           20         100                                    1
1           20         200                                    2
1           20         300                                    3
2           30         200                                    1
2           30         200                                    1
2           40         300                                    2

I would like to populate a column based on the values in OwnerID, GroupID and AssignmentID fields. If the values in these fields are the same across rows, then I want the number 1 repeated in the new column. But if the same owner has assigned a different assignment to the same group, then the value in the new column should increment.

For example - OwnerID 1 assigned 2 assignments (2 with the same AssignmentID 100 and the other with AssignmentID 200). The AssignmentID 100 gets value 1 in both cases because the values for OwnerID, GroupID and AssignmentID are the same, but gets the value 2 when the AssignmentID is 200.

Similarly, when OwnerID 100 assigned AssignmentID 100, 200 and 300, the group that these assignments were assigned to, had changed to 20.

I thought this could be done using the following code:

AssignmentDetails['colname'] = AssignmentDetails.groupby(['ownerid','groupid','assignmentid']).cumcount()

But this doesn't give me the required result. It doesn't repeat the value in the new column when the values in the 'groupby' clause are the same, but it increments the values.

How do I achieve this? Any help would be great.

Upvotes: 2

Views: 139

Answers (2)

BENY
BENY

Reputation: 323226

Or

df.groupby([ 'OwnerID' , 'GroupID' ]).AssignmentID.transform(lambda x: x.astype('category').cat.codes.add(1))
Out[186]: 
0    1
1    1
2    2
3    1
4    2
5    3
6    1
7    1
8    1
Name: AssignmentID, dtype: int8

Upvotes: 1

piRSquared
piRSquared

Reputation: 294258

df.assign(
    result=df.groupby(
        ['OwnerID', 'GroupID']
    ).AssignmentID.transform(lambda x: x.factorize()[0]) + 1
)

   OwnerID  GroupID  AssignmentID  Result  result
0        1       10           100       1       1
1        1       10           100       1       1
2        1       10           200       2       2
3        1       20           100       1       1
4        1       20           200       1       2
5        1       20           300       1       3
6        2       30           200       1       1
7        2       30           200       1       1
8        2       40           300       2       1

Upvotes: 2

Related Questions