How to make a group id using pandas

Question

R's data.table package has a really convenient .GRP special symbol for generating group index values.

library(data.table)
dt <- data.table(
  Grp=c("a", "z", "a", "f", "f"),
  Val=c(3, 2, 1, 2, 2)
)
dt[, GrpIdx := .GRP, by=Grp]

   Grp Val GrpIdx
1:   a   3      1
2:   z   2      2
3:   a   1      1
4:   f   2      3
5:   f   2      3

What's the best way to accomplish the same thing using pandas?

import pandas as pd
df = pd.DataFrame({'Grp':["a", "z", "a", "f", "f"], 'Val':[3, 2, 1, 2, 2]})

Nickil Maveli · Accepted Answer

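You could use rank with the method argument set to 'dense' to identify unique groups; it also accepts string values. Note that the resulting ids follow the sorted order of the group labels (a=1, f=2, z=3) rather than the order of first appearance that .GRP uses: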

df['GrpIdx'] = df['Grp'].rank(method='dense').astype(int)
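If the ids need to match the data.table output exactly (groups numbered in order of first appearance, so a=1, z=2, f=3), two other pandas calls may be a closer fit. The sketch below is not part of the original answer; it assumes a pandas version that provides GroupBy.ngroup, and the column name GrpIdx_factorize is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Grp': ["a", "z", "a", "f", "f"], 'Val': [3, 2, 1, 2, 2]})

# ngroup() numbers each group starting from 0; sort=False keeps the
# groups in order of first appearance instead of sorting the labels.
df['GrpIdx'] = df.groupby('Grp', sort=False).ngroup() + 1

# pd.factorize also assigns 0-based integer codes in order of first
# appearance, without going through groupby.
codes, uniques = pd.factorize(df['Grp'])
df['GrpIdx_factorize'] = codes + 1  # illustrative column name

print(df)
```

Both columns come out as 1, 2, 1, 3, 3 for this frame, which matches the GrpIdx column in the question. Dropping the `+ 1` gives 0-based ids, which is the more common convention in Python.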
