How to index unique combination of values with pandas

Question

I am working with school schedule data and need to distinguish between different sessions of the same course.

When a different class takes the same course, that is in effect another session of the course and must be told apart from the others. I want to add an extra column holding a session index.

import pandas as pd

cols = ['course', 'class_name', 'professor']

data = [ ['Math',    'X', 'Bob'],
         ['Math',    'X', 'Bob'], 
         ['Math',    'Y', 'Bob'],
         ['English', 'Y', 'Tim'],
         ['English', 'X', 'Jim'],
         ['English', 'X', 'Jim'],
       ]

df = pd.DataFrame(columns=cols, data=data)

# Add a placeholder session column; the real index still has to be computed
df['session'] = '?'
print(df)

The result should look something like this:

    course class_name professor  session
0     Math          X       Bob        0
1     Math          X       Bob        0
2     Math          Y       Bob        1
3  English          Y       Tim        1
4  English          X       Jim        0
5  English          X       Jim        0

I have come up with a convoluted procedural solution. What would be a more idiomatic pandas way of doing this?

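# Group keys are (course, class_name) tuples, in sorted order
groups = df.groupby(['course', 'class_name'])

d_sessions = {}
counter = 0
pclass = ""
pcourse = ""

for m_idx in list(groups.groups):
    course = m_idx[0]
    class_ = m_idx[1]

    # A different class within the same course is a new session
    if class_ != pclass:
        counter += 1

    # A different course restarts the numbering at 0
    if pcourse != course:
        counter = 0

    pclass = class_
    pcourse = course
    d_sessions[m_idx] = counter

# Fill the session column by looking up each row's (course, class_name) pair
# (replaces DataFrame.set_value, which was removed in pandas 1.0)
df['session'] = [d_sessions[key] for key in zip(df['course'], df['class_name'])]
df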

Scott Boston · Accepted Answer

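Let's try grouping by course, flagging the first occurrence of each class_name within the group, and taking the cumulative sum of those flags: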

df['session'] = df.groupby('course')['class_name'].transform(lambda x: (~x.duplicated()).cumsum())

Output (sessions are numbered from 1, in order of first appearance within each course):

    course class_name professor  session
0     Math          X       Bob        1
1     Math          X       Bob        1
2     Math          Y       Bob        2
3  English          Y       Tim        1
4  English          X       Jim        2
5  English          X       Jim        2
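
The numbering above starts at 1 and follows order of appearance, while the question's expected output starts at 0 and numbers the classes in sorted order. As a possible variant (a sketch, not part of the original answer), pd.factorize can be applied per course instead. The use of sort=True here is an assumption about what the asker wants, based only on the expected output shown in the question.

```python
import pandas as pd

cols = ['course', 'class_name', 'professor']
data = [['Math',    'X', 'Bob'],
        ['Math',    'X', 'Bob'],
        ['Math',    'Y', 'Bob'],
        ['English', 'Y', 'Tim'],
        ['English', 'X', 'Jim'],
        ['English', 'X', 'Jim']]
df = pd.DataFrame(columns=cols, data=data)

# Within each course, map every distinct class_name to an integer code.
# sort=True assigns codes in sorted order of class_name (X -> 0, Y -> 1);
# with sort=False the codes would follow order of first appearance instead.
df['session'] = (
    df.groupby('course')['class_name']
      .transform(lambda s: pd.factorize(s, sort=True)[0])
)
print(df)
```

If zero-based numbering in order of appearance is enough, subtracting 1 from the cumsum result above should give the same effect without factorize.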
