Reputation: 329
I have the following data set in Table_Record:
Seg_ID Lock_ID Code
111 100 1
222 121 2
333 341 2
444 100 1
555 100 1
666 341 2
777 554 4
888 332 5
I am using this SQL query to find the Seg_IDs where Lock_ID is repeated:
SELECT Code, Lock_ID, Seg_ID FROM Table_Record GROUP BY Code, Lock_ID;
Seg_ID Lock_ID Code
111 100 1
444 100 1
555 100 1
222 121 2
333 341 2
666 341 2
777 554 4
888 332 5
How can I achieve the same using Pandas?
Expected output from Pandas, e.g.:
Seg_ID (111,444,555) has Code (1).
Seg_ID (222,333,666) has Code (2).
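(For reference, a minimal sketch that builds the sample data above as a pandas DataFrame; df is the name the answers below assume:)

import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    'Seg_ID':  [111, 222, 333, 444, 555, 666, 777, 888],
    'Lock_ID': [100, 121, 341, 100, 100, 341, 554, 332],
    'Code':    [1, 2, 2, 1, 1, 2, 4, 5],
})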
Upvotes: 3
Views: 1795
Reputation: 862511
First get all Codes by filtering only duplicated Lock_ID values, and then filter the original DataFrame by boolean indexing with isin:
# Codes that belong to a duplicated Lock_ID
codes = df.loc[df.duplicated(['Lock_ID']), 'Code'].unique()
# Keep only rows whose Code appears among those
df1 = df[df['Code'].isin(codes)]
print (df1)
Seg_ID Lock_ID Code
0 111 100 1
1 222 121 2
2 333 341 2
3 444 100 1
4 555 100 1
5 666 341 2
Then use groupby with f-strings:
# Group by Code as a scalar key so k prints as a plain value
for k, v in df1.groupby('Code')['Seg_ID']:
    print (f'Seg_ID {tuple(v)} has Code ({k})')
Seg_ID (111, 444, 555) has Code (1)
Seg_ID (222, 333, 666) has Code (2)
If you want the output as a DataFrame, use apply with tuple:
# Collapse each group's Seg_IDs into a tuple, one row per Code
df2 = df1.groupby('Code')['Seg_ID'].apply(tuple).reset_index()
print (df2)
Code Seg_ID
0 1 (111, 444, 555)
1 2 (222, 333, 666)
Upvotes: 3
Reputation: 672
Simply use groupby. As I understand from your code, you'd want:
grouped = df.groupby(['Code', 'Lock_ID'])
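As a minimal sketch of how that grouped object could then be inspected (assuming df has the columns shown in the question):

# Collect the Seg_IDs for each (Code, Lock_ID) pair
print (grouped['Seg_ID'].apply(list).reset_index())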
Upvotes: 1