Marcelo BD

Reputation: 289

Get common values across multiple groupby groups in a pandas DataFrame

I am new to pandas DataFrames and would like to find the values of 'col2' that are common to more than one group when grouping by 'col1'.

 col1    col2
  a       abc
          pqr
          xyz

  b       abc      
          def
          bcd

  c       bcd
          efg

The output should be as follows:

     abc      [a,b]
     bcd      [b,c]

Can anyone help me with the solution?

Thanks.

Upvotes: 2

Views: 875

Answers (1)

jezrael

Reputation: 862511

Use:

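import numpy as np

# Blank col1 cells belong to the group above: convert them to NaN, then forward fill
df['col1'] = df['col1'].replace('', np.nan).ffill()

# Collect the col1 groups for each col2 value, then keep values seen in more than one group
s = df.groupby('col2')['col1'].apply(list)
s = s[s.str.len() > 1].reset_index()
print(s)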
  col2    col1
0  abc  [a, b]
1  bcd  [b, c]

Explanation:

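  1. First, replace the empty values in col1 with NaN and forward fill them, so every row carries its group label
  2. For each value of col2, aggregate the matching col1 values into a list
  3. Filter the lists by length with boolean indexing, keeping only those with more than one element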
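
For anyone who wants to run the three steps end to end, here is a minimal self-contained sketch. It assumes the blank col1 cells in the question are stored as empty strings, which is a guess about the original data. It also differs from the code above in two small ways, both my additions: `unique()` drops repeats, so a col2 value that occurs twice inside a single group is not mistaken for a common one, and `map(len)` is used to count list elements in place of `.str.len()`.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data: blank col1 cells are empty strings
df = pd.DataFrame({
    'col1': ['a', '', '', 'b', '', '', 'c', ''],
    'col2': ['abc', 'pqr', 'xyz', 'abc', 'def', 'bcd', 'bcd', 'efg'],
})

# Step 1: empty strings -> NaN, then forward fill the group label
df['col1'] = df['col1'].replace('', np.nan).ffill()

# Step 2: for each col2 value, collect the distinct col1 groups it appears in
# (unique() is an addition here, guarding against repeats inside one group)
groups = df.groupby('col2')['col1'].apply(lambda x: list(x.unique()))

# Step 3: keep only the col2 values found in more than one group
common = groups[groups.map(len) > 1].reset_index()
print(common)
```

If col1 already holds NaN rather than empty strings, the `replace` call does nothing and only the `ffill()` matters.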

Upvotes: 2
