Reputation: 289
I am new to pandas DataFrames and would like to find the values of 'col2' that appear in more than one group, where the groups are defined by 'col1'.
col1  col2
a     abc
      pqr
      xyz
b     abc
      def
      bcd
c     bcd
      efg
The output should be as follows:
abc [a,b]
bcd [b,c]
Can anyone help me with the solution?
Thanks.
Upvotes: 2
Views: 875
Reputation: 862511
Use:
import numpy as np

# assumes the blank 'col1' cells are empty strings
df['col1'] = df['col1'].replace('', np.nan).ffill()
s = df.groupby('col2')['col1'].apply(list)
s = s[s.str.len() > 1].reset_index()
print(s)
  col2    col1
0  abc  [a, b]
1  bcd  [b, c]
Explanation:
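1. replace the empty values with NaNs and forward fill the NaNs, so every row carries its 'col1' label
2. groupby 'col2' and aggregate the 'col1' labels into a list
3. filter by boolean indexing, keeping only the lists with more than one element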
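For anyone who wants to run the steps end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question, assuming the blank 'col1' cells are empty strings (if they are already NaN, the replace step is simply a no-op). The variable name `result` is only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question; blank 'col1' cells are assumed to be empty strings.
df = pd.DataFrame({
    "col1": ["a", "", "", "b", "", "", "c", ""],
    "col2": ["abc", "pqr", "xyz", "abc", "def", "bcd", "bcd", "efg"],
})

# Step 1: turn empty strings into NaN, then forward fill the group label.
df["col1"] = df["col1"].replace("", np.nan).ffill()

# Step 2: collect the 'col1' labels for every 'col2' value.
s = df.groupby("col2")["col1"].apply(list)

# Step 3: keep only the 'col2' values that occur in more than one row.
result = s[s.str.len() > 1].reset_index()
print(result)
```

Note that this counts rows, not distinct groups: if the same 'col2' value could repeat inside a single 'col1' group, aggregating with `lambda x: list(x.unique())` instead of `list` would probably be closer to what the question asks for.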
Upvotes: 2