Mrye

Reputation: 727

Pandas : Group by similarities

I currently have data like this:

Item    Properties
A   C001
A   C002
A   C003
B   C001
B   C003
C   C001

I want to group the items so that each one lists its properties, like this:

A   C001, C002, C003
B   C001, C003
C   C001

Then I want to match the items by the number of properties they have in common:

A   B   2
A   C   1
B   C   1

How can I transform this dataframe with pandas? I tried the groupby method, but it displays the number of properties instead of an array of property names.

Upvotes: 2

Views: 1468

Answers (1)

user5497218

Reputation: 26

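import pandas as pd

# Self-join on the shared column: one row per pair of items that have a property in common
selfjoin = pd.merge(df, df, on='Properties')
# Keep each unordered pair once and drop self-matches such as (A, A)
selfjoin = selfjoin[selfjoin['Item_x'] < selfjoin['Item_y']]
# Count the shared properties per pair
similarity = selfjoin.groupby(['Item_x', 'Item_y']).size().reset_index(name='count')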
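
The first half of the question (one row per item listing its properties) can also be handled with a groupby, and the pair counts can then be derived from those groups. The following is only a sketch of one possible approach, assuming the column names `Item` and `Properties` from the question; the names `props`, `grouped`, and `Shared` are made up for illustration.

```python
from itertools import combinations

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Item": ["A", "A", "A", "B", "B", "C"],
    "Properties": ["C001", "C002", "C003", "C001", "C003", "C001"],
})

# Collect each item's properties into a set (one entry per item).
props = df.groupby("Item")["Properties"].agg(set)

# Readable form: a comma-separated string of property names per item.
grouped = props.apply(lambda s: ", ".join(sorted(s)))

# Compare every unordered pair of items and count the properties they share.
# "Shared" is a hypothetical column name chosen for this example.
similarity = pd.DataFrame(
    [(a, b, len(props[a] & props[b])) for a, b in combinations(props.index, 2)],
    columns=["Item_x", "Item_y", "Shared"],
)

print(grouped)
print(similarity)
```

Unlike a self-join, this variant also lists pairs with no property in common (with a count of 0), and it compares every pair of items explicitly, so it may be slow when there are many items.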

Upvotes: 1
