Reputation: 79
I have a groupby object that I converted into a DataFrame:
Item  IDs
1     100,101,102
2     101,103,104
3     100,201,202
Now I want to generate 2-tuples (pairs of items) together with a count of the distinct IDs that appear across the two items in each pair. The desired output is:
Item  Item  ID count
1     2     5
1     3     5
2     3     6
The first two columns list every pair of items, such as (1,2), (1,3), (2,3) and so on, and the third column tells me how many distinct IDs the two items have between them in the original table.
Upvotes: 1
Views: 95
Reputation: 1790
If your data is structured something like this:
data = {
    1: [100, 101, 102],
    2: [101, 103, 104],
    3: [100, 201, 202],
}
then this will help:
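res = {}
items = list(data)  # dict key views cannot be sliced in Python 3, so copy the keys to a list
for n, i in enumerate(items, start=1):
    # pair item i only with the items after it, so each pair appears once
    for j in items[n:]:
        # count the distinct IDs across both items (set union)
        res[(i, j)] = len(set(data[i]).union(data[j]))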
Output:
res
{(1, 2): 5, (1, 3): 5, (2, 3): 6}
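If the IDs are still comma-separated strings, as in the question's table, the same idea can be sketched with `itertools.combinations` instead of manual slicing. This is only a minimal sketch: the `rows` dict is a hypothetical stand-in for the DataFrame column, and the names `id_sets` and `pair_counts` are made up for illustration.

```python
from itertools import combinations

# Hypothetical stand-in for the question's table: item -> comma-separated IDs
rows = {1: "100,101,102", 2: "101,103,104", 3: "100,201,202"}

# Parse each string into a set of integer IDs once, up front
id_sets = {item: {int(x) for x in ids.split(",")} for item, ids in rows.items()}

# combinations() yields each unordered pair of items exactly once
pair_counts = {
    (a, b): len(id_sets[a] | id_sets[b])  # | is set union: distinct IDs across both items
    for a, b in combinations(id_sets, 2)
}

print(pair_counts)  # {(1, 2): 5, (1, 3): 5, (2, 3): 6}
```

Swapping `|` for `&` would instead count only the IDs the two items have in common, which gives 1, 1 and 0 for the same three pairs.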
Upvotes: 3