user2007506

Reputation: 79

Generating a count of shared IDs between a tuple of item numbers

I have a groupby object converted into a DataFrame:

Item       IDs

1           100,101,102
2           101,103,104
3           100,201,202

Now I want to generate 2-tuples (ordered pairs) that give me a count of the IDs in each pair of items. The desired output is:

Item  Item  IDs
 1    2      5
 1    3      5
 2    3      6

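The first two columns correspond to every ordered pair of items, such as (1,2), (1,3), (2,3) and so on. The third column tells me how many distinct IDs the two items have between them in the original table. For example, items 1 and 2 together cover 100, 101, 102, 103 and 104, so the count is 5.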

Upvotes: 1

Views: 95

Answers (1)

pavel_form

Reputation: 1790

If your data is structured something like this:

data = {
    1: [100, 101, 102],
    2: [101, 103, 104],
    3: [100, 201, 202],
}

then this will help:

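res = {}
# list() is needed on Python 3, where dict.keys() returns a view that cannot be sliced
items = list(data)
for n, i in enumerate(items, start=1):
    # pair i only with the items after it, so each pair appears once
    for j in items[n:]:
        # count the distinct IDs across both items
        res[(i, j)] = len(set(data[i]).union(data[j]))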

Output:

res
{(1, 2): 5, (1, 3): 5, (2, 3): 6}
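
The same pairing can probably be written more compactly with itertools.combinations. The following is only a sketch: it assumes the IDs arrive as comma-separated strings, as the question's table suggests, and the names rows, id_sets and pair_counts are made up for illustration. It reports both the number of distinct IDs across a pair (the figure in the desired output) and the number of IDs the two items actually have in common, in case the latter is what is wanted.

```python
from itertools import combinations

# Hypothetical rows mirroring the question's table: item -> comma-separated ID string.
rows = {1: "100,101,102", 2: "101,103,104", 3: "100,201,202"}

# Parse each ID string into a set of ints.
id_sets = {item: {int(x) for x in ids.split(",")} for item, ids in rows.items()}


def pair_counts(id_sets):
    """Return {(i, j): (distinct_total, in_common)} for every pair of items."""
    return {
        # | is set union (distinct IDs across both), & is set intersection (IDs in both)
        (i, j): (len(id_sets[i] | id_sets[j]), len(id_sets[i] & id_sets[j]))
        for i, j in combinations(sorted(id_sets), 2)
    }


counts = pair_counts(id_sets)
for (i, j), (total, common) in counts.items():
    print(i, j, total, common)
```

With the sample data this gives totals of 5, 5 and 6, matching the output above, and in-common counts of 1, 1 and 0.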

Upvotes: 3
