Shubham R

Reputation: 7644

Total number of combinations of one column with another in a pandas DataFrame

I have a table in a pandas DataFrame:

 id_x             id_y
  a                 b
  b                 c
  c                 d
  d                 a
  b                 a
and so on (around 1000 rows).

I want to count the combinations of each id_x with id_y.

For example, a has the combinations a-b and d-a (2 in total). Similarly, b has 2 combinations: b-c and a-b, because a-b should also count as a combination for b (a-b = b-a).

I then want to create a DataFrame df2 that looks like this:

id   combinations
a          2
b          2
c          2    #(c-d and b-c)
d          1
and so on (distinct product_id_'s).

I tried this code:

df.groupby(['id_x']).size().reset_index()

but I get the wrong result:

   id_x  0
0   a    1
1   b    1
2   c    1
3   d    1

What approach should I follow? My Python skills are at a beginner level. Thanks in advance.

Upvotes: 0

Views: 63

Answers (1)

jezrael

Reputation: 862511

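You can first sort the values in each row with apply and sorted (so that a-b and b-a become the same pair), drop the duplicate pairs, reshape to a Series with stack, and finally count with value_counts: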

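# result_type='expand' (pandas 0.23+) keeps the sorted pairs as columns instead of a Series of lists
df = df.apply(sorted, axis=1, result_type='expand').drop_duplicates().stack().value_counts()
print (df)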
d    2
a    2
b    2
c    2
dtype: int64
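
If you also need the two-column layout of df2 from the question, the same steps can be sketched end to end as below. This is only a minimal sketch: it assumes the five sample rows shown in the question, swaps apply(sorted) for np.sort on the underlying array, and uses names (pairs, counts) that are mine, not from the question.

```python
import numpy as np
import pandas as pd

# Sample data: only the five rows shown in the question (assumption).
df = pd.DataFrame({
    "id_x": ["a", "b", "c", "d", "b"],
    "id_y": ["b", "c", "d", "a", "a"],
})

# Sort the values inside each row so that a-b and b-a become the same pair.
pairs = pd.DataFrame(np.sort(df[["id_x", "id_y"]].to_numpy(), axis=1))

# Drop repeated pairs, then count how often each id occurs in the remaining pairs.
counts = pairs.drop_duplicates().stack().value_counts().sort_index()

# Reshape into the id / combinations layout asked for in the question.
df2 = counts.rename_axis("id").reset_index(name="combinations")
print(df2)
```

With these five rows every id ends up with 2 combinations, because d appears in both c-d and d-a.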

Upvotes: 2
