Reputation: 7644
I have a table in a pandas DataFrame df:
id_x id_y
a b
b c
c d
d a
b a
and so on (around 1000 rows).
I want to count the combinations that each id takes part in, across both id_x and id_y.
I.e. a has 2 combinations in total: a-b and d-a.
Similarly, b has 2 combinations in total: b-c, and also a-b, which should count as a combination for b (a-b = b-a).
I then want to create a DataFrame df2 which has:
id combinations
a 2
b 2
c 2 #(c-d and b-c)
d 1
and so on, one row for each of the distinct product_id_'s.
I tried this code:
df.groupby(['id_x']).size().reset_index()
but I am getting the wrong result:
id_x 0
0 a 1
1 b 1
2 c 1
3 d 1
What approach should I follow? My Python skills are at a beginner level. Thanks in advance.
Upvotes: 0
Views: 63
Reputation: 862511
You can first sort all rows by apply with sorted, then create a Series by stack, and finally use value_counts:
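# result_type='expand' keeps the sorted rows as columns on pandas 0.23 and later
df = df.apply(sorted, axis=1, result_type='expand').drop_duplicates().stack().value_counts()
print(df)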
d 2
a 2
b 2
c 2
dtype: int64
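The same idea can be sketched end to end, including the df2 layout asked for in the question. This is only a minimal sketch: it assumes the columns are named id_x and id_y as in the sample, it uses np.sort in place of apply(sorted) for the row-wise sort, and the names pairs, counts, first and second are made up for illustration. With the five sample rows, d takes part in two distinct pairs (c-d and d-a), so it comes out as 2, in line with the output above.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "id_x": ["a", "b", "c", "d", "b"],
    "id_y": ["b", "c", "d", "a", "a"],
})

# Sort the two ids within each row so that b-a and a-b become the same pair,
# then drop the repeated pairs. "first"/"second" are illustrative names.
pairs = pd.DataFrame(
    np.sort(df[["id_x", "id_y"]].to_numpy(), axis=1),
    columns=["first", "second"],
).drop_duplicates()

# Count how many distinct pairs each id appears in, on either side
counts = pd.concat([pairs["first"], pairs["second"]]).value_counts()

# Shape the result like the df2 described in the question
df2 = (
    counts.sort_index()
    .rename_axis("id")
    .reset_index(name="combinations")
)
print(df2)
```

Sorting within each row is what makes a-b and b-a compare equal, so drop_duplicates can remove the repeat before anything is counted.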
Upvotes: 2