Reputation: 7644
I have a table in a pandas DataFrame df:
product_id_x product_id_y
1 2
1 3
1 4
3 7
3 11
3 14
3 2
and so on (around 1000 rows).
I want to find the number of combinations each product_id_x has with product_id_y,
i.e. 1 has the combinations 1-2, 1-3, 1-4 (3 combinations in total); similarly, 3 has 4 combinations in total.
Then I want to create a DataFrame df2 that looks like this:
product_id_x combinations
1 3
3 4
and so on (one row per distinct product_id_x).
What approach should I follow? My Python skills are at a beginner level. Thanks in advance.
Upvotes: 1
Views: 1037
Reputation: 294218
size counts the number of rows in which each pair of column values occurs together. count counts the same thing, but only where the values are not null. Since you did not mention anything about nulls, I'll use size after a groupby, then unstack:
df.groupby(['product_id_x', 'product_id_y']).size().unstack(fill_value=0)
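As a rough sketch of what that produces on the sample rows from the question: the result is a table with one row per product_id_x and one column per product_id_y, and counting the nonzero cells in each row gives the number of combinations. The DataFrame construction and the names pair_counts and combinations are assumptions for illustration, not part of the original data.

```python
import pandas as pd

# Sample rows copied from the question (the real table has ~1000 rows)
df = pd.DataFrame({
    "product_id_x": [1, 1, 1, 3, 3, 3, 3],
    "product_id_y": [2, 3, 4, 7, 11, 14, 2],
})

# One row per product_id_x, one column per product_id_y, 0 where a pair never occurs
pair_counts = df.groupby(["product_id_x", "product_id_y"]).size().unstack(fill_value=0)

# Number of distinct product_id_y partners for each product_id_x
combinations = (pair_counts > 0).sum(axis=1)
print(combinations)
```

Note that the intermediate table can get wide when there are many distinct product_id_y values, so this is mostly useful if you also want to see which pairs occur.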
Upvotes: 2
Reputation: 19811
You can use groupby with agg on the product_id_x column:
df2 = df.groupby(['product_id_x']).agg(['count'])
Or you can call size directly on the grouped object to get the number of rows in each group:
df2 = df.groupby(['product_id_x']).size()
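To get df2 in the two-column shape shown in the question, one possible sketch is to turn the resulting Series back into a DataFrame with reset_index. The sample data below is copied from the question; the name df2_distinct and the nunique variant are assumptions, added in case the same product_id_x / product_id_y pair can appear more than once in the real data.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "product_id_x": [1, 1, 1, 3, 3, 3, 3],
    "product_id_y": [2, 3, 4, 7, 11, 14, 2],
})

# Rows per product_id_x, as a DataFrame with a named count column
df2 = df.groupby("product_id_x").size().reset_index(name="combinations")

# Variant (assumption): count distinct product_id_y values, ignoring repeated pairs
df2_distinct = (
    df.groupby("product_id_x")["product_id_y"]
    .nunique()
    .reset_index(name="combinations")
)
print(df2)
```

On the sample rows both versions give the same result, since no pair is repeated there.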
Upvotes: 2