Reputation: 597
I have to create a report which shows the cross selling of a product ,Means if i have a product X i need to find the combination of this product which is purchased with other products and it's count.
So i have the table structure as follows,
Below is the same data.Reference is the order number and for each item there is a separate row which shows the product and category details.
Reference. Product Name. Prod ID. Category
1000001 Honda x12 10023 Machinery
1000001 Honda cv12 10025 Machinery
1000002 Medic. 12x 10026 Medicine
1000002 Honda x12 10023 Machinery
1000003 Honda x12 10023 Machinery
1000004 Appliance x12 10033 Household
1000004 Honda x12 10023 Machinery
1000005 Bag x234 100265 Bags
I want the output to be like, Suppose i want to find the cross sell products for Honda x12, means i wan to know which all are the products sold in combination with Honda x12 and number of occurrences that particular combination count occured.
Can anyone suggest me how i can do this in PostgreSQL(Version 11).
Thanks in advance
Upvotes: 0
Views: 445
Reputation: 222502
I think that's a self-join with an inequality condition:
select t.prod_id prod_id1, x.prod_id prod_id2, count(*) cnt
from mytable t
inner join mytable x
on x.reference = t.reference
and x.prod_id > t.prod_id
group by t.prod_id, x.prod_id
order by 1, 2
>
is on purpose in the join predicate instead of <>
, to avoid "mirror" records in the result set.
For your sample data, this generates:
prod_id1 | prod_id2 | cnt -------: | -------: | --: 10023 | 10025 | 1 10023 | 10026 | 1 10023 | 10033 | 1
This gives you the results for all products at once. If you want the list of "pairs" of a given product only, then it is slightly different:
select t.prod_id, count(*) cnt
from mytable t
inner join mytable x
on x.reference = t.reference
and x.prod_id <> t.prod_id
where x.prod_id = 10023
group by t.prod_id
order by 1
Demo:
prod_id | cnt ------: | --: 10025 | 1 10026 | 1 10033 | 1
Upvotes: 3