Linu
Linu

Reputation: 597

How to find cross selling products in postgres

I have to create a report which shows the cross selling of a product ,Means if i have a product X i need to find the combination of this product which is purchased with other products and it's count.

So i have the table structure as follows,

Below is the same data.Reference is the order number and for each item there is a separate row which shows the product and category details.

Reference.      Product Name.        Prod ID.       Category
1000001         Honda x12            10023           Machinery
1000001         Honda cv12           10025           Machinery
1000002         Medic. 12x           10026           Medicine
1000002         Honda x12            10023           Machinery
1000003         Honda x12            10023           Machinery
1000004         Appliance x12        10033           Household
1000004         Honda x12            10023           Machinery
1000005         Bag x234             100265          Bags

I want the output to be like, Suppose i want to find the cross sell products for Honda x12, means i wan to know which all are the products sold in combination with Honda x12 and number of occurrences that particular combination count occured.

Can anyone suggest me how i can do this in PostgreSQL(Version 11).

Thanks in advance

Upvotes: 0

Views: 445

Answers (1)

GMB
GMB

Reputation: 222502

I think that's a self-join with an inequality condition:

select t.prod_id prod_id1, x.prod_id prod_id2, count(*) cnt
from mytable t
inner join mytable x 
    on  x.reference = t.reference
    and x.prod_id > t.prod_id
group by t.prod_id, x.prod_id
order by 1, 2

> is on purpose in the join predicate instead of <>, to avoid "mirror" records in the result set.

For your sample data, this generates:

prod_id1 | prod_id2 | cnt
-------: | -------: | --:
   10023 |    10025 |   1
   10023 |    10026 |   1
   10023 |    10033 |   1

This gives you the results for all products at once. If you want the list of "pairs" of a given product only, then it is slightly different:

select t.prod_id, count(*) cnt
from mytable t
inner join mytable x 
    on  x.reference = t.reference
    and x.prod_id <> t.prod_id
where x.prod_id = 10023
group by t.prod_id
order by 1

Demo:

prod_id | cnt
------: | --:
  10025 |   1
  10026 |   1
  10033 |   1

Upvotes: 3

Related Questions