Reputation: 13

PostgreSQL How find completely identical orders on set of items and their quantity?

I have a table with orders (for each order (transaction_id) there are client_id and item_id and quantity) I need to find completely identical orders for a set of goods and their quantity in the order

CREATE TABLE IF NOT EXISTS orders (client_id varchar(10),
            item_id varchar(10), quantity int, transaction_id varchar(10));
INSERT INTO orders values 
        ('CL1111','111',1, '1001'),
        ('CL1111','222',2,'1001'),
        ('CL1111','333',1,'1001'),
        ('CL2222','111',2,'1002'),
        ('CL2222','222',1,'1002'),
        ('CL2222','333',1,'1002'),
        ('CL3333','111',1,'1003'),
        ('CL3333','222',2,'1003'),
        ('CL3333','333',1,'1003'),
        ('CL3333','444',1,'1003'),
        ('CL4444','111',1,'1004'),
        ('CL4444','222',2,'1004'),
        ('CL4444','333',1,'1004'),
        ('CL5555','111',1,'1005'),
        ('CL5555','222',2,'1005'),
        ('CL6666','111',1,'1006'),
        ('CL6666','222',2,'1006'),
        ('CL6666','333',1,'1007')

client_id	item_id	quantity	transaction_id
CL1111	111	1	1001
CL1111	222	2	1001
CL1111	333	1	1001
CL2222	111	2	1002
CL2222	222	1	1002
CL2222	333	1	1002
CL3333	111	1	1003
CL3333	222	2	1003
CL3333	333	1	1003
CL3333	444	1	1003
CL4444	111	1	1004
CL4444	222	2	1004
CL4444	333	1	1004
CL5555	111	1	1005
CL5555	222	2	1005
CL6666	111	1	1006
CL6666	222	2	1006
CL6666	333	1	1007

Identical orders here are: (1001 and 1004), (1005 and 1006) And now I don't know, how to leave orders, which completely identical on item_id, quantity and set of items.

Upvotes: 1

Answers (4)

Virus Scan

Reputation: 13

select
     transaction_id
from
     (
     select
        transaction_id,
        ai,
        aq,
        count(1) over (partition by ai, aq) cnt
     from
          (
          select
               transaction_id,
               array_agg(array[item_id]) ai,
               array_agg(array[quantity]) aq
          from orders
          group by transaction_id
          ) a
     ) a
where cnt > 1

Upvotes: 0

ValNik

Reputation: 5633

Try this query. Order clause in aggregation is necessary for reliable result.

with transactions as(
select client_id,transaction_id
   ,string_agg( (item_id ||'('||cast(quantity as varchar) || ')') ,',' order by item_id) itemlist
from orders
group by client_id,transaction_id
)
select * 
  ,row_number()over(partition by itemlist order by client_id,transaction_id) rn
  ,count(*)over(partition by itemlist) cnt
from transactions
order by itemlist

Rows with cnt>1 is has identical row (identical items with same quantity).

With test data, find 2 identical groups for transaction_id 1001-1004 and 1005-1006 (not green in your picture)

result is

client_id	transaction_id	itemlist	rn	cnt
CL5555	1005	111(1),222(2)	1	2
CL6666	1006	111(1),222(2)	2	2
CL1111	1001	111(1),222(2),333(1)	1	2
CL4444	1004	111(1),222(2),333(1)	2	2
CL3333	1003	111(1),222(2),333(1),444(1)	1	1
CL2222	1002	111(2),222(1),333(1)	1	1
CL6666	1007	333(1)	1	1

Test data

INSERT INTO orders (client_id,item_id,transaction_id,quantity)
values 
        ('CL1111','111','1001',1),
        ('CL1111','222','1001',2),
        ('CL1111','333','1001',1),
        ('CL2222','111','1002',2),
        ('CL2222','222','1002',1),
        ('CL2222','333','1002',1),
        ('CL3333','111','1003',1),
        ('CL3333','222','1003',2),
        ('CL3333','333','1003',1),
        ('CL3333','444','1003',1),
        ('CL4444','111','1004',1),
        ('CL4444','222','1004',2),
        ('CL4444','333','1004',1),
        ('CL5555','111','1005',1),
        ('CL5555','222','1005',2),
        ('CL6666','111','1006',1),
        ('CL6666','222','1006',2),
        ('CL6666','333','1007',1)
;

Upvotes: 0

Ponder Stibbons

Reputation: 14848

Please check this query, it returns correct rows in dbfiddle

with agg as (
  SELECT client_id, 
         array_agg(array[item_id]) ai, array_agg(array[quantity]) aq
  from orders group by client_id 
  having count(distinct transaction_id) = 1 )
select client_id, ai items, aq quantities 
from (
  select client_id, ai , aq , count(1) over (partition by ai, aq) cnt
  from agg ) c
where cnt > 1

client_id	items	quantities
CL1111	{{111},{222},{333}}	{{1},{2},{1}}
CL4444	{{111},{222},{333}}	{{1},{2},{1}}

I am not sure what about clients having several transactions, but in your examples clients CL5555 and CL6666 are not matched even though they have same transactions 1005 and 1006. So here only clients having one distinct transaction are compared.

Upvotes: 0

andexte

Reputation: 311

Your example code doesn't include quantity! Here is a code that results as many rows as the table contains the same orders:

sql = """
    with orders_mod as
    (
        select distinct
                "transaction_id",
                string_agg("item_id" || ' ' || "quantity"::text, ',') over (partition by "client_id") as "list_of_ordered_item_ids"
        from    orders
    )
    select distinct
            string_agg("transaction_id", ',') over (partition by "list_of_ordered_item_ids") as "list_of_the_same_orders",
            "list_of_ordered_item_ids"
    from    orders_mod
    ;
"""

df_SQL = pd.read_sql(sql, con)
df_SQL

Upvotes: 0

PostgreSQL How find completely identical orders on set of items and their quantity?

Answers (4)

Related Questions