Novice
Novice

Reputation: 145

Oracle analytics queries

I have these three tables:

create table CUSTOMER(
                id integer not null primary key,
                name varchar(255) not null
);

create table PRODUCT(
                id integer not null primary key,
                name varchar(255)  not null
);

create table INVOICE(
                invoice_number varchar(20) not null primary key,
                invoice_date date not null,
                customer_id integer not null,
                product_id integer not null,
                quantity integer,
                summa numeric(13,2)
);

ALTER TABLE invoice ADD CONSTRAINT FK_invoice_customer FOREIGN KEY (customer_id)
REFERENCES customer(id);
ALTER TABLE invoice ADD CONSTRAINT FK_invoice_product FOREIGN KEY (product_id)
REFERENCES product(id);

I need to perform two queries against those:

Query one:

Names of customers that have bought an item in 2015/01, 2015/02, 2015/03 (at least once per every month mentioned) but hasn't bought the same item in 2015/04.

What I have tried is something like this:

SELECT * 
FROM invoice i, customer c, product p
 WHERE i.customer_id = c.id 
AND i.product_id = p.id
AND invoice_date BETWEEN '01/01/2015' AND '03/01/2015'
MINUS
SELECT * 
FROM invoice i, customer c, product p 
WHERE i.customer_id = c.id 
AND i.product_id = p.id AND invoice_date BETWEEN '01/04/2015' AND '04/30/2015';

Trying to find all customers that have bought something between 01/01/2015 and 03/01/2015 ( the date format used is mm/dd/yyyy) and who have bought something in between 04/01/2015 and 04/30/2015. Which, according to my knowledge should move at least somewhat in the right direction, but like this, I unable to check if the customer has bought an item once in every month or just once in the three month period.

Query two:

Should find customers with similar behaviour, pairs of customers who have bought the same item in same quantity in a given period (e.g one month), the quantity may vary in 5% margin (+- 5%).

Thank you all.

Upvotes: 0

Views: 174

Answers (2)

dnoeth
dnoeth

Reputation: 60462

This will return the customers who bought the same item during three months, but not in th 4th month:

SELECT *
FROM 
 ( SELECT customer_id, product_id
   FROM invoice
   WHERE invoice_date BETWEEN DATE '2015-01-01' AND DATE '2015-04-30' -- data for 4 months
   GROUP BY customer_id, product_id
   HAVING COUNT(DISTINCT EXTRACT(MONTH FROM invoice_date)) = 3        -- at least one per month
     AND MAX(invoice_date) < DATE '2015-04-01'                        -- none in april 
 ) i 
JOIN customer c 
  ON i.customer_id = c.id
JOIN product p
  ON i.product_id = p.id;

Based on your comment this was not the correct answer. A customer might have bought any combination of items in the first three months and didn't buy none of those items in the 4th months (but other items). This should return the correct answer:

WITH cte AS 
 ( SELECT customer_id, product_id, 
      -- number of months with buys per customer
      COUNT(DISTINCT EXTRACT(MONTH FROM invoice_date))
      OVER (PARTITION BY customer_id) AS cnt
    FROM invoice
    WHERE invoice_date BETWEEN DATE '2015-01-01' AND DATE '2015-03-31'
 )
SELECT DISTINCT customer_id 
FROM cte
WHERE cnt = 3  -- at least one buy per month
AND NOT EXISTS -- product wasn't bought by customer in april
 ( SELECT * FROM invoice i
   WHERE i.invoice_date BETWEEN DATE '2015-04-01' AND DATE '2015-04-30'
   AND i.customer_id = cte.customer_id
   AND i.product_id = cte.product_id
 )

Instead of EXTRACT(MONTH FROM invoice_date) you could use TRUNC(invoice_date, 'mon', but I prefer Standard SQL syntax.

And this will return your 2nd result:

WITH cte AS
 ( -- data from one month
   SELECT *
   FROM invoice 
   WHERE invoice_date BETWEEN DATE '2015-02-01' AND DATE '2015-02-28'
 )
SELECT DISTINCT t1.customer_id, t2.customer_id, t1.product_id -- need DISTINCT because there might be multiple rows per product/customer
FROM cte t1 JOIN cte t2
  ON t1.product_id = t2.product_id    -- same product
  AND t1.customer_id < t2.customer_id -- different customers
WHERE t1.quantity BETWEEN t2.quantity / 1.05 AND t2.quantity * 1.05

You need to add join this result back to customer and product for more details.

Upvotes: 1

APC
APC

Reputation: 146209

You need to know that we can use TRUNC() on Oracle dates; depending on the format mask we can get the first day of the year or the first day of the month. This query uses both tricks to generate a list of months and join it to invoices. It also uses a cross join on customers, to produce a matrix of customer months.

So now we know whether customers bought something in each month:

select months.mm
       , c.id as customer_id
       , nvl2(max(i.invoice_number), 'Y', 'N')  as bought_something
from ( select add_months(trunc(sysdate, 'yyyy'), level-1) as mm
                 from dual
                 connect by level <= 4 ) months
     cross join customer c
     left outer join invoice i
     on months.mm = trunc(i.invoice_date, 'MM')
        and c.id = i.customer_id
group by months.mm, c.id 

We can feed this result into another query:

with mtrx as (
    select months.mm
           , c.id as customer_id
           , nvl2(max(i.invoice_number), 'Y', 'N')  as bought_something
    from ( select add_months(trunc(sysdate, 'yyyy'), level-1) as mm
                     from dual
                     connect by level <= 4 ) months
         cross join customer c
         left outer join invoice i
         on months.mm = trunc(i.invoice_date, 'MM')
            and c.id = i.customer_id
    group by months.mm, c.id 
    ) 
select customer_id from mtrx where mm = date '2015-01-01' and bought_something = 'Y'
intersect
select customer_id from mtrx where mm = date '2015-02-01' and bought_something = 'Y'
intersect
select customer_id from mtrx where mm = date '2015-03-01' and bought_something = 'Y'
intersect
select customer_id from mtrx where mm = date '2015-04-01' and bought_something = 'N'
;

This may not me the "analytical" solution your professor is expecting but it does produce the right result. Find the obligatory SQL Fiddle here.

Tweaking the final result to get the Customer name is left as an exercise for the reader. Likewise refining the result set to the same item in each month :)

Upvotes: 0

Related Questions