Anshul

Reputation: 43

Fetching data and sums according to date in postgresql

I am working with a Greenplum (PostgreSQL) database and have the data structure below:

[image: sample data table]

I need to implement the logic below (part of which I have already implemented):

CASE WHEN PDATE IS NOT NULL THEN to_char(PDATE,'YYYY-MM-DD') 
WHEN PDATE IS NULL THEN to_char(NDATE,'YYYY-MM-DD N')
WHEN NDATE IS NULL THEN 'NO PO' ELSE 'NO PO' END

Based on that date, I need QTY and VName:

QTY: Sum(Qty) according to min (PDATE and NDATE)

VName: VName according to min (PDATE and NDATE)

DESIRED OUTPUT:

[image: desired output table]

So far I have written the query below:

SELECT
    ITEM ,
    MIN(CASE WHEN PDATE IS NOT NULL THEN to_char(PDATE,'YYYY-MM-DD') 
             WHEN PDATE IS NULL THEN to_char(NDATE,'YYYY-MM-DD N')
             WHEN NDATE IS NULL THEN 'NO PO' ELSE 'NO PO' END) AS PRO
FROM
    Table
GROUP BY
    ITEM

Please help me out with the query

Upvotes: 0

Views: 77

Answers (3)

Jon Roberts

Reputation: 2106

Anshul, your solution works, but it comes with a performance hit: it references your table twice, which forces the database to scan it twice. A better solution is to use an analytic (window) function and reference the table only once.

Here is an example:

    CREATE TABLE anshul
    (
      item character varying,
      pdate date,
      ndate date,
      qty integer,
      vname character varying
    )
    WITH (APPENDONLY=true)
    DISTRIBUTED BY (item);

    INSERT INTO ANSHUL VALUES 
    ('ABC', NULL, '2015-12-31', 10, 'Y JACK SOLLEN'),
    ('HRD', '2016-01-29', '2016-1-8', 5, 'H HARRIS'),
    ('HRD', '2015-09-07', '2015-10-09', 31, 'G JOE'),
    ('HRD', '2015-09-30', '2015-09-07', 28, 'K KAMATH'),
    ('GGT', '2015-12-10', '2015-12-12', 10, 'P QUIK'),
    ('GGT', '2015-12-27', NULL, 20, NULL),
    ('GGT', '2015-12-10', '2016-01-04', 22, 'U RITZ'),
    ('GGT', '2016-01-07', '2016-01-07', 22, 'S SUE DAL'),
    ('OWE', NULL, '2015-12-22', 6, 'J JASON NIT'),
    ('OWE', NULL, '2015-11-05', 2, 'P QUEER'),
    ('OWE', NULL, '2015-11-05', 5, 'K KITTAN');

And here is the query, which borrows some of the code you had already figured out.

    SELECT item,
            sum(qty) AS qty,
            array_to_string(array_agg(vname), ',') AS vname
    FROM    (
            SELECT  item, 
                    rank() OVER(PARTITION BY item ORDER BY desired_date) AS rank,
                    qty,
                    vname
            FROM    (SELECT item,
                            qty,
                            vname,
                            CASE WHEN PDATE IS NOT NULL THEN pdate
                            WHEN PDATE IS NULL THEN ndate END AS desired_date
                    FROM anshul
                    ) AS sub1
            ) AS sub
    WHERE sub.rank = 1
    GROUP BY item
    ORDER BY item;

And the results:

     item | qty |      vname       
    ------+-----+------------------
     ABC  |  10 | Y JACK SOLLEN
     GGT  |  32 | P QUIK,U RITZ
     HRD  |  31 | G JOE
     OWE  |   7 | K KITTAN,P QUEER
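
The query above does not return the date label from the question. Here is a minimal sketch of how the same rank-and-filter approach could add it, assuming the `anshul` table created above. The names `ranked`, `rnk` and `pro` are my own choices (`pro` follows the alias in the question), and I append `' N'` with `||` rather than putting it inside the `to_char` format string.

```sql
-- Sketch only: rank rows per item by COALESCE(pdate, ndate), keep the
-- earliest ones, and build the label from the surviving rows.
SELECT item,
       CASE WHEN min(pdate) IS NOT NULL THEN to_char(min(pdate), 'YYYY-MM-DD')
            WHEN min(ndate) IS NOT NULL THEN to_char(min(ndate), 'YYYY-MM-DD') || ' N'
            ELSE 'NO PO'
       END AS pro,
       sum(qty) AS qty,
       array_to_string(array_agg(vname), ',') AS vname
FROM   (
        SELECT item, pdate, ndate, qty, vname,
               -- NULLs sort last in ascending order, so a row with both
               -- dates NULL only ranks first when the item has no dates at all
               rank() OVER (PARTITION BY item
                            ORDER BY COALESCE(pdate, ndate)) AS rnk
        FROM   anshul
       ) AS ranked
WHERE  rnk = 1
GROUP BY item
ORDER BY item;
```

With the sample rows above I would expect the labels 2015-12-31 N, 2015-12-10, 2015-09-07 and 2015-11-05 N for ABC, GGT, HRD and OWE, with the same qty sums as before.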

Upvotes: 0

Anshul

Reputation: 43

Thanks, Tim, for your help. It took me some time to write the query, but in the end it is complete. I posted the question on the forum to save time, but it ended up taking time anyway.

Here is the query:

SELECT
    FO.ID ,
    (CASE WHEN FO.DateQ IS NOT NULL THEN to_char(FO.DateQ ,'YYYY-MM-DD') 
          WHEN FO.Datew IS NOT NULL THEN to_char(FO.Datew ,'YYYY-MM-DD N') 
          ELSE 'NO PO' END) AS DATER ,
    FO.QTY ,
    FO.VNAME
FROM
    (
        SELECT
            NT.ID ,
            PT.DATEQ ,
            PT.DATEW ,
            SUM(NT.QTY) AS QTY ,
            array_to_string(array_agg(NT.VNAME) ,', ') AS VNAME
        FROM
            TABLENAME NT INNER JOIN(
                SELECT
                    AST.ID ,
                    AST.DateQ ,
                    (CASE WHEN AST.DateQ IS NULL THEN AST.DateW ELSE NULL END) AS DateW
                FROM
                    (
                        SELECT
                            ID ,
                            MIN(PDATE) AS DATEQ ,
                            MIN(CASE WHEN pdate IS NULL THEN ndate END) DATEW
                        FROM
                            TABLENAME
                        GROUP BY
                            ID
                    ) AST
            ) PT
                ON NT.ID = PT.ID
            AND (NT.PDATE = PT.DATEQ
            OR NT.NDATE = PT.DATEW)
        GROUP BY
            NT.ID ,
            PT.DATEQ ,
            PT.DATEW
    ) FO
ORDER BY
    FO.ID

Here ID corresponds to ITEM in the question.

Upvotes: 1

Tim Biegeleisen

Reputation: 520968

The following answer assumes you are using Postgres version 8.4 or later, which ships with an aggregate function called ARRAY_AGG(). I use ARRAY_AGG() to collect the VNAME values for each ITEM-MIN(DATE) group into an array, which you can wrap in array_to_string() if you want a plain comma-separated string.

SELECT t1.ITEM, t1.DATE, t1.QTY, t1.VNAME
FROM
(
    SELECT t.ITEM, t.DATE AS DATE, SUM(t.QTY) AS QTY, ARRAY_AGG(VNAME) AS VNAME
    FROM
    (
        SELECT t.ITEM, LEAST(t.PDATE, t.NDATE) AS DATE, t.QTY, t.VNAME
        FROM Table t
    ) t
    GROUP BY t.ITEM, t.DATE
) t1
INNER JOIN
(
    SELECT t.ITEM, MIN(LEAST(t.PDATE, t.NDATE)) AS DATE
    FROM Table t
    GROUP BY t.ITEM
) t2
ON t1.ITEM = t2.ITEM AND t1.DATE = t2.DATE

Explanation:

The first subquery computes the QTY sums and VNAME aggregates for each ITEM-DATE combination, where DATE is the earlier of PDATE and NDATE. However, you only want the aggregates for the earliest date of each item. The second subquery, to which the first is joined, finds that earliest date per item, so the join filters out the unwanted groups and leaves you with the result you want.
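
One thing worth checking against your requirement: LEAST() picks the earlier of the two dates and ignores NULLs, whereas the CASE expression in the question prefers PDATE whenever it is not NULL. A small sketch of the difference, using two hand-written rows (the inline VALUES list and the column aliases are just for illustration):

```sql
-- Sketch only: compare LEAST() with "PDATE first, else NDATE".
SELECT pdate,
       ndate,
       LEAST(pdate, ndate)    AS least_date,     -- earlier of the two, NULLs ignored
       COALESCE(pdate, ndate) AS coalesce_date   -- pdate when present, else ndate
FROM   (VALUES
          (DATE '2015-09-30', DATE '2015-09-07'),
          (NULL,              DATE '2015-12-31')
       ) AS v(pdate, ndate);
```

For the first row LEAST() gives 2015-09-07 while COALESCE() gives 2015-09-30; for the second row both give 2015-12-31. If PDATE should always win when it is present, swapping LEAST() for COALESCE() in both subqueries is probably what you want.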

Since you are using Postgres 8.2, which does not ship with ARRAY_AGG(), you will have to define your own custom aggregate:

CREATE AGGREGATE ARRAY_AGG (anyelement)
(
    sfunc = array_append,
    stype = anyarray,
    initcond = '{}'
);
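
Once the aggregate exists, it can be called like the built-in one. A minimal usage sketch, assuming the CREATE AGGREGATE above succeeded (it will fail if your server already provides an ARRAY_AGG with the same signature); the inline VALUES rows are just for illustration:

```sql
-- Sketch only: collect names per item, then flatten the array to a string.
SELECT item,
       array_to_string(ARRAY_AGG(vname), ',') AS vname
FROM   (VALUES
          ('OWE', 'P QUEER'),
          ('OWE', 'K KITTAN')
       ) AS v(item, vname)
GROUP BY item;
```

The order of the names inside the result is not guaranteed unless the input rows are ordered.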

Upvotes: 0
