Reputation: 3515
How can I return a variable (n) number of rows with FETCH FIRST (n) ROWS ONLY
in DB2 SQL..? This is for DB2/400 v7r3.
The documentation for FETCH
does clearly state it cannot be done, but I cannot imagine another way to do it...
...fetch-row-count must not contain a scalar-fullselect, a column reference, a table reference, a user-defined function reference, or a built-in scalar function...
...which I take as meaning it must be a constant such as "10"
, and cannot be a variable or column name, which unfortunately is exactly what I would like to do.
This works:
SELECT PREFILTER.*
FROM PREFILTER
INNER JOIN GT1 ON FILTERED.GTAMT=GT1.LOSTAMT
ORDER BY GTDATE DESC
FETCH FIRST 10 ROWS ONLY
This does not work:
SELECT PREFILTER.*
FROM PREFILTER
INNER JOIN GT1 ON PREFILTER.GTAMT=GT1.LOSTAMT
ORDER BY GTDATE DESC
FETCH FIRST (GT1.LOSTAMT) ROWS ONLY <=== changed here
The following error is received:
SQL Error [428H7]: [SQ20467] Expression containing LOSTAMT must calculate a constant value.
This query is intended to find financial transaction records which are orphaned or keyed incorrectly in a larger table with rather loose normalization & referential integrity (or lack thereof).
PREFILTER
is a query which returns a subset of the main table, and GT1
is another query which calculates a smaller & more complex subset of those records. Then JOIN
them, and return (n)
rows as specified by the column LOSTQTY
, in descending date order. So it should only return the (n) most recent records.
Note that I do recognize my placement of FETCH
is incorrect, and it (or whatever it morphs into) would likely have to move into one of the CTE queries such as GT1
.
Also, at this early stage, it seems apparent I'm getting some Cartesian results, but that will likely be settled once the FETCH
issue is resolved.
For reference, here is the full SQL of the project:
WITH --SET THE INITIAL ACCOUNT & DATE RANGE
PREFILTER AS ( SELECT *
FROM GLTRANT
WHERE GTDATE > 20170000
AND GTACCT=112068
),
--CREATE LIST OF ALL POSITIVE VALUES
POSVALS AS ( SELECT GTAMT AS POSAMT, COUNT(GTAMT) AS POSC
FROM PREFILTER
WHERE GTAMT > 0
GROUP BY GTAMT
),
--CREATE LIST OF ALL NEGATIVE VALUES, WITH SIGN DROPPED
NEGVALS AS ( SELECT ABS(GTAMT) AS NEGAMT, COUNT(GTAMT) AS NEGC
FROM PREFILTER
WHERE GTAMT < 0
GROUP BY ABS(GTAMT)
),
--CALCULATE DISCREPANCIES BETWEEN THE TWO LISTS. SUBTRACT THE TWO AND MULTIPLY THE SIGN BY THE ABSOLUTE VALUE
--OF THE DIFFERENCE. THEN TO RESTORE THE SIGN, MULTIPLY THE AMOUNT BY THE SIGN OF THE DIFFERENCE.
FOJ AS ( SELECT SIGN(COALESCE(POSC,0)-COALESCE(NEGC,0))*COALESCE(POSAMT,NEGAMT) AS LOSTAMT,
ABS (COALESCE(POSC,0)-COALESCE(NEGC,0)) AS LOSTQTY
FROM POSVALS
FULL OUTER JOIN NEGVALS ON POSAMT=NEGAMT
WHERE COALESCE(POSC,0)-COALESCE(NEGC,0) <> 0
),
--GET DISCREPANCIES WITH COUNT >1
GT1 AS ( SELECT *
FROM FOJ
WHERE LOSTQTY>1
)
--SEARCH PREFILTER FOR EACH AMOUNT (LOSTAMT) IN GT1 AND RETURN THE MOST RECENT (LOSTQTY) RECORDS
SELECT PREFILTER.*
FROM PREFILTER
INNER JOIN GT1 ON PREFILTER.GTAMT=GT1.LOSTAMT
ORDER BY GTDATE DESC
FETCH FIRST (GT1.LOSTQTY) ROWS ONLY --DOES NOT WORK
Upvotes: 0
Views: 2429
Reputation: 4532
The reference page does indeed say that you cannot use a column reference, as you attempted to do, as the expression of the number of rows to return.
What it does allow is an expression. There are rules as to what it says is not allowed, but those rules do not exclude the use of a host variable.
FETCH FIRST :xqty ROWS ONLY
Upvotes: 0
Reputation: 3515
Ok, after absorbing Mustaccio's previous answer and banging my head on the IBM documentation all day, I was able to work this out. Much thanks to Mustaccio for pointing me in the right direction.
Two important bits: the location of the ROW_NUMBER() had to move further down; it didn't work for this where it was. Also, it needed the addition of the PARTITION BY
parameter so the numbering restarts with each new group of values. So it goes something like 1-2-1-2-1-2-3 instead of 1-2-3-4-5-6-7. This is critical for WHERE RN <= LOSTQTY
to work.
Here's the full result:
WITH --SET THE INITIAL ACCOUNT & DATE RANGE
PREFILTER AS ( SELECT GTCOMP, GTACCT, GTDATE, GTSRCE, GTREF#, GTENT#, GTAMT, GTDESC, "GTPO#",
"GTCHK#", "GTINV#", GTCKAC, GT1099, GTXXX1, GTAFLG, GTVEND, "GTBAT#"
FROM F_CERTOB.GLTRANT
WHERE GTDATE > 20180000
AND GTACCT=112068
),
--CREATE LIST OF ALL POSITIVE VALUES
POSVALS AS ( SELECT GTAMT AS POSAMT, COUNT(GTAMT) AS POSC
FROM PREFILTER
WHERE GTAMT > 0
GROUP BY GTAMT
),
--CREATE LIST OF ALL NEGATIVE VALUES, WITH SIGN DROPPED
NEGVALS AS ( SELECT ABS(GTAMT) AS NEGAMT, COUNT(GTAMT) AS NEGC
FROM PREFILTER
WHERE GTAMT < 0
GROUP BY ABS(GTAMT)
),
--CALCULATE DISCREPANCIES BETWEEN THE TWO LISTS. SUBTRACT THE "DE-SIGNED" NEGATIVE FROM THE POSITIVE AND
--MULTIPLY THE SIGN BY THE ABSOLUTE VALUE OF THE DIFFERENCE. THEN TO RESTORE THE SIGN, MULTIPLY THE AMOUNT
--BY THE SIGN OF THE DIFFERENCE. THIS IS A FULL OUTER JOIN, SO NULLS ARE A GIVEN, AND COALESCE() IS USED
--TO FILL IN THE HOLES.
FOJ AS ( SELECT SIGN(COALESCE(POSC,0)-COALESCE(NEGC,0))*COALESCE(POSAMT,NEGAMT) AS LOSTAMT,
ABS (COALESCE(POSC,0)-COALESCE(NEGC,0)) AS LOSTQTY
FROM POSVALS
FULL OUTER JOIN NEGVALS ON POSAMT=NEGAMT
WHERE COALESCE(POSC,0)-COALESCE(NEGC,0) <> 0
),
--THIS IS AN EXTRA KNOB TO CONTROL THE NUMBER OF RESULTS DURING DEVELOPMENT. IF SET TO >1, IT WILL SHOW
--ONLY THE DATA THAT WOULD RETURN TWO OR MORE ROWS. USEFUL WHEN 99% OF THE DATA WOULD BE ONE ROW. THIS
--WAS NEEDED TO DEVELEOP & TEST "ROW_NUMBER() OVER (PARTITION BY GTAMT ORDER BY GTDATE DESC) AS RN" AND
--IF THE NUMBERING WAS RESTARTING PROPERLY FOR EACH GROUP OF VALUES.
GT AS ( SELECT *
FROM FOJ
WHERE LOSTQTY>0
),
--RETRIEVE THE ITEMS, RANK THEM BY AMT & DATE. USE PARTITON-BY TO RESTART ROW NUMBERING FOR EACH GROUP
--OF AMOUNTS. USE ORDER-BY TO NUMBER BY DATE IN DESCENDING ORDER.
GROUPED AS ( SELECT PREFILTER.*, LOSTQTY,
ROW_NUMBER() OVER (PARTITION BY GTAMT ORDER BY GTDATE DESC) AS RN
FROM PREFILTER
INNER JOIN GT ON GTAMT=LOSTAMT
),
--NARROW IT DOWN TO ONLY THE TOP (n) ITEMS
RECENT AS ( SELECT *
FROM GROUPED
WHERE RN <= LOSTQTY
)
SELECT *
FROM RECENT
ORDER BY GTAMT
Upvotes: 0
Reputation: 18945
You should be able to use a window function ROW_NUMBER
for this:
WITH --SET THE INITIAL ACCOUNT & DATE RANGE
PREFILTER AS (
SELECT T.*,
ROW_NUMBER() OVER (ORDER BY T.GTDATE DESC) AS RN
FROM (
SELECT *
FROM GLTRANT
WHERE GTDATE > 20170000
AND GTACCT=112068
) T
),
...
--SEARCH PREFILTER FOR EACH AMOUNT (LOSTAMT) IN GT1 AND RETURN THE MOST RECENT (LOSTQTY) RECORDS
SELECT PREFILTER.*
FROM PREFILTER
INNER JOIN GT1 ON PREFILTER.GTAMT=GT1.LOSTAMT
WHERE RN <= GT1.LOSTQTY
ORDER BY GTDATE DESC
It assigns consecutive numbers for each row in PREFILTER
in the descending order of GTDATE
, which you can later use to limit the result set.
Upvotes: 4
Reputation: 1605
I'm not sure you can do what you want in a single statement. However, you should be able to put together a stored procedure or SQL function that can do this instead. In the stored procedure, put together the query and then open it with a cursor. The cursor can then fetch the number of rows you want, put that into a result set, and return that result set from the procedure. Current documentation from IBM on that is here: https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_73/sqlp/rbafyresultsets.htm
Upvotes: 1