Reputation: 943
I have a select statement with more than 10 columns.I have to repeat the rows wherever the data is missing based on the date. The rows which are to be generated should have data from the preceding rows sorted by date ascending. The date range to be considered is based on grouping of id.
The date is actually a range from 15th March to 16th April,but for sample I have taken only limited rows.
For example the data is as shown below.
ID Date Type Code Location
==== ====== === ==== ====
1 15-Mar TG RET X1
1 17-Mar GG CAN S2
1 20-Mar DTR ISS D2
2 14-Apr YT RR F2
2 16-Apr F FC F1
Excepted output:
ID Date Type Code Location
=== ==== ==== ==== ======
1 15-Mar TG RET X1
*1 16-Mar TG RET X1*
1 17-Mar GG CAN S2
*1 18-Mar GG CAN S2*
*1 19-Mar GG CAN S2*
1 20-Mar DTR ISS D2
2 14-Apr YT RR F2
*2 15-Apr YT RR F2*
2 16-Apr F FC F1
Upvotes: 3
Views: 10077
Reputation: 220762
Fear not, you must, when exposed to a problem that can be solved optimally using Oracle's MODEL
clause. The following query will return the wanted result:
SELECT id, d, type, code, location
FROM (
SELECT
id, d, type, code, location,
null min_d,
null max_d
FROM t
UNION ALL
SELECT
id, null, null, null, null,
MIN(d),
MAX(d)
FROM t
GROUP BY id
)
MODEL RETURN UPDATED ROWS
PARTITION BY (id)
DIMENSION BY (d)
MEASURES (type, code, location, min_d, max_d)
RULES (
type [FOR d FROM min_d[null] TO max_d[null] INCREMENT INTERVAL '1' DAY] =
NVL(type[cv(d)], type[cv(d) - 1]),
code [FOR d FROM min_d[null] TO max_d[null] INCREMENT INTERVAL '1' DAY] =
NVL(code[cv(d)], code[cv(d) - 1]),
location[FOR d FROM min_d[null] TO max_d[null] INCREMENT INTERVAL '1' DAY] =
NVL(location[cv(d)], location[cv(d) - 1])
)
ORDER BY id, d
| ID | D | TYPE | CODE | LOCATION |
|----|-------------------------|------|------|----------|
| 1 | March, 15 2015 00:00:00 | TG | RET | X1 |
| 1 | March, 16 2015 00:00:00 | TG | RET | X1 |
| 1 | March, 17 2015 00:00:00 | GG | CAN | S2 |
| 1 | March, 18 2015 00:00:00 | GG | CAN | S2 |
| 1 | March, 19 2015 00:00:00 | GG | CAN | S2 |
| 1 | March, 20 2015 00:00:00 | DTR | ISS | D2 |
| 2 | April, 14 2015 00:00:00 | YT | RR | F2 |
| 2 | April, 15 2015 00:00:00 | YT | RR | F2 |
| 2 | April, 16 2015 00:00:00 | F | FC | F1 |
Think of MODEL
as a SQL spreadsheet language, kind of like Microsoft Excel, but much more powerful - because SQL!
SELECT id, d, type, code, location
FROM (
-- This is your original data, plus two columns
SELECT
id, d, type, code, location,
null min_d,
null max_d
FROM t
UNION ALL
-- This is a utility record containing the MIN(d) and MAX(d) values for
-- each ID partition. We'll use these MIN / MAX values to generate rows
SELECT
id, null, null, null, null,
MIN(d),
MAX(d)
FROM t
GROUP BY id
)
-- We're using the RETURN UPDATED ROWS clause, as we don't want the utility
-- record from above in the results
MODEL RETURN UPDATED ROWS
-- Your requirement is to fill gaps between dates within each id PARTITION
PARTITION BY (id)
-- The dates are your DIMENSION, i.e. the axis along which we're generating rows
DIMENSION BY (d)
-- The remaining rows are the MEASURES, i.e. the calculated values in each "cell"
MEASURES (type, code, location, min_d, max_d)
-- The following RULES are used to generate rows. For each MEASURE, we simply
-- iterate from the MIN(d) to the MAX(d) value, referencing the min_d / max_d
-- values from the utility record above
RULES (
type [FOR d FROM min_d[null] TO max_d[null] INCREMENT INTERVAL '1' DAY] =
NVL(type[cv(d)], type[cv(d) - 1]),
code [FOR d FROM min_d[null] TO max_d[null] INCREMENT INTERVAL '1' DAY] =
NVL(code[cv(d)], code[cv(d) - 1]),
location[FOR d FROM min_d[null] TO max_d[null] INCREMENT INTERVAL '1' DAY] =
NVL(location[cv(d)], location[cv(d) - 1])
)
ORDER BY id, d
Upvotes: 2
Reputation: 8123
Here's a working example of a possible way to achieve your desired output. I'm utilizing Oracle's LAST_VALUE
analytic function with the IGNORE NULLS
option and ORDER BY
clause.
Test data:
CREATE TABLE so123 (
id NUMBER,
d DATE,
type VARCHAR2(10),
code VARCHAR2(10),
location VARCHAR2(10)
);
INSERT INTO so123 VALUES (1, DATE '2015-05-15', 'TG', 'RET', 'X1');
INSERT INTO so123 VALUES (1, DATE '2015-05-17', 'GG', 'CAN', 'S2');
INSERT INTO so123 VALUES (1, DATE '2015-05-20', 'DTR', 'ISS', 'D2');
INSERT INTO so123 VALUES (2, DATE '2015-04-14', 'YT', 'RR', 'F2');
INSERT INTO so123 VALUES (2, DATE '2015-04-16', 'F', 'FC', 'F1');
COMMIT;
The select itself:
WITH
dmm AS (
SELECT MIN(d) min_d, MAX(d) max_d FROM so123
)
SELECT
NVL(s.id, LAST_VALUE(s.id) IGNORE NULLS OVER (ORDER BY dt.d)) AS id,
dt.d,
NVL(s.type, LAST_VALUE(s.type) IGNORE NULLS OVER (ORDER BY dt.d)) AS type,
NVL(s.code, LAST_VALUE(s.code) IGNORE NULLS OVER (ORDER BY dt.d)) AS code,
NVL(s.location, LAST_VALUE(s.location) IGNORE NULLS OVER (ORDER BY dt.d)) AS location
FROM (
SELECT min_d + level - 1 as d
FROM dmm
CONNECT BY min_d + level - 1 <= max_d
) dt LEFT JOIN so123 s ON (dt.d = s.d)
ORDER BY dt.d
;
Output:
ID D TYPE CODE LOCATION
---------- ---------------- ---------- ---------- ----------
2 14-04-2015 00:00 YT RR F2
2 15-04-2015 00:00 YT RR F2
2 16-04-2015 00:00 F FC F1
2 17-04-2015 00:00 F FC F1
2 18-04-2015 00:00 F FC F1
2 19-04-2015 00:00 F FC F1
2 20-04-2015 00:00 F FC F1
2 21-04-2015 00:00 F FC F1
2 22-04-2015 00:00 F FC F1
2 23-04-2015 00:00 F FC F1
2 24-04-2015 00:00 F FC F1
2 25-04-2015 00:00 F FC F1
2 26-04-2015 00:00 F FC F1
2 27-04-2015 00:00 F FC F1
2 28-04-2015 00:00 F FC F1
2 29-04-2015 00:00 F FC F1
2 30-04-2015 00:00 F FC F1
2 01-05-2015 00:00 F FC F1
2 02-05-2015 00:00 F FC F1
2 03-05-2015 00:00 F FC F1
2 04-05-2015 00:00 F FC F1
2 05-05-2015 00:00 F FC F1
2 06-05-2015 00:00 F FC F1
2 07-05-2015 00:00 F FC F1
2 08-05-2015 00:00 F FC F1
2 09-05-2015 00:00 F FC F1
2 10-05-2015 00:00 F FC F1
2 11-05-2015 00:00 F FC F1
2 12-05-2015 00:00 F FC F1
2 13-05-2015 00:00 F FC F1
2 14-05-2015 00:00 F FC F1
1 15-05-2015 00:00 TG RET X1
1 16-05-2015 00:00 TG RET X1
1 17-05-2015 00:00 GG CAN S2
1 18-05-2015 00:00 GG CAN S2
1 19-05-2015 00:00 GG CAN S2
1 20-05-2015 00:00 DTR ISS D2
37 rows selected
How does that work? We generate all the dates between the MIN and MAX dates from the source table. To do that, we use the CONNECT BY
clause to make Oracle generate records until the condition min_d + level - 1 <= max_d
doesn't hold any longer.
Then, we take the generated records and LEFT JOIN
the source table to them. Here comes the LAST_VALUE
analytic function's magic into play. This function searches for the last non-null (the IGNORE NULLS
option) value in the table, using specified ordering and fills in the missing fields.
You can read more about that function here:
http://oracle-base.com/articles/misc/first-value-and-last-value-analytic-functions.php
Upvotes: 2