Reputation: 7493
Background: I need a way to count the number of records that we will be running through our Archive process. The query I have below what I have come up with but I feel like there has to be a better way (Takes about 20minutes to run for the last 90 days)
The Query: We have a table called Table_Main (Not its real name) which contains alot of records (300 million +) and is constantly being added to (Hence the archiving). The table contains parent records that have a TR_ID, which links them to another table, while the children do not have a TR_ID. The children are attached to the parents via a Parent_ID. The tricky part here is that I need to count the records based of values contained in the Trans table which is linked via TR_ID.
Select B.Some_date, B.xx_ID, count(*)
From
(
select tr.Some_date, tr.xx_ID, p.H_ID
from Table_Main p, Trans tr
where p.TR_ID = tr.TR_ID
and tr.Some_date>sysdate-90
--
UNION ALL
--
Select Result.Some_date , Result.xx_ID, th.H_ID
from (
select tr.Some_date, tr.xx_ID, p.H_ID
from Table_Main p, Trans tr
where p.TR_ID = tr.TR_ID and tr.Some_date>sysdate-90
) Result
inner join Table_Main th on th.Parent_id = Result.H_ID and th.Parent_ID is not null
) B
group by B.Some_date, B.xx_ID
order by B.Some_date;
Question/Idea: Is there any way this could be simplified with one join between Table_Main and itself? For example a join that keeps the first record from the parent plus all the joined children? I was trying something like the query below but not getting anywhere.
Select a.Some_date, A.xx_ID, p.H_ID, c.*
from Table_Main p
inner join Trans a on p.TR_ID = a.TR_ID
left join Table_Main c on c.Parent_id = p.H_ID
where a.Some_date> sysdate-20
order by p.H_ID, c.H_ID
Table_Main
-H_ID
-Parent_ID //Links child to parent
-TR_ID //Links to Trans Table
Trans
-TR_ID //Link to Table_Main
-xx_ID //This is used to group on
-Some_Date //Used to group on
Sample Input:
Table_Main
H_ID Parent_ID TR_ID
1 NULL 1
2 1 NULL
3 NULL 2
4 NULL 3
5 4 NULL
6 4 NULL
7 NULL 4
8 7 NULL
9 NULL 5
10 9 NULL
11 9 NULL
12 9 NULL
13 9 NULL
14 9 NULL
15 9 NULL
16 9 NULL
Trans
TR_ID XX_ID Some_Date
1 45 12/1/2015
2 4 12/1/2015
3 6 12/20/2015
4 45 12/1/2015
5 23 12/22/2015
Desired Output:
Date xx_ID Count
12/1/2015 4 1
12/1/2015 45 4
12/20/2015 6 3
12/22/2015 23 8
Thanks in advance for any help you can provide.
Upvotes: 1
Views: 1748
Reputation: 23578
Thanks for adding the input and expected output data - that makes it a lot easier for us to make sure we have the answer you're expecting!
Here's one way of doing it:
with table_main as (select 1 H_ID, null Parent_ID, 1 TR_ID from dual union all
select 2 H_ID, 1 Parent_ID, NULL TR_ID from dual union all
select 3 H_ID, NULL Parent_ID, 2 TR_ID from dual union all
select 4 H_ID, NULL Parent_ID, 3 TR_ID from dual union all
select 5 H_ID, 4 Parent_ID, NULL TR_ID from dual union all
select 6 H_ID, 4 Parent_ID, NULL TR_ID from dual union all
select 7 H_ID, NULL Parent_ID, 4 TR_ID from dual union all
select 8 H_ID, 7 Parent_ID, NULL TR_ID from dual union all
select 9 H_ID, NULL Parent_ID, 5 TR_ID from dual union all
select 10 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 11 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 12 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 13 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 14 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 15 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 16 H_ID, 9 Parent_ID, NULL TR_ID from dual),
trans as (select 1 TR_ID, 45 XX_ID, to_date('12/01/2015', 'mm/dd/yyyy') Some_Date from dual union all
select 2 TR_ID, 4 XX_ID, to_date('12/01/2015', 'mm/dd/yyyy') Some_Date from dual union all
select 3 TR_ID, 6 XX_ID, to_date('12/20/2015', 'mm/dd/yyyy') Some_Date from dual union all
select 4 TR_ID, 45 XX_ID, to_date('12/01/2015', 'mm/dd/yyyy') Some_Date from dual union all
select 5 TR_ID, 23 XX_ID, to_date('12/22/2015', 'mm/dd/yyyy') Some_Date from dual)
-- end of mimicking your tables with data in them. See SQL below:
select tr1.some_date,
tr1.xx_id,
count(*) cnt
from (select h_id,
parent_id,
max(tr_id) over (partition by connect_by_root(h_id)) tr_id
from table_main tm
connect by prior h_id = parent_id
start with parent_id is null) tm1
inner join trans tr1 on (tm1.tr_id = tr1.tr_id)
group by tr1.some_date,
tr1.xx_id
order by tr1.some_date,
tr1.xx_id;
SOME_DATE XX_ID CNT
---------- ---------- ----------
12/01/2015 4 1
12/01/2015 45 4
12/20/2015 6 3
12/22/2015 23 8
Basically this first does a hierarchical query (connect by...
) to link the parent and child rows.
Then we use the connect_by_root
function to identify the top level h_id across all the parent-child rows.
Once we have that, we can then use an analytic function to return the tr_id across all the parent and child rows per each top level h_id (I've used max() here, as it looks like only parent rows will have a tr_id).
Then it's a simple matter of joining to the trans table and doing the aggregate count.
Here's a (hopefully faster!) modified way of doing the same thing, given that there are only two possible levels of the hierarchy:
with table_main as (select 1 H_ID, null Parent_ID, 1 TR_ID from dual union all
select 2 H_ID, 1 Parent_ID, NULL TR_ID from dual union all
select 3 H_ID, NULL Parent_ID, 2 TR_ID from dual union all
select 4 H_ID, NULL Parent_ID, 3 TR_ID from dual union all
select 5 H_ID, 4 Parent_ID, NULL TR_ID from dual union all
select 6 H_ID, 4 Parent_ID, NULL TR_ID from dual union all
select 7 H_ID, NULL Parent_ID, 4 TR_ID from dual union all
select 8 H_ID, 7 Parent_ID, NULL TR_ID from dual union all
select 9 H_ID, NULL Parent_ID, 5 TR_ID from dual union all
select 10 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 11 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 12 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 13 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 14 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 15 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 16 H_ID, 9 Parent_ID, NULL TR_ID from dual),
trans as (select 1 TR_ID, 45 XX_ID, to_date('12/01/2015', 'mm/dd/yyyy') Some_Date from dual union all
select 2 TR_ID, 4 XX_ID, to_date('12/01/2015', 'mm/dd/yyyy') Some_Date from dual union all
select 3 TR_ID, 6 XX_ID, to_date('12/20/2015', 'mm/dd/yyyy') Some_Date from dual union all
select 4 TR_ID, 45 XX_ID, to_date('12/01/2015', 'mm/dd/yyyy') Some_Date from dual union all
select 5 TR_ID, 23 XX_ID, to_date('12/22/2015', 'mm/dd/yyyy') Some_Date from dual)
-- end of mimicking your tables with data in them. See SQL below:
select tr.some_date,
tr.xx_id,
count(*) cnt
from table_main tm1
left join table_main tm2 on (tm1.h_id = coalesce(tm2.parent_id, tm2.h_id) and tm1.parent_id is null)
inner join trans tr on (tm1.tr_id = tr.tr_id)
group by tr.some_date,
tr.xx_id
order by tr.some_date,
tr.xx_id;
SOME_DATE XX_ID CNT
---------- ---------- ----------
12/01/2015 4 1
12/01/2015 45 4
12/20/2015 6 3
12/22/2015 23 8
Another possible answer that doesn't involve a self join, but does rely on analytic functions:
with table_main as (select 1 H_ID, null Parent_ID, 1 TR_ID from dual union all
select 2 H_ID, 1 Parent_ID, NULL TR_ID from dual union all
select 3 H_ID, NULL Parent_ID, 2 TR_ID from dual union all
select 4 H_ID, NULL Parent_ID, 3 TR_ID from dual union all
select 5 H_ID, 4 Parent_ID, NULL TR_ID from dual union all
select 6 H_ID, 4 Parent_ID, NULL TR_ID from dual union all
select 7 H_ID, NULL Parent_ID, 4 TR_ID from dual union all
select 8 H_ID, 7 Parent_ID, NULL TR_ID from dual union all
select 9 H_ID, NULL Parent_ID, 5 TR_ID from dual union all
select 10 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 11 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 12 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 13 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 14 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 15 H_ID, 9 Parent_ID, NULL TR_ID from dual union all
select 16 H_ID, 9 Parent_ID, NULL TR_ID from dual),
trans as (select 1 TR_ID, 45 XX_ID, to_date('12/01/2015', 'mm/dd/yyyy') Some_Date from dual union all
select 2 TR_ID, 4 XX_ID, to_date('12/01/2015', 'mm/dd/yyyy') Some_Date from dual union all
select 3 TR_ID, 6 XX_ID, to_date('12/20/2015', 'mm/dd/yyyy') Some_Date from dual union all
select 4 TR_ID, 45 XX_ID, to_date('12/01/2015', 'mm/dd/yyyy') Some_Date from dual union all
select 5 TR_ID, 23 XX_ID, to_date('12/22/2015', 'mm/dd/yyyy') Some_Date from dual)
-- end of mimicking your tables with data in them. See SQL below:
select tr.some_date,
tr.xx_id,
count(*) cnt
from (select max(tr_id) over (partition by coalesce(parent_id, h_id)) tr_id
from table_main) tm1
inner join trans tr on (tm1.tr_id = tr.tr_id)
group by tr.some_date,
tr.xx_id
order by tr.some_date,
tr.xx_id;
SOME_DATE XX_ID CNT
---------- ---------- ----------
12/01/2015 4 1
12/01/2015 45 4
12/20/2015 6 3
12/22/2015 23 8
Upvotes: 1
Reputation: 167972
Oracle Setup:
CREATE TABLE TRANS(
TR_ID NUMBER(10,0) PRIMARY KEY,
XX_ID NUMBER(10,0),
Some_Date DATE
);
CREATE TABLE TABLE_MAIN (
H_ID NUMBER(10,0) PRIMARY KEY,
PARENT_ID NUMBER(10,0) REFERENCES TABLE_MAIN( H_ID ),
TR_ID NUMBER(10,0) REFERENCES TRANS( TR_ID )
);
INSERT INTO TRANS
SELECT 1, 45, DATE '2015-12-01' FROM DUAL UNION ALL
SELECT 2, 4, DATE '2015-12-01' FROM DUAL UNION ALL
SELECT 3, 6, DATE '2015-12-20' FROM DUAL UNION ALL
SELECT 4, 45, DATE '2015-12-01' FROM DUAL UNION ALL
SELECT 5, 23, DATE '2015-12-22' FROM DUAL;
INSERT INTO TABLE_MAIN
SELECT 1, NULL, 1 FROM DUAL UNION ALL
SELECT 2, 1, NULL FROM DUAL UNION ALL
SELECT 3, NULL, 2 FROM DUAL UNION ALL
SELECT 4, NULL, 3 FROM DUAL UNION ALL
SELECT 5, 4, NULL FROM DUAL UNION ALL
SELECT 6, 4, NULL FROM DUAL UNION ALL
SELECT 7, NULL, 4 FROM DUAL UNION ALL
SELECT 8, 7, NULL FROM DUAL UNION ALL
SELECT 9, NULL, 5 FROM DUAL UNION ALL
SELECT 10, 9, NULL FROM DUAL UNION ALL
SELECT 11, 9, NULL FROM DUAL UNION ALL
SELECT 12, 9, NULL FROM DUAL UNION ALL
SELECT 13, 9, NULL FROM DUAL UNION ALL
SELECT 14, 9, NULL FROM DUAL UNION ALL
SELECT 15, 9, NULL FROM DUAL UNION ALL
SELECT 16, 9, NULL FROM DUAL;
Query:
SELECT some_date,
xx_id,
COUNT(*)
FROM Trans t
INNER JOIN
(
SELECT CONNECT_BY_ROOT( TR_ID ) AS TR_ID
FROM Table_Main
START WITH H_ID IS NOT NULL
CONNECT BY PRIOR H_ID = PARENT_ID
) m
ON ( t.TR_ID = m.TR_ID )
WHERE some_date > SYSDATE - 90
GROUP BY some_date,
xx_id;
Results:
SOME_DATE XX_ID COUNT(*)
--------- ---------- ----------
01-DEC-15 4 1
01-DEC-15 45 4
22-DEC-15 23 8
20-DEC-15 6 3
Upvotes: 1