Sid
Sid

Reputation: 582

Oracle Query optimization for huge table for report

I am trying to get the output of the following query. It's taking forever and no output. Can you please help in optimizing the query? I tried using hint parallel but still no output. TabA has about 1.2 million and TabB 0.7 million records. Database is Oracle.

WITH TAB1 as (
SELECT
P_TXN_ID,PROD_CD,P_TYPE1,P_TYPE2,
(SELECT COALESCE(FT.V_PARTY_ID_TYPE,'XX') FROM TabA FT WHERE FT.P_TXN_ID=ST.P_TXN_ID AND FT.P_ROLE='FBO' AND LAST_UPD_DT='20-NOV-17' ) FBO,
(SELECT COALESCE(FT.V_PARTY_ID_TYPE,'XX') FROM TabA FT WHERE FT.P_TXN_ID=ST.P_TXN_ID AND FT.P_ROLE='ORG' AND LAST_UPD_DT='20-NOV-17' ) ORG,
(SELECT COALESCE(FT.V_PARTY_ID_TYPE,'XX') FROM TabA FT WHERE FT.P_TXN_ID=ST.P_TXN_ID AND FT.P_ROLE='SEND' AND LAST_UPD_DT='20-NOV-17' )  SEND,
(SELECT COALESCE(FT.V_PARTY_ID_TYPE,'XX') FROM TabA FT WHERE FT.P_TXN_ID=ST.P_TXN_ID AND FT.P_ROLE='RCV' AND LAST_UPD_DT='20-NOV-17' ) RCV,
(SELECT COALESCE(FT.V_PARTY_ID_TYPE,'XX') FROM TabA FT WHERE FT.P_TXN_ID=ST.P_TXN_ID AND FT.P_ROLE='SCND' AND LAST_UPD_DT='20-NOV-17' ) SCND,
(SELECT COALESCE(FT.V_PARTY_ID_TYPE,'XX') FROM TabA FT WHERE FT.P_TXN_ID=ST.P_TXN_ID AND FT.P_ROLE='BENE' AND LAST_UPD_DT='20-NOV-17' ) BENE,
(SELECT COALESCE(FT.V_PARTY_ID_TYPE,'XX') FROM TabA FT WHERE FT.P_TXN_ID=ST.P_TXN_ID AND FT.P_ROLE='INT1' AND LAST_UPD_DT='20-NOV-17' ) INT1,
(SELECT COALESCE(FT.V_PARTY_ID_TYPE,'XX') FROM TabA FT WHERE FT.P_TXN_ID=ST.P_TXN_ID AND FT.P_ROLE='INT2' AND LAST_UPD_DT='20-NOV-17' ) INT2,
(SELECT COALESCE(FT.V_PARTY_ID_TYPE,'XX') FROM TabA FT WHERE FT.P_TXN_ID=ST.P_TXN_ID AND FT.P_ROLE='INT3' AND LAST_UPD_DT='20-NOV-17' ) INT3,
(SELECT COALESCE(FT.V_PARTY_ID_TYPE,'XX') FROM TabA FT WHERE FT.P_TXN_ID=ST.P_TXN_ID AND FT.P_ROLE='INT4' AND LAST_UPD_DT='20-NOV-17' ) INT4,
(SELECT COALESCE(FT.V_PARTY_ID_TYPE,'XX') FROM TabA FT WHERE FT.P_TXN_ID=ST.P_TXN_ID AND FT.P_ROLE='INT5' AND LAST_UPD_DT='20-NOV-17' ) INT5
FROM 
(SELECT PROD_CD,P_TYPE1,P_TYPE2,LAST_UPD_DT,P_TXN_ID 
FROM TabB
) ST
WHERE LAST_UPD_DT>='20-NOV-17' AND LAST_UPD_DT<='22-NOV-17'  
)

SELECT /*+ parallel(100)*/
COUNT(*),
PROD_CD,
P_TYPE1,
FBO,ORG,SEND,RCV,SCND,BENE,
INT1,INT2,INT3,INT4,INT5
FROM 
TAB1 
GROUP BY PROD_CD,
P_TYPE1,
FBO,ORG,SEND,RCV,SCND,BENE,
INT1,INT2,INT3,INT4,INT5
ORDER BY PROD_CD;

Thanks in advance. Just to make things easy, below is the example above query is trying to achieve. Say there is table TXN with following data:

TXN
Col1 Col2 Col3 Type
1    AA    abc  FBO
2    null  abc  FBO
3    BB    abc  ORG
4    CC    def  ORG
5    DD    def  ORG
6    EE    pqr  SCND
7    EE    pqr  SCND
8    CC    def  RCV

Then output of the queries will be:

 Col2   Col3 FBO ORG SCND RCV
 AA     abc    1   0    0   0
 null   abc    1   0    0   0
 BB     abc    1   0    0   0
 CC     def    0   1    0   1
 DD     def    0   1    0   0
 EE     pqr    0   0    2   0

Upvotes: 1

Views: 697

Answers (4)

Paul Maxwell
Paul Maxwell

Reputation: 35603

To solve your sample data to result, you just need "conditional aggregates" , basically this means for your query count(case when ... then 1 end) as follows:

SELECT
    col2
  , col3
  , COUNT( CASE WHEN type = 'FBO'  THEN 1 END ) AS FBO
  , COUNT( CASE WHEN type = 'ORG'  THEN 1 END ) AS ORG
  , COUNT( CASE WHEN type = 'SCND' THEN 1 END ) AS SCND
  , COUNT( CASE WHEN type = 'RCV'  THEN 1 END ) AS RCV
FROM txn
GROUP BY
    col2
  , col3
ORDER BY
    col2
  , col3

and the result is:

+----+------+------+-----+-----+------+-----+
|    | COL2 | COL3 | FBO | ORG | SCND | RCV |
+----+------+------+-----+-----+------+-----+
|  1 | AA   | abc  |   1 |   0 |    0 |   0 |
|  2 | BB   | abc  |   0 |   1 |    0 |   0 |
|  3 | CC   | def  |   0 |   1 |    0 |   1 |
|  4 | DD   | def  |   0 |   1 |    0 |   0 |
|  5 | EE   | pqr  |   0 |   0 |    2 |   0 |
|  6 | NULL | abc  |   1 |   0 |    0 |   0 |
+----+------+------+-----+-----+------+-----+

so you can do this:

SELECT
    b.PROD_CD
  , b.P_TYPE1
  , b.P_TYPE2
  , COUNT( CASE WHEN FT.P_ROLE = 'FBO'  THEN a.V_PARTY_ID_TYPE END ) AS FBO
  , COUNT( CASE WHEN FT.P_ROLE = 'ORG'  THEN a.V_PARTY_ID_TYPE END ) AS ORG
  , COUNT( CASE WHEN FT.P_ROLE = 'SEND' THEN a.V_PARTY_ID_TYPE END ) AS SEND
  , COUNT( CASE WHEN FT.P_ROLE = 'RCV'  THEN a.V_PARTY_ID_TYPE END ) AS RCV
  , COUNT( CASE WHEN FT.P_ROLE = 'SCND' THEN a.V_PARTY_ID_TYPE END ) AS SCND
  , COUNT( CASE WHEN FT.P_ROLE = 'BENE' THEN a.V_PARTY_ID_TYPE END ) AS BENE
  , COUNT( CASE WHEN FT.P_ROLE = 'FBO'  THEN a.V_PARTY_ID_TYPE END ) AS FBO
  , COUNT( CASE WHEN FT.P_ROLE = 'INT1' THEN a.V_PARTY_ID_TYPE END ) AS INT1
  , COUNT( CASE WHEN FT.P_ROLE = 'INT2' THEN a.V_PARTY_ID_TYPE END ) AS INT2
  , COUNT( CASE WHEN FT.P_ROLE = 'INT3' THEN a.V_PARTY_ID_TYPE END ) AS INT3
  , COUNT( CASE WHEN FT.P_ROLE = 'INT4' THEN a.V_PARTY_ID_TYPE END ) AS INT4
  , COUNT( CASE WHEN FT.P_ROLE = 'INT5' THEN a.V_PARTY_ID_TYPE END ) AS INT5
FROM TabB b
JOIN TabA a ON a.P_TXN_ID = b.P_TXN_ID
AND a.LAST_UPD_DT = DATE '2017-11-20'
WHERE LAST_UPD_DT >= DATE '2017-11-20' AND LAST_UPD_DT <= DATE '2017-11-22'
GROUP BY
    PROD_CD
  , P_TYPE1
  , P_TYPE2;

But do note that IF a.V_PARTY_ID_TYPE is NULL then the count will NOT increment by 1.

Upvotes: 0

Marmite Bomber
Marmite Bomber

Reputation: 21095

You didn't post the execution plan so only guess based recommendation are possible.

Here a check list:

1) The table TABA is not indexed - obviously leading to lot of FULL TABLE SCANs

2) The table TABA is inappropriately indexed - e.g. there is an index on P_ROLE, LAST_UPD_DT which doesn't help, because P_TXN_ID is missing or

only the column P_TXN_ID is indexed and you have to scan a large number or row before you match the role and date

3) The table TABA is appropriately indexed, i.e. on P_TXN_ID, P_ROLE, LAST_UPD_DT but you process a large number of rows in TABB

Example to process 100K row from TABB you need 100K * 11 = 1.1M (you subquery count) index access. For large table with index on disc you will get no more that about 100 row per second. So you will be far better to get rid of it, perform a hash join of your tables.

Last but not least - your approach with parallel is similar to the case when your tires are empty and you react with full throttle.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270713

I find it hard to follow your code and your explanation. However, I'm pretty sure that conditional aggregation is all that you need.

Something like this:

SELECT b.PROD_CD, b.P_TYPE1, b.P_TYPE2,
       MAX(CASE WHEN FT.P_ROLE = 'FBO' THEN a.V_PARTY_ID_TYPE END) as FBO,
       MAX(CASE WHEN FT.P_ROLE = 'ORG' THEN a.V_PARTY_ID_TYPE END) as ORG,
       MAX(CASE WHEN FT.P_ROLE = 'SEND' THEN a.V_PARTY_ID_TYPE END) as SEND,
       MAX(CASE WHEN FT.P_ROLE = 'RCV' THEN a.V_PARTY_ID_TYPE END) as RCV,
       MAX(CASE WHEN FT.P_ROLE = 'SCND' THEN a.V_PARTY_ID_TYPE END) as SCND,
       MAX(CASE WHEN FT.P_ROLE = 'BENE' THEN a.V_PARTY_ID_TYPE END) as BENE,
       MAX(CASE WHEN FT.P_ROLE = 'FBO' THEN a.V_PARTY_ID_TYPE END) as FBO,
       MAX(CASE WHEN FT.P_ROLE = 'INT1' THEN a.V_PARTY_ID_TYPE END) as INT1,
       MAX(CASE WHEN FT.P_ROLE = 'INT2' THEN a.V_PARTY_ID_TYPE END) as INT2,
       MAX(CASE WHEN FT.P_ROLE = 'INT3' THEN a.V_PARTY_ID_TYPE END) as INT3,
       MAX(CASE WHEN FT.P_ROLE = 'INT4' THEN a.V_PARTY_ID_TYPE END) as INT4
       MAX(CASE WHEN FT.P_ROLE = 'INT5' THEN a.V_PARTY_ID_TYPE END) as INT5
FROM TabB b JOIN
     TabA a
      ON a.P_TXN_ID = b.P_TXN_ID AND
         a.LAST_UPD_DT = DATE '2017-11-20'
WHERE LAST_UPD_DT >= DATE '2017-11-20' AND LAST_UPD_DT <= DATE '2017-11-22'
GROUP BY PROD_CD, P_TYPE1, P_TYPE2;

Upvotes: 1

Florin Ghita
Florin Ghita

Reputation: 17643

  1. If tabB is partitioned by LAST_UPD_DT or indexed by LAST_UPD_DT, just be sure that '20-NOV-17'and '22-NOV-17' are Dates. For this, use to_date: to_date('22-NOV-2017','dd-mon-yyyy') etc. (It may by an idea to index this column but if you really update records, your updates will be slow.)
  2. do not use parallel 100 but parallel 8 or 16. 100 is too heavy and it may cripple the database.
  3. do you need order by PROD_CD? why? It will by faster without it.

Upvotes: 0

Related Questions