Jun
Jun

Reputation: 21

What is the most effective way of creating indexes under certain situation

OK, this is my condition and questions.. Assume that this is the only SQL we are using for tables/indexes

SELECT B.DIST_NM
, COUNT(+)
  FROM CUST A, DIST B
 WHERE A.COUNTRY_CD = 'USA'
   AND A.CUST_CD = B.CUST_CD
   AND A.CUST_ID IN (SELECT CUST_ID FROM ORDS WHERE ORDER_CD = '20')
   AND A.CUST_ID IN (SELECT CUST_ID FROM CUST_INFO WHERE INFO = 'N')
 GROUP BY B.DIST_NM

 ........................................

SELECT COUNT(*) FROM CUST WHERE A.COUNTRY_CD = 'USA' -> about half million rows
SELECT CUST_ID FROM ORDS WHERE ORDER_CD = '20' -> about 1.5 million rows
SELECT CUST_ID FROM CUST_INFO WHERE INFO = 'N' -> about 50 rows
  1. what are the most effective of creating index for each table ?

    -CUST : COUNTRY_CD + CUST_ID

    -DIST : CUST_CD + DIST_NM

    -ORDS : CUST_ID + ORDER_CD

    -CUST_INFO : INFO + CUST_ID

---> I think above indexes are the most effective indexes. but is there anything else I should think about?

  1. Accroding to the index above what is the execution plan going to look like? (join orders? ways? subquery roles?, can change sql if outcome is the same )

Can you explain this to me please... I am thinking since 1. SELECT CUST_ID FROM CUST_INFO WHERE INFO has the least rows it should be executed for the first time, 2. then join with CUST table, 3. then filter with SELECT CUST_ID FROM ORDS WHERE ORDER_CD = '20' subquery output, 4. then join with DIST table and then perform group by ...

Am I on the right track ?? Can anyone give me an advise??

Upvotes: 2

Views: 76

Answers (2)

Thorsten Kettner
Thorsten Kettner

Reputation: 94959

Let's think about how the DBMS may approach the task:

  1. Read ORDS: index idx1 on ords (order_cd, cust_id)
  2. Read CUST_INFO: index idx2 on cust_info (info, cust_id)
  3. Read CUST: index idx3 on cust (country_cd, cust_id, cust_cd) or index idx4 on cust (cust_id, country_cd, cust_cd)
  4. Join DIST: index idx5 on dist (cust_cd, dist_nm)

Or maybe so:

  1. Read CUST: index idx3 on cust (country_cd, cust_id, cust_cd) or index idx6 on cust (country_cd, cust_cd, cust_id)
  2. Look up ORDS: index idx7 on cust (cust_id, order_cd)
  3. Look up CUST_INFO: index idx8 on cust_info (cust_id, info)
  4. Join DIST: index idx5 on dist (cust_cd, dist_nm)

That makes 8 indexes that may or may not be used. Create them, use EXPLAIN PLAN to see which are used, then drop the others :-)

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270011

I would write the query as:

SELECT d.DIST_NM, COUNT(*)
FROM CUST c JOIN
     DIST d
     ON c.CUST_CD = d.CUST_CD
WHERE c.COUNTRY_CD = 'USA' AND
      c.CUST_ID IN (SELECT o.CUST_ID FROM ORDS o WHERE o.ORDER_CD = '20')
      c.CUST_ID IN (SELECT ci.CUST_ID FROM CUST_INFO ci WHERE ci.INFO = 'N')
GROUP BY d.DIST_NM;

I would recommend indexes on: CUST(COUNTRY_CD, CUST_ID, CUST_CD), DIST(CUST_CD, DIST_NM), ORDS(ORDER_CD, CUST_ID) AND CUST_INFO(INFO, CUST_ID).

Note the order of the keys in the indexes.

Upvotes: 1

Related Questions