user1492226
user1492226

Reputation:

Eliminate Duplicate Rows on Outer Join

I am running a query against our Oracle database.

The goal is return the following columns -

  1. Document Id
  2. Document Creation Date
  3. Organization Code
  4. Document Status
  5. Total Amount

The problem I am running into is with the Organization Code. It is possible to have a document id with multiple organization codes. I only want 1 instance - I don't care about the rest (if they exist)

Here is what I currently have -

    SELECT * FROM (SELECT DISTINCT (K_HDR.DOC_HDR_ID), 
        K_HDR.CRTE_DT, 
        FS_EXT.VAL AS ORG_CODE,
        REQ.REQS_STAT_CD,
        FS_DOC.FDOC_TOTAL_AMT
    FROM PUR_REQS_T REQ, 
        KREW_DOC_HDR_T K_HDR, 
        FS_DOC_HEADER_T FS_DOC,
        KREW_DOC_HDR_EXT_T FS_EXT
    WHERE REQ.FDOC_NBR = K_HDR.DOC_HDR_ID AND
        FS_DOC.FDOC_NBR = REQ.FDOC_NBR AND
        REQ.FDOC_NBR = FS_EXT.DOC_HDR_ID(+) AND
        FS_EXT.KEY_CD(+)= 'organizationCode' AND
        (K_HDR.CRTE_DT BETWEEN TO_DATE('2011-10-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') 
        AND 
         TO_DATE('2012-09-30 23:59:59', 'YYYY-MM-DD HH24:MI:SS')))
    FINAL_SEARCH ORDER BY FINAL_SEARCH.CRTE_DT;

The following query returns 14,933 rows. The correct amount of rows I should be getting is 14,789.

The culprit is the Organization Code. For instance, as I'm looking at the result sets I see the following -

 DOC_ID   CRTE_DT      ORG_CD    STAT    TOTAL
 .
 .
 .
 496256    5-OCT-11     0        CLOS    2779.89
 496258    5-OCT-11     8050     CLOS    1737.5
 496258    5-OCT-11     8000     CLOS    1737.5
 .
 .
 .

How do I get rid of the annoying 2nd instance of 496258 which lives in the FS_EXT Table? (Obviously I need to get rid of the other instances of the same type of duplicate values)

Upvotes: 0

Views: 2166

Answers (2)

user1492226
user1492226

Reputation:

So - I ended up using another column in the FS_EXT Table to further filter down to the first instance of the Org Code.

Here is what the FS_EXT Table looks like if I am looking at columns that are filtered to only show entries for Document Id = 496258.
(Mind you that there could be different number of rows for any given doc id)

 DOC_HDR_EXT_ID     DOC_HDR_ID     KEY_CD                      VAL
 13318096           496258         documentDescription         misc items
 13318098           496258         organizationDocNumber       (null)
 13318099           496258         statusDescription           Closed
 13318101           496258         chartAndOrgCodeForResult    KS-1234
 13318102           496258         vendorName                  APPLE COMPUTERS
 13318103
 .
 .
 .
 .
 .
 13318115           496258         organizationCode            8000
 13318116
 .
 .
 .
 1338118            496258         organizationCode            8050

And here is my new query which circumvents using THE JOIN OPERATION.

Notice that I use a SUBQUERY instead. To get the first instance of the OrganizationCode, I use the MIN operator on the DOC_HDR_EXT_ID column and then retrieve the organizationCode VAL using that ID and pass that back to the main QUERY.

    SELECT * FROM ( SELECT DISTINCT (K_HDR.DOC_HDR_ID), 
         K_HDR.CRTE_DT, 
         (SELECT KS_EXT.VAL AS ORG_CODE 
              FROM KREW_DOC_HDR_EXT_T KS_EXT 
              WHERE KS_EXT.DOC_HDR_EXT_ID =(
              SELECT MIN(DOC_HDR_EXT_ID) 
              FROM KREW_DOC_HDR_EXT_T FS_EXT_INNER
              WHERE FS_EXT_INNER.DOC_HDR_ID = K_HDR.DOC_HDR_ID 
              AND FS_EXT_INNER.KEY_CD = 'organizationCode')) AS ORG_CODE,
         REQ.REQS_STAT_CD,
         FS_DOC.FDOC_TOTAL_AMT
         FROM PUR_REQS_T REQ, 
         KREW_DOC_HDR_T K_HDR, 
         FS_DOC_HEADER_T FS_DOC,
         KREW_DOC_HDR_EXT_T FS_EXT
         WHERE REQ.FDOC_NBR = K_HDR.DOC_HDR_ID AND
               FS_DOC.FDOC_NBR = REQ.FDOC_NBR AND
               REQ.FDOC_NBR = FS_EXT.DOC_HDR_ID(+) AND
               FS_EXT.KEY_CD(+)= 'organizationCode' AND
               (K_HDR.CRTE_DT BETWEEN TO_DATE('2011-10-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') AND TO_DATE('2012-09-30 23:59:59', 'YYYY-MM-DD HH24:MI:SS')))
     FINAL_SEARCH ORDER BY FINAL_SEARCH.CRTE_DT;

Thanks for your recommendation @Alex Poole and @StilesCrisis. You got me thinking differently about my approach to this problem and my solutions integrates both of your suggestions. MIN approach from Stiles and filtering another column per Alex Poole.

Upvotes: 1

StilesCrisis
StilesCrisis

Reputation: 16290

You could wrap the whole thing in one more SELECT which uses a GROUP BY to get only the MIN organization code.

Upvotes: 1

Related Questions