Patrick
Patrick

Reputation: 2577

Simplify a redundant left join

Using Oracle, I'm looking to do the following query, but I'd like to know if there is a more "intelligent" way to do it.

Select * from Sales Sales1
left join Sales Sales2 on Sales2.val = Sales1.val
left join Sales Sales3 on Sales3.val = Sales2.val
left join Sales Sales4 on Sales4.val = Sales3.val
left join Sales Sales5 on Sales5.val = Sales4.val
...

Here's what my sample data might look like

customer number | acct | start balance | open date | prev account 
       a            1         100         01-01-15       b-1     
       b            1          80         03-04-14       
       c            2         200         04-11-14       c-1
       c            1         150         06-12-15        
       d            1         600         08-16-15         
       e            3         400         12-19-15       e-2
       e            2         150         10-21-14       e-1
       e            1         100         01-18-13       

And a result set would look like this:

   Customer | start | open    | prevStart_01 | prevOpen_01 | prevStart_02 | prevOpen_02    
        a-1 |    100| 01-01-15|            80 |   03-04-14 |             | 
        c-2 |    200| 04-11-14|           150 |   06-11-14 |             | 
        e-3 |    400| 12-19-15|           150 |   10-21-14 |          100|   01-18-13 

As you can see, I need to keep joining another record of sales based upon the result, and I need to keep doing so until I return an empty result set. My current scenario is running the query and seeing whether there are values in sales5, sales6, sales7, and so on.

Upvotes: 0

Views: 180

Answers (1)

Matthew McPeak
Matthew McPeak

Reputation: 17944

Whenever you have to self-join an unknown number of times, you should be thinking CONNECT BY. Your particular need here isn't so straightforward, but CONNECT BY is still the key element of the solution.

In the SQL below, the mockup_data subfactor is just to give me some data. You'd use your actual table.

The idea is that you search your data for "root" -- records that are not a prev_account of any other record. Then, you start with those and CONNECT BY to get all their previous accounts, as many as there are. Then you PIVOT to get them all into columns.

One thing -- an Oracle SQL statement cannot have an arbitrary (data-driven) number of columns. The number must be known when the SQL is parsed. Therefore, in your PIVOT clause, you need to specify the maximum number of "levels" you'll support, so that Oracle knows how many columns the result set could have.

Here's the SQL.

WITH 
mockup_data as (
SELECT   
'a' customer_Number,           1 acct,         100 start_balance,        to_date('01-01-15','MM-DD-YY') open_date,       'b-1' prev_account from dual union all      
SELECT 'b'            ,1,          80,         to_date('03-04-14','MM-DD-YY'), null       from dual union all 
SELECT 'c'            ,2,         200,         to_date('04-11-14','MM-DD-YY'),       'c-1' from dual union all 
SELECT 'c'            ,1,         150,         to_date('06-12-15','MM-DD-YY'),        null from dual union all 
SELECT 'd'            ,1,         600,         to_date('08-16-15','MM-DD-YY'),        null from dual union all 
SELECT 'e'            ,3,         400,         to_date('12-19-15','MM-DD-YY'),       'e-2' from dual union all 
SELECT 'e'            ,2,         150,         to_date('10-21-14','MM-DD-YY'),     'e-1' from dual union all 
SELECT 'e'            ,1,         100,         to_date('01-18-13','MM-DD-YY'),     null  from dual ),
data_with_roots AS
       (SELECT d.*,
               CASE
                 WHEN (SELECT COUNT (*)
                       FROM   mockup_data d2
                       WHERE  d2.prev_account = d.customer_number || '-' || d.acct) = 0 THEN
                   'Y'
                 ELSE
                   'N'
               END
                 is_root
        FROM   mockup_data d),
     hierarchy AS
       (SELECT CONNECT_BY_ROOT (customer_number) customer_number,
               CONNECT_BY_ROOT (acct) acct,
               CONNECT_BY_ROOT (start_balance) start_balance,
               CONNECT_BY_ROOT (open_date) open_date,
               start_balance prev_start_balance,
               open_date prev_open_date,
               LEVEL - 1 lvl
        FROM   data_with_roots d
        CONNECT BY customer_number || '-' || acct = PRIOR prev_account
        START WITH is_root = 'Y'),
     previous_only AS
       (SELECT *
        FROM   hierarchy
        WHERE  lvl >= 1)
SELECT *
FROM   previous_only PIVOT (MAX (prev_start_balance) AS prev_start, MAX (prev_open_date) AS prev_open
                     FOR lvl
                     IN (1 AS "01", 2 AS "02", 3 AS "03", 4 AS "04", 5 AS "05" -- etc... as many levels as you need to support
                                                                              ));

Upvotes: 2

Related Questions