user2614856
user2614856

Reputation: 53

trouble using group by clause

I have data as follows

A        |  B      | C            |   D          |   E     |  F
217731   |  24856  | 01/01/2006   | 03/31/2007   |  0569   | 972450
217731   |  24856  | 04/01/2007   | 09/30/2008   |  0569   | 972450
217731   |  24856  | 10/01/2008   | 12/31/2008   |  0569   | 972450
217731   |  24856  | 01/01/2009   | 12/31/2009   |  0569   | 972450

217731   |  24856  | 01/01/2010   | 09/30/2011   |  0569   | 318704
217731   |  24856  | 10/01/2011   | 03/04/2012   |  0569   | 318704

217731   |  24856  | 03/05/2012   | 09/30/2012   |  0569   | 972450
217731   |  24856  | 10/01/2012   | 07/31/2013   |  0569   | 972450

And i need output as

 A     |  B     |  C          | D           | F
217731 | 24856  |  01/01/2006 | 31/12/2009  | 972450
217731 | 24856  |  01/01/2010 | 04/03/2012  | 318704
217731 | 24856  |  05/03/2012 | 31/07/2013  | 972450

when i use group by clause i get output as follows

 A      |  B      | C          | D          | F
217731  | 24856   | 01/01/2006 | 07/31/2013 | 972450
217731  | 24856   | 01/01/2010 | 03/04/2012 | 318704

Upvotes: 0

Views: 95

Answers (2)

valex
valex

Reputation: 24134

Also try to use this query if your table isn't big:

select A,B,min(C),max(D),F
from
(
select t2.*,
(select count(A) from t where 
  (t.A<>t2.A or t.B<>t2.B or t.F<>t2.F) 
   and t.c<t2.c) Group_n
from t t2
 ) t3
group by A,B,F,Group_n

SQLFiddle demo

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269553

You are trying to identify different periods of time that have the same keys. Fortunately, Oracle has a wealth of analytic functions for this, because a simple group by is not sufficient.

Here is the logic that the below query uses. The innermost subquery creates a flag (StartPeriod) based on the dates and key columns in the row. This identifies when a new period begins, because column C is not 1 day larger than the previous column D.

Then, the StartPeriod is accumulated as a cumulative sum. This assigns the same value to all groups in the same period. At this point, there is enough information for using group by. You need to include the additional grping column.

select A, B, MIN(C) as C, MAX(D) as D, E, F
from (select t.*, SUM(StartPeriod) over (partition by A, B, E, F order by C) as grping
      from (select t.*,
                   (case when lag(D) over (partition by A, B, E, F order by C) = C - 1
                         then 0
                         else 1
                    end) as StartPeriod
            from t
           ) t
     ) t
group by A, B, E, F, grping

Upvotes: 1

Related Questions