sameh.q
sameh.q

Reputation: 1709

Get rows of group if column data is different

I am having a problem in the final step of my query, which is to select all rows of a group of rows that have a mix values of a column called NATIONALITY.

I have tried giving the NATIONALITY column a number representation as 1 for X and 0 for other nationalities and then use MIN MAX on groups as this example, but still, I am not able to find a way to get all the rows in the group that have data of that nature

Check the below example for illustration:

Sorted data and handled data will look like this:

GROUP_ID    VALUE_1    VALUE_2 ..... VALUE_N   NATIONALITY
------------------------------------------------------------
  1           282       8383           383         X
  1           737        722           712         Y
  1           273       7123           123         Y

  2           347        234           235         X
  2          8723        274           474         Y

  3           747         23           623         X

  4           133       1823           612         Y
  4           747        533           100         Y

  5           343       2383           213         X
  5           333        123           711         X
  5           312        344           766         X
  5           456       6003           190         Y

Target result:

GROUP_ID    VALUE_1    VALUE_2 ..... VALUE_N   NATIONALITY
------------------------------------------------------------
  1           282       8383           383         X
  1           737        722           712         Y
  1           273       7123           123         Y

  2           347        234           235         X
  2          8723        274           474         Y

  5           343       2383           213         X
  5           333        123           711         X
  5           312        344           766         X
  5           456       6003           190         Y

I am trying to select all rows within a group that has X(s) and Y(s) at the same time

Can anyone give me a hint of how to tackle this? I really don't know from where to start, Analytical functions is not useful, LEAD function didn't serve me, because number of rows in each group is not fixed, so leading row could have the value X, but the one after it in the group is Y, so in this case this group will be ignored, which will be wrong in my case.

Upvotes: 1

Views: 67

Answers (3)

user5683823
user5683823

Reputation:

In Oracle 12c, the MATCH_RECOGNIZE clause can solve this very quickly and cleanly.

with
     test_data ( group_id, value_1, value_2, value_n, nationality ) as (
       select 1,  282, 8383, 383, 'X' from dual union all
       select 1,  737,  722, 712, 'Y' from dual union all
       select 1,  273, 7123, 123, 'Y' from dual union all
       select 2,  347,  234, 235, 'X' from dual union all
       select 2, 8723,  274, 474, 'Y' from dual union all
       select 3,  747,   23, 623, 'X' from dual union all
       select 4,  133, 1823, 612, 'Y' from dual union all
       select 4,  747,  533, 100, 'Y' from dual union all
       select 5,  343, 2383, 213, 'X' from dual union all
       select 5,  333,  123, 711, 'X' from dual union all
       select 5,  312,  344, 766, 'X' from dual union all
       select 5,  456, 6003, 190, 'Y' from dual
     )
-- End of test data (not part of the solution). SQL query begins BELOW THIS LINE.
select *
from   test_data
match_recognize (
  partition by group_id
  all rows per match
  pattern ( a*? ( x a*? y | y a*? x ) a* )
  define  x as nationality = 'X',
          y as nationality = 'Y'
)
;

Output:

GROUP_ID    VALUE_1    VALUE_2    VALUE_N  NATIONALITY
--------  ---------  ---------  ---------  -----------
       1        282       8383        383  X
       1        737        722        712  Y
       1        273       7123        123  Y
       2        347        234        235  X
       2       8723        274        474  Y
       5        343       2383        213  X
       5        333        123        711  X
       5        312        344        766  X
       5        456       6003        190  Y

9 rows selected.

Upvotes: 0

MT0
MT0

Reputation: 167972

I am trying to select all rows within a group that has X(s) and Y(s) at the same time

Use the COUNT( ... ) OVER ( PARTITION BY ... ) analytic function with a CASE statement to restrict the count to just Xs or Ys within the partition:

SELECT *
FROM   (
  SELECT t.*,
         COUNT( CASE nationality WHEN 'X' THEN 1 END )
            OVER ( PARTITION BY group_id ) AS x,
         COUNT( CASE nationality WHEN 'y' THEN 1 END )
            OVER ( PARTITION BY group_id ) AS y
  FROM   your_table t
)
WHERE  x > 0 AND y > 0

Upvotes: 3

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521194

One approach here is to count the number of distinct nationalities which occur for each group, and then retain only those groups having more than one distinct nationality.

SELECT
    t.GROUP_ID,
    t.VALUE_1,
    t.VALUE_2,
    ...,
    t.VALUE_N,
    t.NATIONALITY
FROM
(
    SELECT *,
           COUNT(DISTINCT NATIONALITY) OVER (PARTITION BY GROUP_ID) distinct_count
    FROM yourTable
) t
WHERE t.distinct_count > 1

Of course, this query counts any two distinct nationalities as being valid for your result set. We could modify this query if you really need to explicitly check for X and Y.

Upvotes: 3

Related Questions