Abhishek Ginani
Abhishek Ginani

Reputation: 4751

Return multiple rows from conditional group by without union

I am trying to build a query which supports conditional group by in SQLite DB.

Here is what I tried so far:

SELECT 
      case 
       when A>1 AND B>1 THEN 1
       when X>1 AND Y>1 THEN 2
       when C>1 AND D>1 THEN 3
      END AS data_grp,
     SUM(col1) AS col1,
     SUM(col2) AS col2
FROM tbl
GROUP BY data_grp;

This Works pretty fine if only single case is true at a time. if multiple cases are true in a row then it returns the first case instead of all satisfying groups.

I have tried this by the union which works well but very slow. Is there any other way to fetch results fast with this conditional group.

Sample Data & Expected results:

DROP TABLE IF EXISTS  tbl;
CREATE TABLE tbl
(
    A INT,
    B INT,
    C INT,
    D INT,
    X INT,
    Y INT,
    col1 int,
    col2 int

);

INSERT INTO tbl(A,B,C,D,X,Y,col1,col2) values (2,3,0,0,0,0,5,10);
INSERT INTO tbl(A,B,C,D,X,Y,col1,col2) values (0,0,0,0,8,10,3,2);
INSERT INTO tbl(A,B,C,D,X,Y,col1,col2) values (5,4,4,9,0,0,3,2);

    SELECT 
          case 
           when A>1 AND B>1 THEN 1
           when X>1 AND Y>1 THEN 2
           when C>1 AND D>1 THEN 3
          END AS data_grp,
         SUM(col1) AS col1,
         SUM(col2) AS col2
    FROM tbl
    GROUP BY data_grp;

Query Output :

"1" "8" "12"
"2" "3" "2"

Expected Output :

"1" "8" "12"
"2" "3" "2"
"3" "3" "2"

Upvotes: 1

Views: 104

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1270391

I am wary of summarizing data, where the result is on multiple rows and the totals don't match the original data. Of course, sometimes it is necessary, but here are two alternatives.

If you can be slightly flexible in your results, then you can concat the conditions together to get a more complex group:

SELECT ( (CASE WHEN A > 1 AND B > 1 THEN '1' ELSE '' END) ||
         (CASE WHEN X > 1 AND Y > 1 THEN '2' ELSE '' END) ||
         (CASE WHEN C > 1 AND D > 1 THEN '3' ELSE '' END)
       ) AS data_grp,
       SUM(col1) AS col1, SUM(col2) AS col2
FROM tbl
GROUP BY data_grp;

I would actually write this as:

SELECT ( (CASE WHEN A > 1 AND B > 1 THEN '1' ELSE '0' END) ||
         (CASE WHEN X > 1 AND Y > 1 THEN '1' ELSE '0' END) ||
         (CASE WHEN C > 1 AND D > 1 THEN '1' ELSE '0' END)
       ) AS data_grp,

So data_grp gets a string of 0's and 1's indicating the group.

These results are not the same as your results. They are more what I would want, if I were looking at different groups -- I would want to see the overlaps between the groups.

Or, I would put the values in separate columns:

SELECT SUM(CASE WHEN A > 1 AND B > 1 THEN col1 ELSE 0 END) as sum1_1,
       SUM(CASE WHEN X > 1 AND Y > 1 THEN col1 ELSE 0 END) as sum1_2,
       SUM(CASE WHEN C > 1 AND D > 1 THEN col1 ELSE 0 END) as sum1_3,
       SUM(CASE WHEN A > 1 AND B > 1 THEN col2 ELSE 0 END) as sum2_1,
       SUM(CASE WHEN X > 1 AND Y > 1 THEN col2 ELSE 0 END) as sum2_2,
       SUM(CASE WHEN C > 1 AND D > 1 THEN col2 ELSE 0 END) as sum2_3
FROM tbl;

These are the same results, but pivoted differently.

Upvotes: 1

Peeyush
Peeyush

Reputation: 726

You can not use GROUP BY directly because of the overlapping groups. You can use something like following, although this may also be slow.

WITH RECURSIVE
  cnt(x) AS (
     SELECT 1
     UNION ALL
     SELECT x+1 FROM cnt
      LIMIT 3
  )
SELECT x as data_grp, sum(col1), sum(col2)
FROM cnt, 
      (SELECT 
          case when A>1 AND B>1 THEN 1  ELSE 0 END as dg1,
          case when X>1 AND Y>1 THEN 2 ELSE 0 END as dg2,
          case when C>1 AND D>1 THEN 3 ELSE 0 END as dg3,
          col1, col2
        FROM tbl) t WHERE x=dg1 or x=dg2 or x=dg3
GROUP BY x

Upvotes: 1

Related Questions