Pravellika
Pravellika

Reputation: 182

MySQL Categorise data based on two columns

   **Base table:**

   col 1     col2
    25         184
    35         200
    21         75
    10         206
    55         114 
    75         190

    **Expected output:**

             0-50  50-100  100-150  >150
    0-10      0       0       0       1
    10-20     0       1       0       0
    20-30     0       0       0       1
    >30       0       0       1       2

I need to write a MySQL query to print the count of values from two columns(col1, col2) into the rite box in a matrix format (as given above).

Query I had used:

select 
case 
when col1 between 0 and 10 then '0-10'
when col1 between 10 and 20 then '10-20'
when col1 between 20 and 30 then '20-30'
else '>30'
end as freq,
count(
case 
when col2 between 0 and 50 and col1 between 0 and 10 then '0-10'
when col2 between 0 and 50 and col1 between 10 and 20 then '10-20'
when col2 between 0 and 50 and col1 between 20 and 30 then '20-30'
when col2 between 0 and 50 and col1 >30 then '>30'
end) '0-50',
count(
case 
when col2 between 50 and 100 and col1 between 0 and 10 then '0-10'
when col2 between 50 and 100 and col1 between 10 and 20 then '10-20'
when col2 between 50 and 100 and col1 between 20 and 30 then '20-30'
when col2 between 50 and 100 and col1 >30 then '>30'
end) '50-100',
count(
case 
when col2 between 100 and 150 and col1 between 0 and 10 then '0-10'
when col2 between 100 and 150 and col1 between 10 and 20 then '10-20'
when col2 between 100 and 150 and col1 between 20 and 30 then '20-30'
when col2 between 100 and 150 and col1 >30 then '>30'
end) '100-150',
count(
case
when col2>150 and col1 between 0 and 10 then '0-10'
when col2>150 and col1 between 10 and 20 then '10-20'
when col2>150 and col1 between 20 and 30 then '20-30'
when col2>150 and col1 >30 then '>30'
end) '>150',
count(col2)
from table_1 a 
group by freq;

I am not getting the correct count for the records. The correct count appears only for the first row in the output. Not sure why it goes wrong with others. Any help is greatly appreciated.

Thank you.

Upvotes: 0

Views: 87

Answers (3)

Pravellika
Pravellika

Reputation: 182

It was a blunder mistake I made. While including the limits i had used the upper limit in first case statement as the lower limit in the next statement. Thank you so much for all your inputs. I could do the same with two different queries given below:

**Query1:**

select 
case 
when f_value between 0 and 25 then '0-25'
when f_value between 26 and 50 then '25-50'
when f_value between 51 and 75 then '50-75'
when f_value between 76 and 100 then '>75'
else 'others'
end as freq,
count(
case 
when r_value between 0 and 500 and f_value between 0 and 25 then '0-25'
when r_value between 0 and 500 and f_value between 26 and 50 then '25-50'
when r_value between 0 and 500 and f_value between 51 and 75 then '50-75'
when r_value between 0 and 500 and f_value>=76 then '>75'
end) '0-500',
count(
case 
when r_value between 501 and 1000 and f_value between 0 and 25 then '0-25'
when r_value between 501 and 1000 and f_value between 26 and 50 then '25-50'
when r_value between 501 and 1000 and f_value between 51 and 75 then '50-75'
when r_value between 501 and 1000 and f_value>=76 then '>75'
end) '500-1000',
count(
case 
when r_value between 1001 and 1500 and f_value between 0 and 25 then '0-25' 
when r_value between 1001 and 1500 and f_value between 26 and 50 then '25-50' 
when r_value between 1001 and 1500 and f_value between  51 and 75 then '50-75'
when r_value between 1001 and 1500 and f_value>=76 then '>75'
end) '1000-1500',
count(
case
when r_value> 1501  and f_value between 0 and 25 then '0-25' 
when r_value> 1501 and f_value between 26 and 50 then '25-50'  
when r_value> 1501 and f_value between 51 and 75 then '50-75' 
when r_value> 1501 and f_value>=76 then '>=75'
end) '>1501'
from em_mem_prfm_cat_hist a 
where rfm_calc_batch=4
group by freq;


**Query 2:**
SELECT 
    b.x,
    max(if(y = '0-500', sss, null)) '0-500',
    max(if(y = '501-1000', sss, null)) '501-1000',
    max(if(y = '1001-1500', sss, null)) '1001-1500',
    max(if(y = '1501-2000', sss, null)) '1501-2000'
FROM
    ((SELECT 
        x, y, count(1) as sss
    FROM
        (SELECT 
        CASE
                WHEN f_value BETWEEN 0 AND 25 THEN '0-25'
                WHEN f_value BETWEEN 26 AND 50 THEN '26-50'
                WHEN f_value BETWEEN 51 AND 75 THEN '51-75'
                WHEN f_value >= 76 THEN '76-100'
            END AS x,
            CASE
                WHEN r_value BETWEEN 0 AND 500 THEN '0-500'
                WHEN r_value BETWEEN 501 AND 1000 THEN '501-1000'
                WHEN r_value BETWEEN 1001 AND 1500 THEN '1001-1500'
                WHEN r_value >= 1501 THEN '1501-2000'
            END AS y
    FROM
        em_mem_prfm_cat_hist
    where
        rfm_calc_batch = 4) a
    GROUP BY x , y)) b
group by x
order by x

Upvotes: 1

poncha
poncha

Reputation: 7866

Try this:

SELECT
  IF(col1>30, '>30', CONCAT(10 * FLOOR(col1/10),'-',10 * CEIL(col1/10)) AS freq,
  SUM(IF(col2<=50,1,0)) AS '0-50',
  SUM(IF(col2>50 AND col2<=100,1,0)) AS '50-100',
  SUM(IF(col2>100 AND col2<=150,1,0)) AS '100-150',
  SUM(IF(col2>150,1,0)) AS '>150'
 FROM `table`
  GROUP BY IF(col1>30,3,FLOOR(col1/10));

Upvotes: 0

Stephan
Stephan

Reputation: 8090

You need something like:

select 
case 
when col1 between 0 and 10 then '0-10'
when col1 between 10 and 20 then '10-20'
when col1 between 20 and 30 then '20-30'
else '>30'
end as freq,
SUM(
case 
when col2 between 0 and 50 and col1 between 0 and 10 then 1
when col2 between 0 and 50 and col1 between 10 and 20 then 1
when col2 between 0 and 50 and col1 between 20 and 30 then 1
when col2 between 0 and 50 and col1 >30 then 1
ELSE 0
end) '0-50',
...

When pivoting in most of the cases you need SUM and in you case you also need an ELSE on the case so that you will sum 1 or 0 .

Upvotes: 0

Related Questions