User277883
User277883

Reputation: 57

Subquery with nested if else condition

I have the following 3 calculated columns which define the training status for each employee:

  1. training status in 2023 calculated as if [test attribute] = 'e1' and year([assigned date]) = 2023 then 'overdue' elseif [test attribute] = 'e2' and year([assigned date]) = 2023 then 'late completion' else '' end;

  2. training status in 2022 calculated as if [test attribute] = 'e1' and year([assigned date]) = 2022 then 'overdue' ;

  3. training status in 2021 calculated as if [test attribute] = 'e1' and year([assigned date]) = 2021 then 'overdue';

I want to get the exceptions count where if training status in 2023 = e1 or e2 then that population and if training status for 2021 or training status for 2022 = e1 then from those ids.

In the sample data below, the exceptions count should be 3 (Henry, Sam and Tom). We don't count Mike because his training status in 2023 is neither e1 nor e2.

CREATE TABLE users (
  id INT,
  name VARCHAR(255) NOT NULL,
  test attribute VARCHAR(255),
  assigned_date DATE,
  type VARCHAR(255) NOT NULL
);

INSERT INTO users
(id,    name,    test attribute, assigned_date, type ) VALUES
(102,   'Henry', 'E1',           '2023-05-01',  'comp'),
(102,   'Henry', 'E1',           '2022-01-30',  'comp'),
(102,   'Henry', 'E2',           '2023-05-01',  'comp'),
(101,   'Mike',  'X',            '2023-10-21',  'other'),
(101,   'Mike',  'E1',           '2022-06-19',  'other'),
(100,   'Sam',   'E2',           '2023-11-12',  'comp'),
(100,   'Sam',   'E1',           '2021-09-18',  'other'),
(100,   'Sam',   'E1',           '2023-11-12',  'other'),
(103,   'Tom',   'X',            '2023-11-12',  'other'),
(103,   'Tom',   'E1',           '2021-09-18',  'comp'),
(103,   'Tom',   'E2',           '2023-11-12',  'other');

SELECT * FROM users;

I tried the following calculations but it is not giving me accurate count. I have a calculated field called exceptions count in Tableau, where I am trying to get the count and this is what I have tried:

If ([test attribute] IN ('e1', 'e2') and year([assigned date]) = 2023) and if [test attribute] = 'e1' and year([assigned date]) IN (2022, 2021) then 1  else 0 end;

What is wrong?

Upvotes: 0

Views: 140

Answers (2)

Dale K
Dale K

Reputation: 27333

You can use a sub-query (CTE in my example) to calculate the error conditions then filter by them. And then use count(distinct id) to get the count.

with cte as (
    select id, [Name]
      , case when exists (select 1 from users c2 where c2.id = c1.id and year(assigned_date) = 2023 and training_category in ('E1','E2')) then 1 else 0 end Has2023Exception
      , case when exists (select 1 from users c2 where c2.id = c1.id and year(assigned_date) in (2021, 2022) and training_category in ('E1')) then 1 else 0 end Has2022or2021Exception
    from users c1
)
select count(distinct id)
from cte
where Has2023Exception = 1 and Has2022or2021Exception = 1;

Returns

Count
3

Or you could use the having clause to filter the rows, then count them e.g.

with cte as (
    select 1 Num
    from users
    group by id, [Name]
    having max(case when year(assigned_date) = 2023 and training_category in ('E1','E2') then 1 else 0 end) = 1
        and max(case when year(assigned_date) in (2021, 2022) and training_category in ('E1') then 1 else 0 end) = 1
)
select count(*)
from cte;

Which might not be much smaller or simpler, but performs better due to only a single access of the table.

Upvotes: 1

Paul Maxwell
Paul Maxwell

Reputation: 35603

SQL uses "case expressions" instead of if/else syntax (although you can use if/else in T-SQL) but the syntax of a case expression is something you should probably get used to if using SQL. The syntax is very similar to your stated requirement, e.g:

select
      case
        when [training category] = 'E1' and assigned_year = 2023 then 'overdue' 
        when [training category] = 'E2' and assigned_year = 2023 then 'late completion' 
        else '' 
      end as calc1
    , case
        when [training category] = 'E1' and assigned_year < 2023 then 'overdue' 
      end as calc2
   , *
from myusers
outer apply (select year(try_cast([assigned date] as date)) as assigned_year) oa1

see this working here

calc1 calc2 id name training category assigned date type assigned_year
overdue 102 Henry E1 2023-05-01 comp 2023
overdue 102 Henry E1 2022-01-30 comp 2022
late completion 102 Henry E2 2023-05-01 comp 2023
101 Mike X 2023-10-21 other 2023
overdue 101 Mike E1 2022-06-19 other 2022
late completion 100 Sam E2 2023-11-12 comp 2023
overdue 100 Sam E1 2021-09-18 other 2021
overdue 100 Sam E1 2023-11-12 other 2023
103 Tom X 2023-11-12 other 2023
overdue 104 Tom E1 2021-09-18 comp 2021
late completion 103 Tom E2 2023-11-12 other 2023

In your sample data you show the "assigned date" as a varchar, so I have included a way to cast that to a date. If the date is already a date column then the try_cast([assigned ddate] as date) isn't required.

Note: A case expression starts "top to bottom" so if the first "when" is true it ignores any following "when" or "else" conditions, if the first is false then it evaluates the next "when", and so on until finished (so it doesn't need "elseif" but can achieve the equivalent).

Small point: Whilst the correct term is "case expression" you will regrettably find many references to "case statement".

==================================================================================

I think you may be able to combine "calc1" and "calc2" by using "less than or equal" for the overdue conditions:

select
      case
        when [training category] = 'E1' and assigned_year <= 2023 then 'overdue' 
        when [training category] = 'E2' and assigned_year = 2023 then 'late completion' 
        else null 
      end as calc1
   , *
from myusers
outer apply (select year(try_cast([assigned date] as date)) as assigned_year) oa1

Upvotes: 0

Related Questions