KizyMatzen
KizyMatzen

Reputation: 31

I need to count the number of 'active' employees by month, but some employees have more than one job. How can I count the active employees only once?

I need to calculate the number of employees by month. As you can see in the table bellow my data have five columns, one that have employee job keys, one that have employee id, one that have the status( start, end), one that shows the date, and the fifth column is the result of the calculation rolling headcount.

enter image description here

rolling headcount formula:

IF [ACTION]= "Start"

THEN 1

ELSEIF [ACTION] = "End" AND NOT ISNULL([TIME])

THEN - 1

END

The issue I'm having is some employees have more than one job, so they have more than one

start date and end date, but I need to count the employee as active only once.

The data I use is from a Oracle database, however I created a custom SQL query because the way the data was organized I wasn't able to count the number of employees.

This is the query:

Select EMPLOYEE_ID, SUPERVIOR_ID, POSITION_NUMBER, REPORTS_TO, JOB_OPERATING_UNIT, JOB_ACCOUNT, JOB_CLASS, EMPLOYEE_GRADE, EMPLOYEE_CLASS, JOB_STATUS, SYNC_DATE, EMPLOYEE_RECORD_NUMBER , 'Start' as Action , JOB_OBSERVED_START as Time From EMPLOYEE_JOB_FACT e

Union ALL Select EMPLOYEE_ID, SUPERVIOR_ID, POSITION_NUMBER, REPORTS_TO, JOB_OPERATING_UNIT, JOB_ACCOUNT, JOB_CLASS, EMPLOYEE_GRADE, EMPLOYEE_CLASS, JOB_STATUS, SYNC_DATE, EMPLOYEE_RECORD_NUMBER , 'End' as Action , job_observed_end as Time From EMPLOYEE_JOB_FACT

The original data has one column for date start and one for date end.

Any ideas to solve this problem?

Thank you!

Upvotes: 0

Views: 1155

Answers (1)

user18098820
user18098820

Reputation:

Here are 2 different ways of doing this.

  • In the first we add a virtual column to the table, using a case statement with the same logic as you have given in the question. We then use this column in a query with GROUP BY to retain one line per employee, checking the date in the WHERE to know who has a job at the given date.
  • In the second query we use A CTE. This solution needs completing with a date filter if you can use the first method.
    You could also consider recording start and end as 1 and -1 in the action column with type int.
Alter table Employee_Job_Fact
add column StartEnd int generated always 
as (case
when Action = 'start' then 1
when Action = 'end' then -1
else 0 end) stored;
SELECT
  Employee_id,
  SUM(StartEnd) startEnd,
  MAX(`TIME`) "last Time"
FROM
  Employee_Job_Fact
WHERE
  `TIME` <= '2022-03-17 00:00:00'
GROUP BY
  Employee_id
HAVING
  SUM(StartEnd) > 0
ORDER BY
  Employee_ID;
Employee_id | startEnd | last Time          
----------: | -------: | :------------------
   40955848 |        2 | 2022-03-05 00:00:00
  104957320 |        1 | 2022-03-12 00:00:00
  678570955 |        1 | 2022-03-05 00:00:00
  773731135 |        1 | 2022-03-02 00:00:00
  820688756 |        1 | 2022-03-17 00:00:00
  951301729 |        1 | 2022-03-17 00:00:00
with status as (
select
  employee_id id,
  sum(case when action = 'start' then 1
  when action = 'end' then -1 end) val
from
  Employee_Job_Fact
group by
  employee_id )
select
  id
from status
where val > 0
order by id;
|        id |
| --------: |
|  40955848 |
|  44969527 |
| 104957320 |
| 121189727 |
| 678570955 |
| 773731135 |
| 820688756 |
| 951301729 |

db<>fiddle here

Upvotes: 0

Related Questions