Reputation: 31
I have the following table that registers the days students went to school. I need to count the days they are PRESENT, but also need to count the total of school days for each month. (When ASISTENCIA is either 0 or 1)
This what I have so far, but it doesn't count the total.
BEGIN
SELECT
u.user_id user_id,
u.user_first_name as names,
u.user_last_name_01 as lastname1,
u.user_last_name_02 as lastname2,
MONTH(a.FECHA_ASISTENCIA) month,
COUNT(*) as absent_days,
p.PHONE as phone,
p.CITY as city,
@EDUCATION_LEVEL_ID
FROM
users u
inner join asistencia a ON u.user_id = a.USER_ID
inner join profile p ON u.rut_SF = p.RUT_SF
WHERE
a.ASISTENCIA = 0 -- NOT PRESENT
AND a.EDUCATION_LEVEL_ID = @EDUCATION_LEVEL_ID
AND YEAR(a.FECHA_ASISTENCIA) = @EDUCATION_LEVEL_YEAR
GROUP BY
u.user_id,
u.user_first_name,
u.user_last_name_01,
u.user_last_name_02,
MONTH(a.FECHA_ASISTENCIA),
p.TELEFONO,
p.CIUDAD_DOM
ORDER BY mes
END
ATTENDANCE
USER_ID | DATE | ATTENDANCE | EDUCATION_LEVEL_ID |
---|---|---|---|
123 | 2021-04-13 | 0 | 1 |
123 | 2021-04-14 | 1 | 1 |
DESIRED OUTPUT
names | lastname1 | lastname2 | month | absent_days | total_class_days | city |
---|---|---|---|---|---|---|
JOHN | SMITH | SMITH | 3 | 10 | 24 | CITY |
JOHN | SMITH | SMITH | 4 | 8 | 24 | CITY |
Upvotes: -1
Views: 168
Reputation: 1601
Without examples of what these tables look like, it is hard to give you a solid answer. However, it appears as though your biggest challenge is that ABSTENTIA does not exist.
This is a common problem for analysis - you need to create rows that do not exist (when the user was absent).
The general approach is to:
You can accomplish this with some SQL that looks like this:
WITH GLOBAL_SPINE AS (
SELECT
ROW_NUMBER() OVER (ORDER BY NULL) as INTERVAL_ID,
DATEADD('DAY', (INTERVAL_ID - 1), '2021-01-01'::timestamp_ntz) as SPINE_START,
DATEADD('DAY', INTERVAL_ID, '2022-01-01'::timestamp_ntz) as SPINE_END
FROM TABLE (GENERATOR(ROWCOUNT => 365))
),
GROUPS AS (
SELECT
USERID,
MIN(DESIRED_INTERVAL) AS LOCAL_START,
MAX(DESIRED_INTERVAL) AS LOCAL_END
FROM RASGO.PUBLIC.RASGO_SDK__OP4__AGGREGATE_TRANSFORM__8AB1FEDF90
GROUP BY
USERID
),
GROUP_SPINE AS (
SELECT
USERID,
SPINE_START AS GROUP_START,
SPINE_END AS GROUP_END
FROM GROUPS G
CROSS JOIN LATERAL (
SELECT
SPINE_START, SPINE_END
FROM GLOBAL_SPINE S
WHERE S.SPINE_START BETWEEN G.LOCAL_START AND G.LOCAL_END
)
)
SELECT
G.USERID AS GROUP_BY_USERID,
GROUP_START,
GROUP_END,
T.*
FROM GROUP_SPINE G
LEFT JOIN {{ your_table }} T
ON DESIRED_INTERVAL >= G.GROUP_START
AND DESIRED_INTERVAL < G.GROUP_END
AND G.USERID = T.USERID;
The above script works on Snowflake, but the syntax might be slightly different depending on your RDBMS. There are also some other tweaks you can make regarding when you insert the blank rows, I chose 'local' which means that we begin inserting rows for each user on their very first day. You could change this to global if you wanted to populate data from every single day between 1/1/2021 and 1/1/2022.
I used Rasgo to generate this SQL
Upvotes: 0