Reputation: 41
I have a table called project_errors
which has the columns project_id
, total_errors
and date
. So everyday a batch job runs which inserts a row with the number of errors for a particular project on a given day.
Now I want to know how many errors were reduced and how many errors was introduced for a given month for a project. I thought of a solution of creating trigger after insert which will record if the errors have increased or decreased and put it to another table. But this will not work for previously inserted data. Is there any other way I can do this? I researched about the lag function but not sure how to do this for my problem. The Table structure is given below.
Project_Id Total_Errors Row_Insert_Date
1 56 08-MAR-14
2 14 08-MAR-14
3 89 08-MAR-14
1 54 07-MAR-14
2 7 07-MAR-14
3 80 07-MAR-14
And so on...
Upvotes: 0
Views: 72
Reputation: 213
In addition to Justin's answer, you may want to consider changing your table structure. Instead of recording only totals, you can record the actual errors, then count them.
So suppose you had a table structure like:
CREATE TABLE PROJECT_ERRORS(
project_id INTEGER
error_id INTEGER
stamp DATETIME
)
Each record would be a separate error (or separate error type), and this would give you more granularity and allow more complex queries.
You could still get your totals by day with:
SELECT project_id, COUNT(error_id), TO_CHAR(stamp, 'DD-MON-YY') AS EACH_DAY
FROM PROJECT_ERRORS
GROUP BY project_id, TO_CHAR(stamp, 'DD-MON-YY')
And if we combine this with JUSTIN'S AWESOME ANSWER:
SELECT
project_id AS PROJECT_ID,
COUNT(error_id) AS TOTAL_ERRORS,
LAG(COUNT(error_id))
OVER(PARTITION BY project_id
ORDER BY TO_CHAR(stamp, 'DD-MON-YY')) AS prior_num_errors,
COUNT(error_id) - LAG(COUNT(error_id))
OVER(PARTITION BY project_id
ORDER BY TO_CHAR(stamp, 'DD-MON-YY') ) AS diff
FROM project_errors
GROUP BY
project_id,
TO_CHAR(stamp, 'DD-MON-YY')
But now you can also get fancy and look for specific types of errors or look during certain times of the day.
Upvotes: 1
Reputation: 11375
Credits to Justin. Just thought of pasting the results.
WITH TEMP
AS (SELECT 1 AS PROJECT_ID,
56 AS TOTAL_ERRORS,
'08-MAR-14' AS ROW_INSERT_DATE
FROM DUAL
UNION ALL
SELECT 2 AS PROJECT_ID,
14 AS TOTAL_ERRORS,
'08-MAR-14' AS ROW_INSERT_DATE
FROM DUAL
UNION ALL
SELECT 3 AS PROJECT_ID,
89 AS TOTAL_ERRORS,
'08-MAR-14' AS ROW_INSERT_DATE
FROM DUAL
UNION ALL
SELECT 1 AS PROJECT_ID,
54 AS TOTAL_ERRORS,
'07-MAR-14' AS ROW_INSERT_DATE
FROM DUAL
UNION ALL
SELECT 2 AS PROJECT_ID,
7 AS TOTAL_ERRORS,
'07-MAR-14' AS ROW_INSERT_DATE
FROM DUAL
UNION ALL
SELECT 3 AS PROJECT_ID,
80 AS TOTAL_ERRORS,
'07-MAR-14' AS ROW_INSERT_DATE
FROM DUAL)
SELECT PROJECT_ID,
TOTAL_ERRORS,
NVL (
TOTAL_ERRORS
- LAG ( TOTAL_ERRORS )
OVER (PARTITION BY PROJECT_ID ORDER BY ROW_INSERT_DATE),
0 )
CHANGES,
ROW_INSERT_DATE
FROM TEMP;
PROJECT_ID TOTAL_ERRORS CHANGES ROW_INSERT_DATE
---------- ------------ ---------- ---------------
1 54 0 07-MAR-14
1 56 2 08-MAR-14
2 7 0 07-MAR-14
2 14 7 08-MAR-14
3 80 0 07-MAR-14
3 89 9 08-MAR-14
6 rows selected.
Upvotes: 0
Reputation: 231851
It's always helpful if you can show the output that you want. My guess is that you want to subtract 54 from 56 and show that 2 errors were added on project 1, subtract 7 from 14 to show that 7 errors were added on project 2, and subtract 80 from 89 to show that 9 errors were added on project 3. Assuming that is the case
SELECT project_id,
total_errors,
lag( total_errors ) over( partition by project_id
order by row_insert_date ) prior_num_errors,
total_errors -
lag( total_errors ) over( partition by project_id
order by row_insert_date ) difference
FROM table_name
You may need to throw an NVL
around the LAG
if you want the prior_num_errors
to be 0 on the first day.
Upvotes: 3