ct14
ct14

Reputation: 77

SQL Server : window function to avoid duplicate rows

I need your help please with my problem below. I have data in a SQL Server database that shows whether a user has logged in a security system or not.

The structure of the data is as following:

id SystemId    Time         PC   User   Logged
-----------------------------------------------
1    2      20171025 16:01   1   John    In
2    2      20171025 17:41   4   Mike    In
3    2      20171025 17:58   1   John    Out
4    2      20171025 17:58   4   Mike    Out
5    2      20171025 19:21   4   Peter   In
6    2      20171025 21:45   4   Peter   In
7    2      20171025 22:59   4   Peter   Out

I need to develop a script that runs every 30 minutes and must show when there is no user logged in the security monitoring systems.

The below script does the job with the exception where we have system/network issues and users are forcibly logged out and in again without the forced log off being recorded in the database. i.e. User logged out was not recorded in row 6 but instead only the logged in was.

This means that when I run my script yesterday at 23:30 to check if Peter (in the example) was logged in and doing his job, the result came back showing that Peter was logged in, when in fact he wasn't. (See row 6)

I believe my solution is a window function (partition by) that skips any consecutive rows that have identical values in column (PC, User, Logged). This way I will avoid the double counting. But I don't know whether I am at the right direction or not.

SELECT 
    SystemId,
    SUM(CASE
           WHEN Logged = 'In' THEN 1
           WHEN Logged = 'Out' THEN -1
           ELSE 0
        END) AS Connected
FROM 
    log.SecurityLogs
GROUP BY 
    SystemId
HAVING 
    SUM(CASE
           WHEN Logged = 'In' THEN 1
           WHEN Logged = 'Out' THEN -1
           ELSE 0
        END) = 0

Upvotes: 1

Views: 773

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270463

The state of the user changes between In and Out. From what you descrdibe, there are no nested states (i.e., you wouldn't expect In-->In-->Out-->Out).

So, just compare the maximum times:

select sl.SystemId,
       max(case when logged = 'In' then time end) as in_time,
       max(case when logged = 'Out' then time end) as out_time
from log.SecurityLogs sl
group by sl.SystemId;

If you want those whose most recent event was "in", then:

select sl.SystemId,
       max(case when logged = 'In' then time end) as in_time,
       max(case when logged = 'Out' then time end) as out_time
from log.SecurityLogs sl
group by sl.SystemId
having max(case when logged = 'In' then time end) > max(case when logged = 'Out' then time end);

Upvotes: 1

Related Questions