ilovetolearn
ilovetolearn

Reputation: 2060

SQL Extract Events in the Middle of a Time Range

I am able to extract the login and logout time for event logs based on example here: SQL login/logout table

But I am facing some issues to determine if the user access Drive C and Drive D during his login, and the access is in between the login and logout time.

+-----+------------------+-------------------+----------+
| S/N |      Event       |     Timestamp     | Username |
+-----+------------------+-------------------+----------+
|   1 | Login            | 26 Jul 2018 19:35 | a        |
|   2 | Login            | 26 Jul 2018 20:00 | b        |
|   3 | Access Drive C   | 26 Jul 2018 20:30 | b        |
|   4 | Access Drive D   | 26 Jul 2018 20:30 | b        |
|   5 | Logout           | 26 Jul 2018 21:00 | b        |
|   6 | Login            | 26 Jul 2018 22:00 | c        |
|   7 | Login            | 26 Jul 2018 22:30 | c        |
|   8 | Access Service C | 26 Jul 2018 22:30 | c        |
|   9 | Logout           | 26 Jul 2018 23:00 | c        |
+-----+------------------+-------------------+----------+

+-----+-------------------+-------------------+----------+----------+---------+---------+
| S/N |       Login       |      Logout       | Username | Duration | Drive C | Drive D |
+-----+-------------------+-------------------+----------+----------+---------+---------+
|   1 | 26 Jul 2018 19:35 | NULL              | a        | NULL     | NULL    | NULL    |
|   2 | 26 Jul 2018 20:00 | 26 Jul 2018 21:00 | b        | 10min    | Y       | Y       |
|   3 | 26 Jul 2018 20:00 | 26 Jul 2018 20:30 | c        | 30min    | Y       | N       |
+-----+-------------------+-------------------+----------+----------+---------+---------+

Upvotes: 1

Views: 89

Answers (1)

Chris Saxon
Chris Saxon

Reputation: 9825

Presumably a user could login/out many times. And you're only looking for accesses within a given access period?

If you're using Oracle Database 12c you can use pattern matching to do this. This allows you to use regular expressions to search over your data.

login+ ( drive_c | drive_d )* logout{0,1}

Means:

  • Find one or more login events
  • Followed by either a drive C or D access, zero or more times
  • With an optional logout at the end

I don't understand where you're getting the durations of 10mins for user B and 30 for user C from. So I've shown the total time from the first login to the logout (if set). Which is this:

last ( logout.ts ) - first ( login.ts )

If you need something else, change the variable name before ts.

To check if they've accessed C or D, you need to see if either of these pattern variables were matched. Returning the count of this variable is one way to do this.

CREATE TABLE t (
  SN int, Event varchar(16), ts date, Username varchar(1)
);
alter session set nls_date_format = 'yyyy-mm-dd hh24:mi:ss';
INSERT INTO t VALUES  (1, 'Login', '2018-07-26 19:35:00', 'a');
insert into t values  (2, 'Login', '2018-07-26 20:00:00', 'b');
insert into t values  (3, 'Access Drive C', '2018-07-26 20:30:00', 'b');
insert into t values  (4, 'Access Drive D', '2018-07-26 20:30:00', 'b');
insert into t values  (5, 'Logout', '2018-07-26 21:00:00', 'b');
insert into t values  (6, 'Login', '2018-07-26 22:00:00', 'c');
insert into t values  (7, 'Login', '2018-07-26 22:30:00', 'c');
insert into t values  (8, 'Access Service C', '2018-07-26 22:30:00', 'c');
insert into t values  (9, 'Logout', '2018-07-26 23:00:00', 'c');

select * from t
match_recognize (
  partition by username
  order by ts
  measures
    first ( login.ts ) as login,
    last ( logout.ts ) as logout,
    round ( ( last ( logout.ts ) - first ( login.ts ) ) * 1440 ) as duration_minutes,
    case when count ( drive_c.ts ) > 0 then 'Y' else 'N' end as access_c,
    case when count ( drive_d.ts ) > 0 then 'Y' else 'N' end as access_d
  pattern ( login+ ( drive_c | drive_d )*  logout{0,1} )
  define 
    login as event = 'Login',
    drive_c as event like 'Access%C',
    drive_d as event like 'Access%D',
    logout as event = 'Logout'
);

U LOGIN               LOGOUT              DURATION_MINUTES A A
- ------------------- ------------------- ---------------- - -
a 2018-07-26 19:35:00                                      N N
b 2018-07-26 20:00:00 2018-07-26 21:00:00               60 Y Y
c 2018-07-26 22:00:00 2018-07-26 23:00:00               60 Y N

If you need something to work on older versions, you could do something like this:

  • Use first_value to find the time of the next logout after each login. Ensure you ignore nulls!
  • Do the same for accesses of C and D
  • Group by the username and next logout value to find the first login for each
  • Check if the timestamp values you've found for C & D using first_value are between the login/out times and report accordingly

Which looks like:

with next_logouts as (
  select t.*,
         first_value (
           case when event = 'Logout' then ts end
         ) ignore nulls over (
           partition by username
           order  by ts rows between current row and unbounded following 
         ) next_logout,
         first_value (
           case when event like 'Access%C' then ts end
         ) ignore nulls over (
           partition by username
           order  by ts rows between current row and unbounded following 
         ) access_c,
         first_value (
           case when event like 'Access%D' then ts end
         ) ignore nulls over (
           partition by username
           order  by ts rows between current row and unbounded following 
         ) access_d
  from   t
), periods as (
  select username, min ( ts ) login, next_logout logout,
         case
           when min ( access_c ) > min ( ts ) and min ( access_c ) < next_logout then 'Y'
           else 'N'
         end access_c,
         case
           when min ( access_d ) > min ( ts ) and min ( access_d ) < next_logout then 'Y'
           else 'N'
         end access_d
  from   next_logouts n
  group  by username, next_logout
)
  select p.*
  from   periods p
  order  by username, login;

USERNAME   LOGIN                 LOGOUT                ACCESS_C   ACCESS_D   
a          2018-07-26 19:35:00   <null>                N          N          
b          2018-07-26 20:00:00   2018-07-26 21:00:00   Y          Y          
c          2018-07-26 22:00:00   2018-07-26 23:00:00   Y          N   

Upvotes: 1

Related Questions