Reputation: 1792
=> \d test_table;
Table "public.test_table"
Column | Type | Collation | Nullable | Default
-------------+---------+-----------+----------+---------
timestamp | bigint | | |
source | inet | | |
destination | inet | | |
type | integer | | |
Let's say there's a table with the same above, that contains the network connection information, with a lot more ip pairs, and types than shown below.
rawdb=> select * from test_table order by timestamp;
timestamp | source | destination | type
------------+-------------+-------------+------
1586940900 | 192.168.1.1 | 192.168.1.2 | 1
1586940960 | 192.168.1.1 | 192.168.1.2 | 1
1586941020 | 192.168.1.1 | 192.168.1.2 | 1
1586941080 | 192.168.1.1 | 192.168.1.2 | 1
1586941140 | 192.168.1.1 | 192.168.1.2 | 1
(5 rows)
Out of the table, I need to find the connection pair that are always connecting every x
interval. For example, in the rows above, the connection between the ip pair 192.168.1.1
to 192.168.1.2
is happening every 60s intervals.
The table above would be the answer to the question "how many ip pairs are connecting every 60s over the last 5 min?"
Question
How do I extract those periodic connections for various intervals with the same type, same ip pair? Connections that are 1 min peridoically, 5 min perodically, 30 min perodically.
The baseline is I can provide the x
to search for (e.g. every 60s, every 5 min, every 1 hour, etc), the best case solution is to be able to find x
without being provided.
The result format I need is the same as the table above.
Is it possible to do all of this in SQL? I have done some research on doing gap analysis on the postgres tables, but this is not to find out the gaps, but rather the continuous sequence.
Upvotes: 1
Views: 83
Reputation:
Not sure what exactly the output is, you are after. But the difference between two connections of the same source/destination combination can be calculated using a window function.
Something like:
select distinct source, destination
from (
select *,
lead("timestamp") over w - "timestamp" as diff
from test_table
window w as (partition by source, destination order by "timestamp")
) t
where diff = 60
lead("timestamp") over w - "timestamp"
calculates the difference between the current row's timestamp and the next one for the same source/destination pair. I moved the window definition into the FROM clause to make the expression that calculates the diff more readable.
Upvotes: 1