Reputation: 376
I need to find the lengths of user sessions, given the timestamps of individual visits.
A new session starts whenever the delay between adjacent timestamps is longer than limit.
For example, for this set of timestamps (treat them as seconds since the epoch):
[
101,
102,
105,
116,
128,
129,
140,
145,
146,
152
]
...and for limit=10, I need the following output:
[
3,
1,
2,
4
]
Upvotes: 1
Views: 552
Reputation: 36326
Here's another approach, using indices to calculate the breakpoint positions.
Producing the lengths of the segments:
10 as $limit
| [
    [0, indices(while(. != []; .[1:]) | select(.[0] + $limit < .[1]))[] + 1, length]
    | .[range(length-1):] | .[1] - .[0]
  ]
[
3,
1,
2,
4
]
Producing the segments themselves:
10 as $limit
| [
    (
      [indices(while(. != []; .[1:]) | select(.[0] + $limit < .[1]))[] + 1]
      | [null, .[0]], .[range(length):]
    )
    as [$a,$b] | .[$a:$b]
  ]
[
[
101,
102,
105
],
[
116
],
[
128,
129
],
[
140,
145,
146,
152
]
]
Upvotes: 1
Reputation: 134521
Assuming the values are in ascending order, loop through them, accumulating groups based on your condition. reduce works well in this case.
10 as $limit # remove this so you can feed in your value as an argument
| reduce .[] as $i (
    {prev:.[0], group:[], result:[]};
    if ($i - .prev > $limit)
    then {prev:$i, group:[$i], result:(.result + [.group])}
    else {prev:$i, group:(.group + [$i]), result}
    end
  )
| [(.result[], .group) | length]
If the difference from the previous value exceeds the limit, move the current group of values to the result and start a new group with the current value. Otherwise, the current value belongs to the current group, so add it. At the end, count the sizes of the groups (including the last, still-open one) to get your result.
Here's a slightly modified version that keeps only a running count for each group instead of the values themselves.
10 as $limit
| reduce .[] as $i (
    {prev:.[0], count:0, result:[]};
    if ($i - .prev > $limit)
    then {prev:$i, count:1, result:(.result + [.count])}
    else {prev:$i, count:(.count + 1), result}
    end
  )
| [.result[], .count]
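To feed the limit in as an argument instead of hard-coding it, one option is jq's --argjson option, which binds a JSON value to a variable. The following is a minimal sketch of the counting version wrapped in a shell function; the function name session_lengths is made up for illustration, and it assumes jq 1.5 or later and input sorted in ascending order.

```shell
#!/bin/sh
# session_lengths LIMIT
# Hypothetical helper: reads a JSON array of ascending timestamps on stdin
# and prints the session lengths as a compact JSON array.
session_lengths() {
  # --argjson binds the first shell argument to $limit as a number.
  jq -c --argjson limit "$1" '
    reduce .[] as $i (
      {prev: .[0], count: 0, result: []};
      if ($i - .prev > $limit)
      then {prev: $i, count: 1, result: (.result + [.count])}
      else {prev: $i, count: (.count + 1), result}
      end
    )
    | [.result[], .count]
  '
}

echo '[101,102,105,116,128,129,140,145,146,152]' | session_lengths 10
# prints [3,1,2,4]
```

Using --argjson rather than --arg matters here: --arg would bind the limit as a string, and the numeric comparison against it would not behave as intended.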
Upvotes: 2