wass rubleff

Reputation: 376

jq: How to split an array by values and find the length of each piece?

I need to find the lengths of user sessions, given the timestamps of individual visits.

A new session starts whenever the delay between adjacent timestamps is longer than a given limit.

For example, for this set of timestamps (think of them as seconds since the epoch):

[
  101,
  102,
  105,

  116,

  128,
  129,

  140,
  145,
  146,
  152
]

...and a limit of 10, I need the following output:

[
  3,
  1,
  2,
  4
]

Upvotes: 1

Views: 552

Answers (2)

pmf

Reputation: 36326

Here's another approach using indices to calculate the breakpoint positions:

Producing the lengths of the segments:

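10 as $limit
| [
    # Cut positions: 0, every index whose gap to the previous value exceeds $limit, and the array length
    [0, indices(while(. != []; .[1:]) | select(.[0] + $limit < .[1]))[] + 1, length]
    # The differences between adjacent cut positions are the segment lengths
    | .[range(length-1):] | .[1] - .[0]
  ]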

Output:

[
  3,
  1,
  2,
  4
]
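
A more explicit sketch of the same breakpoint idea, assuming the input is sorted in ascending order. It compares each value with its predecessor directly instead of going through indices, and $cuts is a name introduced only for this illustration:

```jq
10 as $limit
# Cut positions: index 0, every index $i whose gap to the previous
# timestamp exceeds $limit, and finally the array length.
| [0, (range(1; length) as $i | select(.[$i] - .[$i - 1] > $limit) | $i), length] as $cuts
# Each session length is the distance between two neighbouring cuts.
| [range($cuts | length - 1) as $k | $cuts[$k + 1] - $cuts[$k]]
```

For the sample timestamps, $cuts works out to [0, 3, 4, 6, 10], so the differences are 3, 1, 2 and 4.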


Producing the segments themselves:

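10 as $limit
| [
    (
      # Indices at which a new segment starts (the gap to the previous value exceeds $limit)
      [indices(while(. != []; .[1:]) | select(.[0] + $limit < .[1]))[] + 1]
      # Pair each start with the next one; null means "open-ended" in a slice
      | [null, .[0]], .[range(length):]
    )
    as [$a,$b] | .[$a:$b]
  ]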

Output:

[
  [
    101,
    102,
    105
  ],
  [
    116
  ],
  [
    128,
    129
  ],
  [
    140,
    145,
    146,
    152
  ]
]


Upvotes: 1

Jeff Mercado

Reputation: 134521

Assuming the values will be in ascending order, loop through the values accumulating the groups based on your condition. reduce works well in this case.

10 as $limit # to pass the limit in instead, drop this line and the leading | below, then run jq with --argjson limit 10
    | reduce .[] as $i (
        {prev:.[0], group:[], result:[]};
        if ($i - .prev > $limit)
            then {prev:$i, group:[$i], result:(.result + [.group])}
            else {prev:$i, group:(.group + [$i]), result}
        end
    )
    | [(.result[], .group) | length]

If the difference from the previous value exceeds the limit, move the current group of values to the result and start a new group with the current value. Otherwise, the current value belongs to the current group, so add it. At the end, count the sizes of the finished groups plus the last, still-open group to get your result.


Here's a slightly modified version that keeps only a running count instead of the groups themselves.

10 as $limit
    | reduce .[] as $i (
        {prev:.[0], count:0, result:[]};
        if ($i - .prev > $limit)
            then {prev:$i, count:1, result:(.result + [.count])}
            else {prev:$i, count:(.count + 1), result}
        end
    )
    | [.result[], .count]
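
If you want to reuse this, the counting version could be wrapped in a function along these lines. This is only a sketch: the name session_lengths and the file names in the comment are made up for illustration, the input is still assumed to be in ascending order, and $limit is assumed to come from the command line via --argjson.

```jq
# Hypothetical usage: jq --argjson limit 10 -f sessions.jq visits.json
# (session_lengths, sessions.jq and visits.json are illustrative names)
def session_lengths($limit):
  if length == 0 then []                  # no visits, no sessions
  else
    # Start with the first visit already counted, then walk the rest.
    reduce .[1:][] as $t (
      {prev: .[0], count: 1, result: []};
      if $t - .prev > $limit
      then {prev: $t, count: 1, result: (.result + [.count])}   # close the session
      else {prev: $t, count: (.count + 1), result: .result}     # extend the session
      end
    )
    | .result + [.count]                  # append the last, still-open session
  end;

session_lengths($limit)
```

Unlike the versions above, which produce [0] for an empty input array, this sketch returns [] in that case.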

Upvotes: 2
