How can I select rows into N groups per value of a certain column?

Question

I have a table of the form:

Span     Available     Time
A            0          0
B            1          0
C            1          0
A            1          1
B            0          1
C            1          1
...         ...        ...
A            1          N
B            0          N
C            1          N

I want to reorder this into groups of X consecutive Time values per Span, so that it looks like:

Span     Available     Time
A            1           0
A            0           1
...         ...         ...
A            1           X
B            1           0
B            1           1
...         ...         ...
B            0           X
C            0           0
C            1           1
...         ...         ...
C            0           X
A            1          X+1
A            0          X+2
...         ...         ...
A            1          2X
B            1          X+1
B            1          X+2
...         ...         ...
B            0           2X
...         ...         ...
...         ...         ...
A            0          N-X
A            1          N-X+1
...         ...         ...
A            0           N
B            1          N-X
B            0          N-X+1
...         ...         ...
B            1           N
C            0          N-X
C            1          N-X+1
...         ...         ...
C            1           N

Where X is a factor of N.

How can I group the data in this way using SQL or Spark's DataFrame API?
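One possible way to get this ordering, sketched with Python's built-in sqlite3 module so that the SQL can be run as-is. It assumes Time is an integer counter starting at 0 and that each group holds exactly X rows per Span, so the group index is simply Time divided by X with integer division. The table name spans, the value X = 2 and the sample rows are made up for illustration.

```python
import sqlite3

X = 2  # assumed group size: rows per Span in each group

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spans (Span TEXT, Available INTEGER, Time INTEGER)")

# Made-up sample data: three spans, Time running from 0 to 3.
rows = [(span, (t + i) % 2, t) for t in range(4) for i, span in enumerate("ABC")]
conn.executemany("INSERT INTO spans VALUES (?, ?, ?)", rows)

# Both operands are integers, so SQLite's "/" is integer division here:
# bucket 0 covers Time 0..X-1, bucket 1 covers Time X..2X-1, and so on.
ordered = conn.execute(
    """
    SELECT Span, Available, Time, Time / :x AS bucket
    FROM spans
    ORDER BY bucket, Span, Time
    """,
    {"x": X},
).fetchall()

for row in ordered:
    print(row)
```

Sorting by the bucket first, then Span, then Time yields the first X rows of A, B and C, followed by the next X rows of each. In Spark SQL the "/" operator returns a fractional result, so something like FLOOR(Time / X) would be needed in place of the plain division.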

Also, how can I aggregate that table by X rows per span to get, for example, the percentage availability for the span from time 0 to X, X to 2X, etc.?
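A minimal sketch of that aggregation, again using Python's built-in sqlite3 module. It assumes Available is always 0 or 1 (so its average is the fraction of time the span was available) and that Time is an integer counter starting at 0 with X rows per group. The table name spans, X = 2 and the sample rows are hypothetical.

```python
import sqlite3

X = 2  # assumed group size: rows per Span in each group

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spans (Span TEXT, Available INTEGER, Time INTEGER)")

# Made-up sample data in (Span, Available, Time) form.
sample = [
    ("A", 0, 0), ("B", 1, 0), ("C", 1, 0),
    ("A", 1, 1), ("B", 0, 1), ("C", 1, 1),
    ("A", 1, 2), ("B", 0, 2), ("C", 1, 2),
    ("A", 1, 3), ("B", 0, 3), ("C", 0, 3),
]
conn.executemany("INSERT INTO spans VALUES (?, ?, ?)", sample)

# Group by Span and by the integer bucket Time / X, then average the 0/1 flag.
availability = conn.execute(
    """
    SELECT Span, Time / :x AS bucket, 100.0 * AVG(Available) AS pct_available
    FROM spans
    GROUP BY Span, bucket
    ORDER BY Span, bucket
    """,
    {"x": X},
).fetchall()

for span, bucket, pct in availability:
    print(span, bucket, pct)
```

Each output row is one (Span, group) pair with its percentage availability. The same GROUP BY should carry over to Spark SQL with FLOOR(Time / X) as the bucket expression, since "/" is fractional division there.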

edit:

For context, each group of X rows represents a day, and the whole data set represents a week. So I want to aggregate the availability per day, per span.

edit:

Also, X is known in advance, so I want to be able to say something like GROUP BY Span LIMIT X ORDER BY Time (not valid SQL, but it conveys the idea).

edit:

As a final attempt to describe this better: I want the first X rows of the first span, then the first X rows of the second span, then the first X rows of the last span, followed by the next X rows of the first span, the next X rows of the second span, and so on, through to the last X rows of each span.
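The "first X of each span, then the next X of each span" ordering could also be sketched with a per-span row number, which avoids relying on Time being a contiguous counter from 0 to N. This is only an illustration under assumptions: the table name spans, X = 2 and the Time values 10, 20, 30, 40 are invented, and ROW_NUMBER() needs SQLite 3.25 or newer (Spark SQL supports the same window function).

```python
import sqlite3

X = 2  # assumed group size: rows per Span in each group

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spans (Span TEXT, Available INTEGER, Time INTEGER)")

# Made-up sample data with Time values that are not 0, 1, 2, ...
rows = [(span, 1, t) for t in (10, 20, 30, 40) for span in "ABC"]
conn.executemany("INSERT INTO spans VALUES (?, ?, ?)", rows)

# Number the rows of each Span by Time (starting at 0), then take
# rn / X with integer division as the group index.
# Window functions require SQLite >= 3.25.
chunked = conn.execute(
    """
    WITH numbered AS (
        SELECT Span, Available, Time,
               ROW_NUMBER() OVER (PARTITION BY Span ORDER BY Time) - 1 AS rn
        FROM spans
    )
    SELECT Span, Available, Time, rn / :x AS bucket
    FROM numbered
    ORDER BY bucket, Span, Time
    """,
    {"x": X},
).fetchall()

for row in chunked:
    print(row)
```

Because the bucket is derived from the row position within each span rather than from the Time value itself, this also works when Time holds timestamps or has gaps.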
