yanta
yanta

Reputation: 841

F# split sequence into sub lists on every nth element

Say I have a sequence of 100 elements. Every 10th element I want a new list of the previous 10 elements. In this case I will end up with a list of 10 sublists.

Seq.take(10) looks promising, how can I repeatedly call it to return a list of lists?

Upvotes: 15

Views: 5231

Answers (8)

anvish
anvish

Reputation: 515

now there's Seq.chunkBySize available:

[1;2;3;4;5] |> Seq.chunkBySize 2 = seq [[|1; 2|]; [|3; 4|]; [|5|]]

Upvotes: 27

Rick Bradford
Rick Bradford

Reputation: 23

I found this to be easily the fastest:

let windowChunk n xs =
    let range = [0 .. Seq.length xs]
    Seq.windowed n xs |> Seq.zip range 
                      |> Seq.filter (fun d -> (fst d) % n = 0)
                      |> Seq.map(fun x -> (snd x))

i.e. window the list, zip with a list of integers, remove all the overlapping elements, and then drop the integer portion of the tuple.

Upvotes: 1

Søren Debois
Søren Debois

Reputation: 5688

If in doubt, use fold.

let split n  = let one, append, empty = Seq.singleton, Seq.append, Seq.empty
               Seq.fold (fun (m, cur, acc) x -> 
                           if m = n then (1, one x, append acc (one cur))
                           else (m+1, append cur (one x), acc))
                        (0, empty, empty)
               >> fun (_, cur, acc) -> append acc (one cur)

This has the advantage of being fully functional, yet touch each element of the input sequence only once(*) (as opposed to the Seq.take + Seq.skip solutions proposed above).

(*) Assuming O(1) Seq.append. I should certainly hope so.

Upvotes: 1

vis
vis

Reputation: 2279

Perhaps this simple pure implementation might be useful:

let splitAt n xs =  (Seq.truncate n xs, if Seq.length xs < n then Seq.empty else Seq.skip n xs)

let rec chunk n xs =
    if Seq.isEmpty xs then Seq.empty
    else
        let (ys,zs) = splitAt n xs
        Seq.append (Seq.singleton ys) (chunk n zs)

For example:

> chunk 10 [1..100];;
val it : seq<seq<int>> =
  seq
    [seq [1; 2; 3; 4; ...]; seq [11; 12; 13; 14; ...]; 
     seq [21; 22; 23; 24; ...]; seq [31; 32; 33; 34; ...]; ...]

> chunk 5 [1..12];;
val it : seq<seq<int>> =
  seq [seq [1; 2; 3; 4; ...]; seq [6; 7; 8; 9; ...]; seq [11; 12]]

Upvotes: 1

Stephen Swensen
Stephen Swensen

Reputation: 22297

I have an evolution of three solutions. None of them preserves the ordering of the input elements, which is hopefully OK.

My first solution is quite ugly (making use of ref cells):

//[[4; 3; 2; 1; 0]; [9; 8; 7; 6; 5]; [14; 13; 12; 11; 10]; [17; 16; 15]]
let solution1 =
    let split s n =
        let i = ref 0
        let lst = ref []
        seq {
            for item in s do
                if !i = n then
                    yield !lst
                    lst := [item]
                    i := 1
                else
                    lst := item::(!lst)
                    i := !i+1
            yield !lst
        } |> Seq.toList
    split {0..17} 5

My second solution factors out the use of ref cells in the first solution, but consequently forces the use of direct IEnumerator access (push in one side, pop out the other)!

//[[17; 16; 15]; [14; 13; 12; 11; 10]; [9; 8; 7; 6; 5]; [4; 3; 2; 1; 0]]
let solution2 =
    let split (s:seq<_>) n =
        let e = s.GetEnumerator()
        let rec each lstlst lst i =
            if e.MoveNext() |> not then
                lst::lstlst
            elif i = n then
                each (lst::lstlst) [e.Current] 1
            else 
                each lstlst ((e.Current)::lst) (i+1)
        each [] [] 0
    split {0..17} 5

My third solution is based on the second solution except it "cheats" by taking a list as input instead of a seq, which enables the most elegant solution using pattern matching as Tomas points out is lacking with seq (which is why we were forced to use direct IEnumerator access).

//[[17; 16; 15]; [14; 13; 12; 11; 10]; [9; 8; 7; 6; 5]; [4; 3; 2; 1; 0]]
let solution3 =
    let split inputList n =
        let rec each inputList lstlst lst i =
            match inputList with
            | [] -> (lst::lstlst)
            | cur::inputList ->
                if i = n then
                    each inputList (lst::lstlst) [cur] 1    
                else
                    each inputList lstlst (cur::lst) (i+1)
        each inputList [] [] 0
    split [0..17] 5

If preserving the ordering of the elements is important, you can use List.rev for this purpose. For example, in solution2, change the last line of the split function to:

each [] [] 0 |> List.rev |> List.map List.rev

Upvotes: 2

Tomas Petricek
Tomas Petricek

Reputation: 243041

I think that the solution from Brian is probably the most reasonable simple option. A probelm with sequences is that they cannot be easily processed with the usual pattern matching (like functional lists). One option to avoid that would be to use LazyList from F# PowerPack.

Another option is to define a computation builder for working with IEnumerator type. I wrote something like that recently - you can get it here. Then you can write something like:

let splitEach chunkSize (s:seq<_>) =
  Enumerator.toSeq (fun () -> 
    let en = s.GetEnumerator()
    let rec loop n acc = iter {
      let! item = en
      match item with 
      | Some(item) when n = 1 -> 
          yield item::acc |> List.rev 
          yield! loop chunkSize []
      | Some(item) -> 
          yield! loop (n - 1) (item::acc)
      | None -> yield acc |> List.rev }
    loop chunkSize [] )

This enables using some functional patterns for list processing - most notably, you can write this as a usual recursive function (similar to the one you would write for lists/lazy lists), but it is imperative under the cover (the let! constructo of iter takes the next element and modifies the enumerator).

Upvotes: 0

Out of the top of my head:

let rec split size list =
if List.length list < size then
    [list]
else
    (list |> Seq.take size |> Seq.toList) :: (list |> Seq.skip size |> Seq.toList |> split size)

Upvotes: 1

Brian
Brian

Reputation: 118865

This is not bad:

let splitEach n s =
    seq {
        let r = ResizeArray<_>()
        for x in s do
            r.Add(x)
            if r.Count = n then
                yield r.ToArray()
                r.Clear()
        if r.Count <> 0 then
            yield r.ToArray()
    }

let s = splitEach 5 [1..17]
for a in s do
    printfn "%A" a
(*
[|1; 2; 3; 4; 5|]
[|6; 7; 8; 9; 10|]
[|11; 12; 13; 14; 15|]
[|16; 17|]
*)

Upvotes: 10

Related Questions