user7787630
user7787630

Reputation: 63

How to split a sequence in F# based on another sequence in an idiomatic way

I have, in F#, 2 sequences, each containing distinct integers, strictly in ascending order: listMaxes and numbers.

If not Seq.isEmpty numbers, then it is guaranteed that not Seq.isEmpty listMaxes and Seq.last listMaxes >= Seq.last numbers.

I would like to implement in F# a function that returns a list of list of integers, whose List.length equals Seq.length listMaxes, containing the elements of numbers divided in lists, where the elements of listMaxes limit each group.

For example: called with the arguments

listMaxes = seq [ 25; 56; 65; 75; 88 ]
numbers = seq [ 10; 11; 13; 16; 20; 25; 31; 38; 46; 55; 65; 76; 88 ]

this function should return

[ [10; 11; 13; 16; 20; 25]; [31; 38; 46; 55]; [65]; List.empty; [76; 88] ]

I can implement this function, iterating over numbers only once:

let groupByListMaxes listMaxes numbers = 
    if Seq.isEmpty numbers then
        List.replicate (Seq.length listMaxes) List.empty
    else
        List.ofSeq (seq {
            use nbe = numbers.GetEnumerator ()
            ignore (nbe.MoveNext ())
            for lmax in listMaxes do
                yield List.ofSeq (seq {
                    if nbe.Current <= lmax then
                        yield nbe.Current
                    while nbe.MoveNext () && nbe.Current <= lmax do
                        yield nbe.Current
                })
        })

But this code feels unclean, ugly, imperative, and very un-F#-y.

Is there any functional / F#-idiomatic way to achieve this?

Upvotes: 5

Views: 337

Answers (4)

Lars
Lars

Reputation: 466

I know this is an old question but I had a very similar problem and I think this is a simple solution:

let groupByListMaxes cs xs =
    List.scan (fun (_, xs) c -> List.partition (fun x -> x <= c) xs)
        ([], xs)
        cs
    |> List.skip 1
    |> List.map fst

Upvotes: 0

Matiasd
Matiasd

Reputation: 571

It can be done like this with pattern matching and tail recursion:

let groupByListMaxes listMaxes numbers =
    let rec inner acc numbers =
        function
        | [] -> acc |> List.rev
        | max::tail ->
            let taken = numbers |> Seq.takeWhile ((>=) max) |> List.ofSeq
            let n = taken |> List.length
            inner (taken::acc) (numbers |> Seq.skip n) tail

    inner [] numbers (listMaxes |> List.ofSeq)

Update: I also got inspired by fold and came up with the following solution that strictly refrains from converting the input sequences.

let groupByListMaxes maxes numbers =
    let rec inner (acc, (cur, numbers)) max = 
        match numbers |> Seq.tryHead with
        // Add n to the current list of n's less
        // than the local max
        | Some n when n <= max ->
            let remaining = numbers |> Seq.tail  
            inner (acc, (n::cur, remaining)) max
        // Complete the current list by adding it
        // to the accumulated result and prepare
        // the next list for fold.
        | _ ->
            (List.rev cur)::acc, ([], numbers)

    maxes |> Seq.fold inner ([], ([], numbers)) |> fst |> List.rev

Upvotes: 2

user7787630
user7787630

Reputation: 63

I have found a better implementation myself. Tips for improvements are still welcome.

Dealing with 2 sequences is really a pain. And I really do want to iterate over numbers only once without turning that sequence into a list. But then I realized that turning listMaxes (generally the shorter of the sequences) is less costly. That way only 1 sequence remains, and I can use Seq.fold over numbers.

What should be the state that we want to keep and change while iterating with Seq.fold over numbers? First, it should definitely include the remaining of the listMaxes, yet the previous maxes that we already have surpassed are no longer of interest. Second, the accumulated lists so far, although, like in the other answers, these can be kept in reverse order. More to the point: the state is a couple which has as second element a reversed list of reversed lists of the numbers so far.

let groupByListMaxes listMaxes numbers =
    let rec folder state number =
        match state with
            | m :: maxes, _ when number > m ->
                folder (maxes, List.empty :: snd state) number
            | m :: maxes, [] ->
                fst state, List.singleton (List.singleton number)
            | m :: maxes, h :: t ->
                fst state, (number :: h) :: t
            | [], _ ->
                failwith "Guaranteed not to happen"
    let listMaxesList = List.ofSeq listMaxes
    let initialState = listMaxesList, List.empty
    let reversed = snd (Seq.fold folder initialState numbers)
    let temp = List.rev (List.map List.rev reversed)
    let extraLength = List.length listMaxesList - List.length temp
    let extra = List.replicate extraLength List.empty
    List.concat [temp; extra]

Upvotes: 1

Jake Lishman
Jake Lishman

Reputation: 927

Here's a version based on list interpretation, which is quite functional in style. You can use Seq.toList to convert between them, whenever you want to handle that. You could also use Seq.scan in conjunction with Seq.partition ((>=) max) if you want to use only library functions, but beware that it's very very easy to introduce a quadratic complexity in either computation or memory when doing that.

This is linear in both:

let splitAt value lst =
    let rec loop l1 = function
        | []                    -> List.rev l1, []
        | h :: t when h > value -> List.rev l1, (h :: t)
        | h :: t                -> loop (h :: l1) t
    loop [] lst

let groupByListMaxes listMaxes numbers =
    let rec loop acc lst = function
        | []     -> List.rev acc
        | h :: t ->
            let out, lst' = splitAt h lst
            loop (out :: acc) lst' t
    loop [] numbers listMaxes

Upvotes: 3

Related Questions