Reputation: 1560
I'm running List.scan over a very large list in order to compute a running total. When I'm done I need the total in addition to the scan output in order to partition the list non-uniformly. The total is in the last state output by scan, and I would really like to avoid an additional traversal of the list in order to get the final state. The only way I can think to do this is to pass a mutable reference to accumulate the total. Is there a better way to approach this?
let l = <very large list of int64>
let runningTotal=List.scan (fun s x -> x+s) 0L l
let total= <last element of runningTotal- very inefficient>
doSomething total runningTotal
Upvotes: 0
Views: 551
Reputation: 16792
In F# 4.0, List.mapFold
is being added, which enables this nicely.
[1;2;3;4] |> List.mapFold (fun state elem -> let nxt = state + elem in (nxt,nxt)) 0
// > val it : int list * int = ([1; 3; 6; 10], 10)
List.last
is also added in 4.0, though its perf is still O(n). If you want to pluck the last element from a list in F# 3.1 and earlier, you can do it with fold
, but again, this is O(n).
let last lst =
lst |> List.fold (fun _ x -> x) Unchecked.defaultof<_>
@John's solution is probably fastest and simplest.
Upvotes: 5
Reputation: 25516
Here is one way. Since we can define the lambda to do anything, just make it always store the result in a ref cell. Since scan works from start to end, the result will be the last value.
let last = ref 0L
let l = <very large list of int64>
let runningTotal=List.scan (fun s x ->let t = x+s;last=:t;t) 0L l
let total= !last
doSomething total runningTotal
Upvotes: 3
Reputation: 3838
I think actually just accessing the last element in a list, is indeed not possible. That said, you say, your list is very large. When it comes to very large inputs, a list might not be the optimal data structure to begin with. What comes to mind, would be that you certainly could use an array instead of a list in this case. Arrays are also more memory-efficient than lists, because a list will create a reference for each element, which is somewhere around 12 bytes per item. Whereas an array only has a reference to the first element.
If an array works for you, then that would be the solution, as you can access the last element of an array without the O(n) overhead.
Upvotes: 1