Reputation: 19897
How do I achieve parallel filtering in F#? Basically I want to do Array.Parallel.Choose
except with a sequence expression to create an Array.
I tried:
Async.Parallel [for x in 1..40 do yield async { if x % 5 = 0 then return x }]
but this is a type mismatch since the if doesn't always have a value (i.e. Async<unit>
isn't an Async<'a>
).
I'm iterating over a large set of numbers (1,000,000,000 +) so I don't want to generate the sequence upfront.
The actual check in the if statement is:
let isPalindrome (x : int) = let numberArray = x.ToString().ToCharArray()
numberArray = Array.rev numberArray
Tried using PSeq:
[for x in 990000..999999 do for y in 990000..999999 do yield (x, y, x*y)]
|> PSeq.filter(fun (x, y, z) -> isPalindrome z)
|> Seq.max
this results in an OutOfMemoryException
.
Ugly workaround:
let numberArray = [|990000..999999|]
let result = numberArray |> Array.Parallel.collect(fun x -> [| for y in numberArray do if isPalindrome (x*y) then yield (x, y, x*y)|])
|> Array.maxBy(fun (x, y, z) -> z)
Upvotes: 2
Views: 822
Reputation: 1172
A much more efficient solution based on the implementation of Array.Choose:
open System.Collections.Concurrent
open System.Threading.Tasks
open System
let parallelFilter f (array: 'T[]) =
let inputLength = array.Length
let isChosen : bool [] = Array.zeroCreate inputLength
let mutable outputLength = 0
let range = Partitioner.Create(0,inputLength)
Parallel.ForEach(
range,
(fun () ->0),
(fun (start,finish) _ count ->
let mutable localCount = 0
for i in start .. finish-1 do
match f array.[i] with
| true -> ()
| false ->
isChosen.[i] <- true
localCount <- localCount+1
localCount),
Action<int> (fun x -> System.Threading.Interlocked.Add(&outputLength,x) |> ignore )
) |> ignore
let output = Array.zeroCreate outputLength
let mutable curr = 0
for i = 0 to isChosen.Length-1 do
if isChosen.[i] then
output.[curr] <- output.[i]
curr <- curr + 1
output
Upvotes: 0
Reputation: 19897
let isPalindrome (x : int) = let numberArray = x.ToString().ToCharArray()
numberArray = Array.rev numberArray
I figured out how to use the PSeq
library to do lazy filtering on a sequence:
let generator =
seq {
for x in 990000..999999 do
for y in 990000..999999 do
yield (x, y, x*y)
}
generator
|> PSeq.filter(fun (x, y, z) -> isPalindrome z)
|> Seq.max
Real: 00:00:28.938, CPU: 00:01:49.078, GC gen0: 3448, gen1: 2, gen2: 0
Sadly, this was slower than my ugly hack:
let numberArray = [|990000..999999|]
let result = numberArray |> Array.Parallel.collect(fun x -> [| for y in numberArray do if isPalindrome (x*y) then yield (x, y, x*y)|])
|> Array.maxBy(fun (x, y, z) -> z)
Real: 00:00:24.779, CPU: 00:01:32.109, GC gen0: 2680, gen1: 18, gen2: 1
Upvotes: 1