Alex Zhukovskiy

Reputation: 10015

How do I run a parallel foreach loop?

I'm trying to run a script in PowerShell Core (no workflows, no -Parallel option for ForEach).

So I'm trying to split my array into batches and run them in parallel. This is what I do:

$iterCount = 150000;
$threadCount = 8;
$batchSize = $iterCount/$threadCount;

$block = {
    Param($range)

    Foreach ($i in $range) {
        ...
    }
}

For ($i = 0; $i -lt 150000; $i += $batchSize) {
    Start-Job -Scriptblock $block -ArgumentList $i..$i+$batchSize
}

But when I call it I get

Start-Job : Cannot bind parameter 'InitializationScript'. Cannot convert the "..0+18750" value of type "System.String" to type "System.Management.Automation.ScriptBlock".
At /home/tchain/dit/push_messages3.ps1:63 char:48
+     Start-Job -Scriptblock $block -ArgumentList $i..$i+$batchSize
+                                                   ~~~~~~~~~~~~~~~
+ CategoryInfo          : InvalidArgument: (:) [Start-Job], ParameterBindingException
+ FullyQualifiedErrorId : CannotConvertArgumentNoMessage,Microsoft.PowerShell.Commands.StartJobCommand

It seems that ArgumentList stringifies everything so I cannot pass a range.

Is there a way to pass a strongly typed range? Is there a better way to parallelize a loop? I'd like to write (0..150000).AsParallel().ForEach($i => ...), but it seems that I can't.

I did Param([int] $from, [int] $to) as a workaround, but I'm not sure if it's the best I can do.

Upvotes: 0

Views: 2830

Answers (2)

js2010

Reputation: 27423

The range operator ".." literally creates an array of that many integers. Just like "$array += element", "1..$highnumber" can use a lot of memory. A for loop inside the job avoids building that array and should work fine. Also, your code never uses $iterCount.

Also note that jobs run in new processes. In PS 6 you can use Start-ThreadJob instead, which runs each job on a thread.

#$iterCount = 150000;
$iterCount = 24
$threadCount = 8;
$batchSize = $iterCount/$threadCount;

$block = {
    Param($start,$range)
    "start $start range $range"
    For ($i = $start; $i -lt $range; $i++) {
        $i 
    }
}

For ($i = 0; $i -lt $iterCount; $i += $batchSize) {
    Start-Job -Scriptblock $block -ArgumentList $i,($i+$batchSize)
}


start 0 range 3
0
1
2
start 3 range 6
3
4
5
start 6 range 9
6
7
8
start 9 range 12
9
10
11
start 12 range 15
12
13
14
start 15 range 18
15
16
17
start 18 range 21
18
19
20
start 21 range 24
21
22
23

For comparison, this version actually creates the range with ".." inside the scriptblock. Output from Receive-Job may appear out of order.

$iterCount = 150000
$threadCount = 8
$batchSize = $iterCount/$threadCount

$block = {
    Param($start,$range)
    "start $start range $range"                                      
    Foreach ($i in $start..($range-1)) {
        # $i                                                                          
    }
}

For ($i = 0; $i -lt $iterCount; $i += $batchSize) {
    Start-Job -Scriptblock $block -ArgumentList $i,($i+$batchSize)
}


start 0 range 18750
start 18750 range 37500
start 75000 range 93750
start 131250 range 150000
start 93750 range 112500
start 37500 range 56250
start 56250 range 75000
start 112500 range 131250
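
As a rough illustration of the Start-ThreadJob suggestion, here is a minimal sketch of the same batching with thread jobs. It assumes the ThreadJob module is available; the $end parameter name and the summing loop body are placeholders of mine, not part of the original script.

```powershell
# Sketch: same batching as above, but with thread jobs instead of process jobs.
# Assumes Start-ThreadJob (ThreadJob module) is installed.
$iterCount   = 24
$threadCount = 8
$batchSize   = $iterCount / $threadCount

$block = {
    Param($start, $end)
    $sum = 0
    For ($i = $start; $i -lt $end; $i++) {
        $sum += $i      # placeholder for the real per-item work
    }
    $sum                # each job emits one partial result
}

# Start one thread job per batch and keep the job objects.
$jobs = For ($i = 0; $i -lt $iterCount; $i += $batchSize) {
    Start-ThreadJob -ScriptBlock $block -ArgumentList $i, ($i + $batchSize)
}

# Wait for all jobs, collect their output, then remove the finished jobs.
$partials = $jobs | Receive-Job -Wait -AutoRemoveJob
$total = ($partials | Measure-Object -Sum).Sum
```

Thread jobs stay inside the current process, so they avoid the startup cost of a new process per batch. Whether that helps in practice depends on what the loop body does.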

Upvotes: 0

vonPryz

Reputation: 24071

Instead of evaluating the range operator in the argument list, use temp variables to create an array of the desired size, then pass the array as an argument. Like so:

For ($i = 0; $i -lt 150000; $i += $batchSize) {
    $j = $i+$batchSize
    $range = $i..$j
    Start-Job -Scriptblock $block -ArgumentList (,$range)
}
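
The temp variables help because command arguments are parsed in argument mode, where $i..$i+$batchSize does not seem to be evaluated as a range expression (the error in the question shows the tail arriving as the string "..0+18750"). Wrapping the expression in parentheses forces expression mode and should have the same effect. A minimal sketch; Get-ArgCount is a hypothetical helper used only for illustration:

```powershell
# Hypothetical helper: reports how many items arrive in its first argument.
function Get-ArgCount {
    Param($first)
    @($first).Count
}

$i = 0
$batchSize = 3

# Parentheses force expression mode, so the range is built before binding.
$count = Get-ArgCount ($i..($i + $batchSize))
```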

Edit: -ArgumentList unravels the array, so a bit of trickery is needed (hence the unary comma in (,$range)).
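
To see what the unary comma changes, here is a small sketch. It uses Invoke-Command rather than Start-Job so there is nothing to wait for; as far as I know both bind -ArgumentList the same way. The $r, $spread and $wrapped names are just for illustration.

```powershell
$range = 1..5

$block = {
    Param($r)
    @($r).Count     # number of elements bound to the first parameter
}

# Bare array: the elements are spread over the parameters,
# so $r receives only the first one.
$spread  = Invoke-Command -ScriptBlock $block -ArgumentList $range

# Unary comma: a one-element array whose single element is the whole
# range, so $r receives all five values.
$wrapped = Invoke-Command -ScriptBlock $block -ArgumentList (,$range)
```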

Test code that prints details about the passed array:

$block = {
  Param([array]$range)
  write-host "len`t[0]`t[-1]"
  write-host $range.length"`t"$range[0]"`t"$range[-1]
}

For ($i = 0; $i -lt 150000; $i += $batchSize) {
    $j = $i+$batchSize
    $range = $i..$j
    Start-Job -Scriptblock $block -ArgumentList (,$range)
}

get-job | receive-job
len     [0]     [-1]
18751    0       18750
len     [0]     [-1]
18751    18750   37500
len     [0]     [-1]
18751    37500   56250
len     [0]     [-1]
18751    56250   75000
len     [0]     [-1]
18751    75000   93750
len     [0]     [-1]
18751    93750   112500
len     [0]     [-1]
18751    112500          131250
len     [0]     [-1]
18751    131250          150000
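
Note that ".." includes both endpoints, which is why each batch above has 18751 elements and neighbouring batches share a boundary value. If every index should be processed exactly once, the upper bound can be reduced by one. A sketch that only computes the bounds (the $batches and $covered names are mine, for illustration):

```powershell
$iterCount   = 150000
$threadCount = 8
$batchSize   = $iterCount / $threadCount

$batches = For ($i = 0; $i -lt $iterCount; $i += $batchSize) {
    # Last index of this batch, clamped so it never passes the final index.
    $last = [Math]::Min($i + $batchSize, $iterCount) - 1
    [pscustomobject]@{ First = $i; Last = $last; Count = $last - $i + 1 }
}

# Total number of indexes covered by all batches together.
$covered = ($batches | Measure-Object -Property Count -Sum).Sum
```

Each job could then be given $batch.First..$batch.Last (wrapped with the unary comma as above), or just the two bounds.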

Upvotes: 1
