Reputation: 10015
I'm trying to run a script in PowerShell Core (no workflows, no -Parallel option for ForEach).
So I'm trying to split my array into batches and run them in parallel. This is what I do:
$iterCount = 150000;
$threadCount = 8;
$batchSize = $iterCount/$threadCount;
$block = {
Param($range)
Foreach ($i in $range) {
...
}
}
For ($i = 0; $i -lt 150000; $i += $batchSize) {
Start-Job -Scriptblock $block -ArgumentList $i..$i+$batchSize
}
But when I call it I get
Start-Job : Cannot bind parameter 'InitializationScript'. Cannot convert the ".. 0+18750" value of type "System.String" to type "System.Management.Automation.ScriptBlock".
At /home/tchain/dit/push_messages3.ps1:63 char:48
+ Start-Job -Scriptblock $block -ArgumentList $i..$i+$batchSize
+ ~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidArgument: (:) [Start-Job], ParameterBindingException
+ FullyQualifiedErrorId : CannotConvertArgumentNoMessage,Microsoft.PowerShell.Commands.StartJobCommand
It seems that -ArgumentList stringifies everything, so I cannot pass a range.
Is there a way to pass a strongly typed range? Is there a better way to parallelize a loop? I'd like to write something like (0..150000).AsParallel().ForEach($i => ...),
but it seems that I can't.
As a workaround I used Param([int] $from, [int] $to), but I'm not sure that's the best I can do.
Upvotes: 0
Views: 2830
Reputation: 27423
The range operator ".." literally creates an array of that many integers. Just like "$array += element", "1..$highnumber" can use a lot of memory. A for loop should work fine in the job. You also never use $iterCount.
Also note that jobs run in new processes. In PS 6 you can use Start-ThreadJob instead, which runs the work on threads.
#$iterCount = 150000;
$iterCount = 24
$threadCount = 8;
$batchSize = $iterCount/$threadCount;
$block = {
Param($start,$range)
"start $start range $range"
For ($i = $start; $i -lt $range; $i++) {
$i
}
}
For ($i = 0; $i -lt $iterCount; $i += $batchSize) {
Start-Job -Scriptblock $block -ArgumentList $i,($i+$batchSize)
}
start 0 range 3
0
1
2
start 3 range 6
3
4
5
start 6 range 9
6
7
8
start 9 range 12
9
10
11
start 12 range 15
12
13
14
start 15 range 18
15
16
17
start 18 range 21
18
19
20
start 21 range 24
21
22
23
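The batching loop above relies on $iterCount dividing evenly by $threadCount (24/8 and 150000/8 both do). As a hedged sketch of one way to handle counts that do not divide evenly, the snippet below rounds the batch size up and clamps the last batch. The names $batches, Start and End are illustrative and not part of the original script.

```powershell
# Illustrative values chosen so that the count does NOT divide evenly.
$iterCount = 10
$threadCount = 4

# Round up so that at most $threadCount batches cover every index.
$batchSize = [int][math]::Ceiling($iterCount / $threadCount)

# Build half-open [Start, End) boundaries; the final batch is clamped to $iterCount.
$batches = For ($start = 0; $start -lt $iterCount; $start += $batchSize) {
    $end = [math]::Min($start + $batchSize, $iterCount)
    [pscustomobject]@{ Start = $start; End = $end }
}

$batches | ForEach-Object { "start $($_.Start) end $($_.End)" }
```

Each Start/End pair could then be passed to the job in the same way as $i,($i+$batchSize) above.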
For comparison, here is a version that actually creates the range with ".." inside the script block. Output from Receive-Job may appear out of order.
$iterCount = 150000
$threadCount = 8
$batchSize = $iterCount/$threadCount
$block = {
Param($start,$range)
"start $start range $range"
Foreach ($i in $start..($range-1)) {
# $i
}
}
For ($i = 0; $i -lt $iterCount; $i += $batchSize) {
Start-Job -Scriptblock $block -ArgumentList $i,($i+$batchSize)
}
start 0 range 18750
start 18750 range 37500
start 75000 range 93750
start 131250 range 150000
start 93750 range 112500
start 37500 range 56250
start 56250 range 75000
start 112500 range 131250
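The Start-ThreadJob suggestion mentioned earlier can be sketched in the same shape. This is a minimal example that assumes the ThreadJob module (which provides Start-ThreadJob) is installed. The summing body and the names $jobs, $results and $total are illustrative only.

```powershell
# Assumption: the ThreadJob module providing Start-ThreadJob is available.
$iterCount = 24
$threadCount = 8
$batchSize = $iterCount / $threadCount

$block = {
    Param($start, $end)
    # Placeholder work: sum the integers in the half-open batch [$start, $end).
    $sum = 0
    For ($i = $start; $i -lt $end; $i++) { $sum += $i }
    $sum
}

# Start one thread job per batch and collect the job objects.
$jobs = For ($i = 0; $i -lt $iterCount; $i += $batchSize) {
    Start-ThreadJob -ScriptBlock $block -ArgumentList $i, ($i + $batchSize)
}

# Wait for all jobs, gather the per-batch sums, then clean up.
$results = $jobs | Wait-Job | Receive-Job
$total = ($results | Measure-Object -Sum).Sum
$jobs | Remove-Job
"total $total"
```

The calling pattern is the same as with Start-Job, so switching between the two should mostly be a matter of changing the cmdlet name.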
Upvotes: 0
Reputation: 24071
Instead of evaluating the range operator in the argument list, use temporary variables to build an array of the desired size, then pass that array as an argument, like so:
For ($i = 0; $i -lt 150000; $i += $batchSize) {
$j = $i+$batchSize
$range = $i..$j
Start-Job -Scriptblock $block -ArgumentList (,$range)
}
Edit: -ArgumentList unravels the array, so a bit of trickery is needed.
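To illustrate the unravelling without starting any jobs, here is a small sketch that uses Invoke-Command with a local script block, whose -ArgumentList behaves the same way. The names $countBlock, $plain and $wrapped are made up for this example.

```powershell
# The script block reports how many elements its single parameter received.
$countBlock = { Param($range) @($range).Count }
$range = 0..4

# Passing the array directly: each element becomes its own positional argument,
# so $range inside the block receives only the first element.
$plain = Invoke-Command -ScriptBlock $countBlock -ArgumentList $range

# The unary comma wraps the array in a one-element array, so the whole
# array is bound to the single parameter.
$wrapped = Invoke-Command -ScriptBlock $countBlock -ArgumentList (,$range)

"plain: $plain wrapped: $wrapped"   # plain: 1 wrapped: 5
```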
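Test code that prints details about the passed array. Note that $i..$j includes both endpoints, so each batch holds $batchSize + 1 elements and adjacent batches share a boundary value: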
$block = {
Param([array]$range)
write-host "len`t[0]`t[-1]"
write-host $range.length"`t"$range[0]"`t"$range[-1]
}
For ($i = 0; $i -lt 150000; $i += $batchSize) {
$j = $i+$batchSize
$range = $i..$j
Start-Job -Scriptblock $block -ArgumentList (,$range)
}
get-job | receive-job
len [0] [-1]
18751 0 18750
len [0] [-1]
18751 18750 37500
len [0] [-1]
18751 37500 56250
len [0] [-1]
18751 56250 75000
len [0] [-1]
18751 75000 93750
len [0] [-1]
18751 93750 112500
len [0] [-1]
18751 112500 131250
len [0] [-1]
18751 131250 150000
Upvotes: 1