Martin

Reputation: 65

Any efficiency difference between using the pipeline and using the -InputObject parameter directly?

Here is an example: I have a remote server named "s1" and I want to kill the calc, notepad, and winword processes on it.

I can think of two ways to do this:

  1. Using pipeline

Get-Process -computername s1 -name "calc", "notepad", "winword" | Stop-Process

  2. Using the -InputObject parameter

$processes = Get-Process -computername s1 -name "calc", "notepad", "winword"
Stop-Process -InputObject $processes

IMO, the second way is much better than the first one.

It's said that the PowerShell pipeline passes objects one by one to the next cmdlet. In this case, Stop-Process would need to communicate with the remote computer "s1" many times and kill those processes one by one.

By contrast, with the second way, I guess Stop-Process will communicate with the remote computer "s1" only once and complete the operation in one shot.

Do I understand this correctly?

Thanks

Martin

Upvotes: 4

Views: 579

Answers (1)

mklement0

Reputation: 439058

As for the remoting aspect:

You should choose a different approach altogether: run the entire pipeline remotely, using Invoke-Command:

Invoke-Command -ComputerName s1 { 
  Get-Process -Name 'calc', 'notepad', 'winword' | Stop-Process
}

Note, however, that Invoke-Command (PSv3+) requires PowerShell remoting to be configured on the target machine (see Get-Help about_Remote_FAQ), whereas Get-Process's -ComputerName parameter uses a different, obsolescent form of remoting.

In fact, when I tried your approach between two v5.1 machines, the locally run Stop-Process command that tried to operate on remote process objects failed, with the following error:

Cannot stop process "<name>" because of the following error:
Feature is not supported for remote machines.

Generally, the best approach is to perform as much processing as possible remotely and only transfer the results to the local machine.
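As a sketch of that principle (reusing the server name and process names from the question, and assuming PowerShell remoting is configured on s1), the filtering and property selection can all happen remotely, so only the reduced objects are serialized back to the local session:

```powershell
# Filter remotely; only the two selected properties per process
# travel back over the wire to the local session.
$results = Invoke-Command -ComputerName s1 {
  Get-Process -Name 'calc', 'notepad', 'winword' |
    Select-Object -Property Name, Id
}
```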


As for the more general aspect of pipeline input vs. input via -InputObject:

Use of -InputObject for passing a collection (Get-Foo -InputObject $collection) is definitely faster than sending that collection through the pipeline ($collection | Get-Foo), but note that it requires the entire input collection to be loaded into memory as a whole, up front, which potentially negates a key benefit of the pipeline: memory throttling.

Note that use of -InputObject is often not a viable alternative to pipeline input, because many cmdlets do not enumerate collections you pass to -InputObject (compare 1, 2 | ForEach-Object { "[$_]" } to ForEach-Object { "[$_]" } -InputObject 1, 2); typically, this happens accidentally, with -InputObject declarations that aren't arrays, but sometimes it is by design: the Get-Member cmdlet purposely doesn't enumerate a collection passed to -InputObject, because it then inspects the collection's type, not that of its elements. See this GitHub issue for background information.
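For instance, the contrast described above plays out as follows (expected results shown as comments; Windows PowerShell 5.1 behavior):

```powershell
# Pipeline input is enumerated: the script block runs once per element.
1, 2 | ForEach-Object { "[$_]" }              # -> [1], [2]

# -InputObject is NOT enumerated: the block runs once,
# with $_ bound to the whole array.
ForEach-Object { "[$_]" } -InputObject 1, 2   # -> [1 2]

# Get-Member skips enumeration by design: it inspects the
# collection's own type rather than its elements' type.
Get-Member -InputObject (1, 2)                # reports System.Object[] members
(1, 2) | Get-Member                           # reports System.Int32 members
```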

Also note that if there are further pipeline segments (Get-Foo -InputObject ... | ...), then streaming (one-by-one processing) again happens on output.
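This can be seen with Write-Output, which happens to enumerate its -InputObject argument: even though the collection is passed to it as a whole, the downstream segment still receives the elements one at a time:

```powershell
# The array is passed whole to Write-Output, but ForEach-Object
# downstream still processes its output object by object.
Write-Output -InputObject (1..3) | ForEach-Object { "[$_]" }   # -> [1], [2], [3]
```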

You can speed up item-by-item processing by using a foreach statement (foreach ($elem in $collection) { ... }) in lieu of the pipeline, but that is only effective if you can avoid cmdlet calls in the loop body.
As with -InputObject, however, this requires the entire input collection to be loaded into memory as a whole, up front.


A performance comparison with 10,000 input objects, averaged across 100 runs, using the custom Time-Command helper function:

$collection = 1..10000
Time-Command -Count 100 { $collection | Write-Output }, 
                        { Write-Output -InputObject $collection },
                        { foreach ($o in $collection) { $o } }

Sample timings (Windows PowerShell 5.1 on Windows 10, single-core VM):

Command                               Secs (100-run avg.) TimeSpan         Factor
-------                               ------------------- --------         ------
foreach ($o in $collection) { $o }    0.010               00:00:00.0103421 1.00
Write-Output -InputObject $collection 0.015               00:00:00.0152200 1.47
$collection | Write-Output            0.108               00:00:00.1076183 10.41
  • foreach is fastest (note how implicit output is relied on in the loop body: if a Write-Output call had been used, this would have been by far the slowest solution), with -InputObject performance not far behind; interestingly, the roles seem to be reversed in PowerShell Core.
  • Providing input via the pipeline was about 10 times slower.

However, note that the factors vary with the size of the input collection (and, possibly, your hardware).

Upvotes: 3
