Simon Elms

Reputation: 19668

Does ForEach-Object operate on a single object in the pipeline or on a collection of objects?

I've had trouble grasping how the PowerShell pipeline works and I realise a lot of the problem is due to ForEach-Object. In other languages I've used, foreach operates on a collection, iterating through each element of the collection in turn. I assumed ForEach-Object, when used in a PowerShell pipeline, would do the same. However, everything I read about the pipeline suggests each element of a collection is passed through the pipeline separately and that downstream cmdlets are called repeatedly, operating on each element separately rather than on the collection as a whole.

So does ForEach-Object operate on a single element in the collection, rather than on the collection as a whole? Looking at it a different way, does the pipeline operator pass the whole collection to ForEach-Object, which then iterates over it, or does the pipeline itself iterate over the collection and pass each element separately to ForEach-Object?

Upvotes: 3

Views: 4020

Answers (3)

mklement0

Reputation: 438073

The ForEach-Object cmdlet - unlike the foreach statement - itself performs no enumeration.

Instead, it operates on each item passed through the pipeline (with the option to also execute code before receiving the first and after receiving the last item, if any).

Therefore, it is arguably poorly named, given that it is the pipeline that provides the enumeration (by default), and that ForEach-Object simply invokes a script block for each item received.

The following examples illustrate this:

# Let the pipeline enumerate the elements of an array:
> 1, 2 | ForEach-Object { "item: [$_]; count: $($_.Count)" }
item: [1]; count: 1
item: [2]; count: 1

# Send the array *as a whole* through the pipeline (PSv4+)
> Write-Output -NoEnumerate 1, 2 | ForEach-Object { "item: [$_]; count: $($_.Count)" }
item: [1 2]; count: 2
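
The option to run code before the first item and after the last one can be sketched with the cmdlet's -Begin, -Process and -End parameters. This is a minimal illustration only; the strings and the variable name $lines are made up for the example:

```powershell
# -Begin runs once before the first item arrives, -Process runs once per
# item, and -End runs once after the last item has been received.
$lines = 1, 2 | ForEach-Object -Begin { 'start' } -Process { "item: [$_]" } -End { 'done' }

$lines
# start
# item: [1]
# item: [2]
# done
```

The pipeline still does the enumerating here; ForEach-Object just invokes the matching script block at each point.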

Note that scripts / functions / cmdlets can choose whether a collection they write to the output stream (pipeline) should be enumerated or sent as a whole (as a single object).

In PowerShell code (scripts or functions, whether advanced (cmdlet-like) or not), enumeration is the default, but you can opt out with Write-Output -NoEnumerate. The -NoEnumerate switch was introduced in PSv4; prior to that, you had to use $PSCmdlet.WriteObject(), which is only available to advanced scripts / functions.
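
As a rough sketch of that choice, the two functions below emit the same array, once enumerated and once as a single object. The function names are hypothetical and exist only for this illustration; the second one uses the $PSCmdlet.WriteObject() route, which requires an advanced function:

```powershell
# Hypothetical helper functions, named for illustration only.
function Get-PairEnumerated {
    # Default behavior: the array's elements are sent one by one.
    Write-Output 1, 2
}

function Get-PairAsWhole {
    [CmdletBinding()]   # makes this an advanced function, so $PSCmdlet exists
    param()
    # The second argument ($false) tells WriteObject not to enumerate the collection.
    $PSCmdlet.WriteObject(@(1, 2), $false)
}

$enumeratedCount = (Get-PairEnumerated | Measure-Object).Count   # 2
$wholeCount      = (Get-PairAsWhole | Measure-Object).Count      # 1
```

Measure-Object counts the objects it receives, so the two counts show how many objects each function actually sent through the pipeline.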

Also note that embedding a command in an expression by enclosing it in (...) forces enumeration:

# Send array as a whole.
> Write-Output -NoEnumerate 1, 2 | Measure-Object

Count: 1
...

# Converting the Write-Output -NoEnumerate command to an expression
# by enclosing it in (...) forces enumeration
> (Write-Output -NoEnumerate 1, 2) | Measure-Object

Count: 2
...

Upvotes: 7

Mark Wragg

Reputation: 23355

ForEach-Object iterates through each item in a collection. When it has finished running its script block on the current item, the result is sent down the pipeline to the next command, which can start processing it immediately (while ForEach-Object deals with the next item, if there is one).

You can see this in action in the following example:

Get-Process | ForEach-Object { Start-Sleep 1; $_ } | Format-Table

The Get-Process cmdlet gets a list of processes and immediately sends each to ForEach-Object one at a time. The ForEach-Object is waiting 1 second and then outputting the current pipeline element $_. This is received by Format-Table which outputs it as a table. You can see it doesn't wait until all of the Processes are processed before outputting to the screen.
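
The same streaming behaviour can be checked without waiting on Start-Sleep by recording the order in which each stage sees each item. This is only a sketch: the $log variable and the "stage" labels are invented for the example, and the ::new() syntax assumes PSv5 or later:

```powershell
# Record the order in which each pipeline stage sees each item.
$log = [System.Collections.Generic.List[string]]::new()

1..3 |
    ForEach-Object { $log.Add("stage 1: $_"); $_ } |
    ForEach-Object { $log.Add("stage 2: $_") }

$log
# stage 1: 1
# stage 2: 1
# stage 1: 2
# stage 2: 2
# stage 1: 3
# stage 2: 3
```

The entries alternate between the two stages, which shows that the second stage handles each item before the first stage receives the next one.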

Upvotes: 3

alroc

Reputation: 28174

The answer is...kind of both.

A PowerShell function which supports pipelining (an advanced function) will process each item coming through the pipeline individually. It can also define a begin and end block which will be executed only once in a pipeline stage. In other words, the basic structure is this:

function Do-Stuff {
    begin {
        Write-Output "This will be done once, at the beginning"
    }
    process {
        Write-Output "This will be done for each item"
    }
    end {
        Write-Output "This will be done once, at the end"
    }
}

Piping three items directly to the function, 1..3 | Do-Stuff, produces:

This will be done once, at the beginning
This will be done for each item
This will be done for each item
This will be done for each item
This will be done once, at the end
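
To make the begin/process/end split a little more concrete, here is a minimal sketch of an advanced function that binds each pipeline item to a parameter and keeps a running total. Measure-Total, $Value and $running are hypothetical names chosen for this example, not part of the original answer:

```powershell
# Hypothetical function, named for illustration only.
function Measure-Total {
    [CmdletBinding()]
    param(
        # Each pipeline item is bound to $Value before the process block runs.
        [Parameter(ValueFromPipeline)]
        [int] $Value
    )
    begin   { $running = 0 }         # runs once, before the first item
    process { $running += $Value }   # runs once per incoming item
    end     { $running }             # runs once, after the last item
}

$sum = 1..3 | Measure-Total
$sum   # 6
```

Only the end block writes to the output stream, so the function emits a single object even though its process block ran three times.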

Because Do-Stuff writes to the output stream, any additional pipeline stages after it receive each output object in turn. If there are no further stages and nothing else captures the output, the output stream is written to the console.

For example:

$VerbosePreference = "Continue"
1..3 | ForEach-Object { Write-Output $_; Write-Verbose ($_ * -1) } | ForEach-Object { $_ * $_; Write-Verbose $_ }

Gives the following output:

1
VERBOSE: 1
VERBOSE: -1
4
VERBOSE: 2
VERBOSE: -2
9
VERBOSE: 3
VERBOSE: -3

For each item, the negated value (-1, -2, -3) is written to the Verbose stream last, because the first script block's output was passed to the next stage of the pipeline and processed there before the next statement in the first ForEach-Object script block was executed.

Upvotes: 3
