immobile2

Reputation: 559

How to exit from ForEach-Object in PowerShell and continue remaining script

This is a duplicate of "How to exit from ForEach-Object in PowerShell", and I'm not sure how to mark it as such. I'm asking again because the answers to that question contain a lot of helpful information, including several ways to handle the problem. However, the accepted answer from 2012 doesn't seem correct, and the nuggets of wisdom are buried when they shouldn't be.

My script loops through all of the CSV files in a directory and takes action on those that contain records with the wrong number of columns. When I first ran it, it took a long time, and I realized that some of the CSVs have the correct number of columns in every row and need no action. So I decided to check each file for at least one row with an incorrect number of columns before acting on it. To do that, I needed to exit my ForEach-Object loop as soon as I found a qualifying row.

Here is the original script:

$ParentPath = "C:\Users\<username>\Documents\Temporary\*"
$Files = Get-ChildItem -Path $ParentPath -Include *.csv
foreach ($File in $Files) {
    $OldPath = $File | % { $_.FullName }
    $OldContent = Get-Content $OldPath
    $NewContent = $OldContent |
        Where-Object { $_.split(",").Count -eq 11 }
    Set-Content -Path "$(Split-Path $OldPath)\$($File.BaseName).txt" -Value $NewContent
}

Implementing some type of 'pre-check' on the files proved difficult. I was able to use an answer from SO to write out the 'bad lines', but I was unable to exit the ForEach-Object loop without processing all rows, which defeats the entire purpose of pre-checking the files for offending rows.

The code below works great at identifying the first bad row, and if you remove the ;break, it writes out every offending row:

Get-Content $OldPath | ?{$_} | %{if(!($_.split(",").Count -eq 11)){"Process stopped at line number $($_.ReadCount), incorrect column count of: $($_.split(",").Count).";break}}

So how do I combine the two scripts to pre-check files for an offending row, and then take action on the files that need it? See proposed answer below!

Upvotes: 1

Views: 3015

Answers (3)

Paul Vergouwe

Reputation: 11

One can exit a ForEach-Object "loop" with the following construction:

$objects = "Brakes","Wheels","Windows"
$Break = $False
$objects | Where-Object { $Break -eq $False } | ForEach-Object {
 $Break = $_ -eq "Wheels";
 Write-Output "The car has $_.";
}

I did not invent this; I found it at https://linuxhint.com/how-to-exit-from-foreach-object-in-powershell/
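One thing worth checking with this construction is whether the upstream commands actually stop producing input. The sketch below (with a hypothetical `$seen` counter and `$stop` flag that are not part of the answer above) suggests they do not: the Where-Object gate only discards later items, so every input object is still generated and tested.

```powershell
# Hypothetical counter and flag, used only for illustration.
$seen = 0
$stop = $false

$result = 1..1000 |
    ForEach-Object { $seen++; $_ } |          # counts every object the upstream stage emits
    Where-Object { -not $stop } |             # the gate: drops everything once $stop is set
    ForEach-Object {
        if ($_ -eq 2) { $stop = $true }       # close the gate after the second item
        $_
    }

"Items passed downstream: $(@($result).Count); items produced upstream: $seen"
```

For a large file read with Get-Content this means the whole file is still read, even though no further rows are processed downstream.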

Upvotes: 0

mklement0

Reputation: 437062

Here's a PowerShell-idiomatic reformulation of your solution (PSv4+) that should perform much better.

Note, however, that it assumes that each .csv file fits into memory as a whole:

Get-ChildItem -LiteralPath $HOME\Documents\Temporary -Filter *.csv | ForEach-Object {
  $okRows, $brokenRows = (Get-Content -ReadCount 0 -LiteralPath $_.FullName).
                           Where({ $_.split(",").Count -eq 11 }, 'Split')
  if ($brokenRows) {
    Set-Content -LiteralPath "$($_.DirectoryName)\$($_.BaseName).txt" -Value $okRows
  } 
}

To address the question implied by your question's title:

  • Unfortunately, as of PowerShell Core 7.2 there is still no direct way to prematurely stop a pipeline on demand:

    • Select-Object -First is capable of that, but it uses a non-public exception type, so the behavior is limited to exiting the pipeline after the first N input objects.

    • GitHub issue #3821 is a long-standing feature request to make this capability generally available.

  • While the break statement is not directly suitable for exiting a pipeline on demand - it looks for an enclosing loop statement to break out of, on the entire call stack, and, if it finds none, terminates execution overall - you can make it work with a dummy loop.

    • See this answer for more information.

    • Quick example:

      # Use of the dummy `do { ... } while ($false)` loop enables `break`
      # to exit the pipeline.
      do { 1..10 | ForEach-Object { $i=0 } { $_; if (++$i -eq 2) { break } } } while ($false)
      
  • The caveat with both Select-Object -First and break + dummy loop is that they skip the end block (for cmdlets written in PowerShell) / EndProcessing() method (for binary cmdlets) of all upstream cmdlets, which may cause problems.
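The caveat in the last bullet can be observed with a small sketch. `Test-Upstream` and `$script:log` are hypothetical names introduced only for this illustration; the function records which of its blocks run when a downstream `Select-Object -First` stops the pipeline early.

```powershell
# Hypothetical helper: an advanced function that logs its process and end blocks.
function Test-Upstream {
    [CmdletBinding()]
    param([Parameter(ValueFromPipeline)] $InputObject)
    begin   { $script:log = [System.Collections.Generic.List[string]]::new() }
    process { $script:log.Add("process $InputObject"); $InputObject }
    end     { $script:log.Add('end') }
}

# Select-Object -First 2 stops the pipeline after the second object arrives.
$result = 1..10 | Test-Upstream | Select-Object -First 2

$result        # the two objects that made it through
$script:log    # which blocks of Test-Upstream actually ran
```

If the log shows no `end` entry, any cleanup placed in an upstream `end` block was skipped, which is the problem described above.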

Upvotes: 2

immobile2

Reputation: 559

Tl;dr - here's my updated script:

$ParentPath = "C:\Users\<username>\Documents\Temporary\*"
$Files = Get-ChildItem -Path $ParentPath -Include *.csv
foreach ($File in $Files){
    $OldPath = $File | %{$_.FullName}
    Get-Content $OldPath | ?{!($_.split(",").Count -eq 11)} | Select -First 1 | %{[bool]$NeedsFix = 1}
    If($NeedsFix){Get-Content $OldPath | ?{($_.split(",").Count -eq 11)} | Set-Content -Path "$(Split-Path $OldPath)\$($File.BaseName).txt"}
    $NeedsFix = 0
}

To get here, let's summarize a few answers from the related question:

  • Stoffi does an incredible job of outlining the different outputs to expect from using break, continue, and return in a ForEach-Object loop and the ForEach method. Also see MS-ScriptingGuy - but this doesn't solve my conundrum
  • The first viable answer comes in the form of throwing and catching an exception. Those terms are currently foreign to me, but you might dig it!
  • Another viable option comes in the form of limiting your loops with a Where-Object, pretty slick! (See answers from ThePennyDrops, Rikki, Eddi Kumar if that floats your boat)
  • Alex Hague's solution I haven't tested, but using Labels with the break keyword seems like it might work for both ForEach-Object loops and ForEach loops
  • The pièce de résistance: what if you just tell your pipeline to stop the ForEach-Object loop once it has selected the first instance you're looking for? Sound too simple? Maybe... but it seems to work!
    • Credit here goes to @zett42's incredibly simple solution that has far too few up votes, is far too easy to implement into the pipeline, and has the potential to be far more flexible than many of the other answers/solutions.
      • Need to check for 6 occurrences of problem rows before quitting the loop? How about using -First 6? Sure beats creating a counter variable.
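For readers to whom throwing and catching an exception is equally foreign, here is a minimal sketch of that option. It is not taken from the linked answers: the sample `$lines`, the `$expectedColumns` value, and the `'StopPipeline'` sentinel message are all assumptions made for illustration.

```powershell
# Hypothetical sample data standing in for the lines returned by Get-Content.
$lines = 'a,b,c', 'a,b', 'a,b,c', 'a'
$expectedColumns = 3    # assumed column count for this sample
$needsFix = $false
$processed = 0

try {
    $lines | ForEach-Object {
        $processed++
        if ($_.Split(',').Count -ne $expectedColumns) {
            $needsFix = $true
            # Throwing ends the whole pipeline; the catch block lets the script continue.
            throw 'StopPipeline'
        }
    }
}
catch {
    # Re-throw anything other than our own sentinel so real errors are not swallowed.
    if ($_.Exception.Message -ne 'StopPipeline') { throw }
}

"Needs fix: $needsFix (rows examined: $processed)"
```

The script continues after the catch block, so `$needsFix` can drive the same kind of `If` check used in the updated script. Like the other early-exit techniques, this abandons the pipeline mid-stream rather than letting it finish normally.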

Upvotes: 1
