Hinton

Reputation: 91

Remove blank lines after specific text (without using -notmatch)

We have a script that uses a function to go through a text file and replace certain words, either with other words or with nothing. Where a whole line is replaced with nothing, a blank line is left behind, and we need to remove those in some cases (but not all). I've seen suggestions to use something like -notmatch to copy everything except the unwanted lines to a new file, but the file also contains a lot of blank lines that we want left in place.

For example:

StrangerThings: A Netflix show
'blank line to be removed'
'blank line to be removed'
Cast: Some actors
Crew: hard-working people
'blank line left in place'
KeyGrip
'blank line to be removed'
Gaffer
'blank line left in place'

So that it comes out like this:

StrangerThings: A Netflix show
Cast: Some actors
Crew: hard-working people

KeyGrip
Gaffer

We've tried doing a -replace, but that doesn't get rid of the blank line. Additionally, we have to key off of the text to the left of the ":" in each line. The data to the right in most cases is dynamic, so we can't hard-code anything in for that.

function FormatData {

    #FUNCTION FORMATS DATA BASED ON SECTIONS

    #This is where we're replacing some words in the different sections
    #Some of these we replace leave the blank lines behind
    $data[$section[0]..$section[1]] -replace $oldword,$newword

    $output | Set-Content $outputFile
}

$oldword = "oldword"
$newword = "newword"
FormatData

$oldword = "oldword1"
$newword = "" #leaves a blank line
FormatData

$oldword = "Some phrase: "
$newword = "" #leaves a blank line
FormatData

We just need a pointer in the right direction on how to delete/remove a blank line (or several lines) after specific text, please.

Upvotes: 0

Views: 446

Answers (2)

AdminOfThings

Reputation: 25001

Since it looks like you are reading the file into an array and doing replacements on its elements, the array element does not go away. You can change its value to an empty string or white space, but it will still appear as a blank line when it is output to a file or the console. Using the -replace operator with no replacement string replaces the regex match with an empty string.

One approach is to read the data with Get-Content -Raw, so the text is read into memory as a single string, but you lose array indexing. At that point, you have full control over replacing newline characters if you choose to do so. A second approach is to mark the blank lines you want to keep first (with <#####> in this example), do the replacements, remove the remaining blank lines, and then clean up the markers.

# Do this before any new word replacements happen. Pass this object into any functions.
$data = $data -replace "^\s*$","<#####>"

$data[$section[0]..$section[1]] -replace $oldword,$newword

($output | Where-Object {$_}) -replace "<#####>" | Set-Content $outputFile

Explanation:

An empty string or $null evaluates to false in a PowerShell boolean conditional. Since the Where-Object script block performs a boolean evaluation, you can simply check the pipeline object ($_): any line that is not null or empty will be true. Note that a line containing only white space is a non-empty string and still evaluates to true, so if a replacement can leave spaces behind, test with something like {$_ -match '\S'} instead.
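For illustration, here is a minimal, self-contained sketch of the marker approach run against a few sample lines. The sample data, the `$marker` variable, and the intermediate variable names are assumptions made for the example; they are not part of the original script. It filters with `-match '\S'` rather than a bare `$_` so that lines left holding only white space are dropped as well.

```powershell
# Sample lines standing in for the file; '' is a blank line that was in the source.
$data = @(
    'StrangerThings: A Netflix show'
    'oldword1'
    'Crew: hard-working people'
    ''
    'KeyGrip'
    'oldword1'
    'Gaffer'
)

$marker = '<#####>'   # assumed not to occur anywhere in the real text

# 1. Tag the blank lines that already exist so they survive the cleanup.
$marked = $data -replace '^\s*$', $marker

# 2. Do the word replacements; emptied lines become '' but stay in the array.
$replaced = $marked -replace 'oldword1', ''

# 3. Drop lines with no non-white-space characters, then turn markers back into blank lines.
$result = @($replaced | Where-Object { $_ -match '\S' }) -replace [regex]::Escape($marker), ''

$result
```

With this sample, the two lines emptied by the replacement are removed, while the blank line that was present in the input is kept between the Crew and KeyGrip lines.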


Below is a trivial example of the behavior:

$arr = "one","two","three"
$arr
one
two
three

$arr -replace "two"
one

three

$arr[1] = "two"
$arr
one
two
three

$arr -replace "two" | Where-Object {$_}
one
three

You can set a particular array value to $null and have it appear to go away. When writing to a file, it will appear as if the line has been removed. However, the array will still have that $null value. So you have to be careful.

$arr[1] = $null
$arr
one
three
$arr.count
3

If you use another collection type that supports resizing, you have the Remove method available. At that point though, you are adding extra logic to handle index removals and can't be enumerating the collection while you are changing its size.
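As a rough sketch of that last option, the lines can be copied into a System.Collections.Generic.List[string] and emptied entries removed by index. This example uses RemoveAt inside a backwards for loop rather than Remove, which is one common way to avoid changing the collection while enumerating it; the sample values are the same ones used above, and `$list` is just an illustrative name.

```powershell
# Copy the lines into a resizable list.
$list = New-Object 'System.Collections.Generic.List[string]'
foreach ($line in 'one', 'two', 'three') { $list.Add($line) }

# Walk backwards so removing an item never shifts an index still to be visited.
for ($i = $list.Count - 1; $i -ge 0; $i--) {
    $list[$i] = $list[$i] -replace 'two', ''
    if ($list[$i] -eq '') { $list.RemoveAt($i) }
}

$list.Count
```

Unlike the fixed-size array, the list really does shrink here: the count drops to 2 and "three" moves to index 1.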

Upvotes: 1

Ben Personick

Reputation: 3264

If all you are doing is parsing a text file:

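function FormatData {
   # Runs once per piped-in string; with -Raw that is the whole file
   process { $_ -replace $oldword,$newword }
}
# -Raw reads the file as one string, so a pattern can match across line endings
$FileContent = Get-Content "C:\TextFile.txt" -Raw
$OutputFile = "C:\TextOutput.txt"

$oldword = "oldword"
$newword = "newword"
$FileContent = $FileContent | FormatData

# (?m) makes ^ match at the start of each line; \r?\n matches a line ending.
# Single-quoted strings do not expand `r`n, so the regex escapes are used instead.
$oldword = '(?m)^(Crew: hard-working people)(\r?\n).*oldword1.*\r?\n'
$newword = '$1$2$2' # Leaves a blank line after Crew: hard-working people
$FileContent = $FileContent | FormatData

$oldword = '(?m)^.*oldword1.*\r?\n'
$newword = '' # Does not leave a blank line
$FileContent = $FileContent | FormatData

# The raw text already carries its own line endings (-NoNewline needs PowerShell 5+)
$FileContent | Set-Content $OutputFile -NoNewline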
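To see why consuming the line ending matters, here is a small sketch that works on an in-memory string instead of a file. The sample text is an assumption standing in for what Get-Content -Raw would return, and `$text` and `$cleaned` are illustrative names. Because the pattern matches the trailing line ending along with the line, the whole line disappears and no blank line is left behind.

```powershell
# Stand-in for: $text = Get-Content 'C:\TextFile.txt' -Raw
$text = "Crew: hard-working people`r`noldword1`r`nKeyGrip`r`nsome oldword1 here`r`nGaffer`r`n"

# (?m) lets ^ match at the start of every line; \r?\n consumes the line ending too,
# so the whole line is removed instead of being replaced by an empty one.
$cleaned = $text -replace '(?m)^.*oldword1.*\r?\n', ''

$cleaned
```

Any blank lines already in the text are untouched, since the pattern only matches lines that contain the search word.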

Upvotes: 0
