Kaptcrunch

Reputation: 5

Replacing HTML in Document with Regex not working

My script reads an HTML file line by line, looking for lines that match a regex so it can make the needed changes. For some reason, when it reaches the first change it does not make it, even though in testing it does drop into the if statement.

Below is both the PowerShell script and the file section that should be changed.

$sig_regex = [regex]::Escape('241')
$sig_regex2 = [regex]::Escape('West')
$replace_1 = "PO"
$replace_2 = "Box 4816  Syracuse, New York  13221"
$new_html = @()

Get-Content $Path | foreach {
    $_

    #This is the section that should be replacing the line
    if ($_ -like $sig_regex) {
        $new_html += ($_ -replace $sig_regex, $replace_1)
    }

    #Replace content in line 2 of the address section (West)
    if ($_ -match $sig_regex2) {
        $new_html += ($_ -replace $sig_regex2, $replace_2)
    } else {
        #Stores any content that should not be changed into the new file
        $new_html += $_
    }
}

$new_html | Set-Content "C:\Newhtml.htm"

HTML:

<p class=MsoNormal style='line-height:150%;text-autospace:none'><span
style='font-size:9.0pt;line-height:150%;font-family:TeXGyreAdventor color:#002C5B'>241
West<o:p></o:p></span></p>

Upvotes: 0

Views: 59

Answers (2)

HeedfulCrayon

Reputation: 857

You could try this; it reads the file with the .NET System.IO.File class. I would also skip the regex escaping for something this simple and just use the literal strings. A real regex pattern is worth it when the text you are looking for changes from time to time but still follows a consistent format.

$sig_regex = '241'
$sig_regex2 = 'West'
$replace_1 = "PO"
$replace_2 = "Box 4816  Syracuse, New York  13221"
$new_html = @()

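# Read the file line by line with a .NET StreamReader
$file = [System.IO.File]::OpenText($Path)
while (!$file.EndOfStream) {
    $text = $file.ReadLine()
    if ($text -match $sig_regex) {
        $new_html += ($text -replace $sig_regex, $replace_1)
    }
    elseif ($text -match $sig_regex2) {
        $new_html += ($text -replace $sig_regex2, $replace_2)
    }
    else {
        # Keep any line that should not be changed
        $new_html += $text
    }
}
# Release the file handle once reading is done
$file.Close()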

$new_html | Set-Content "C:\Newhtml.htm"
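
As a possible simplification (a sketch only, not the original script): -replace returns its input unchanged when nothing matches, so the if/elseif/else chain can be collapsed into two chained -replace calls. The $lines array below is made-up sample data modeled on the HTML in the question; in the real script the lines would come from the file.

```powershell
# Hypothetical sample lines standing in for the file contents.
$lines = @(
    "style='font-size:9.0pt'>241",
    "West<o:p></o:p></span></p>",
    "<p>unchanged</p>"
)

# -replace leaves a string untouched when the pattern does not match,
# so every line can go through both replacements without any if/else.
$new_html = @(foreach ($text in $lines) {
    $text -replace '241', 'PO' -replace 'West', 'Box 4816  Syracuse, New York  13221'
})
```

This also avoids growing the array with += on every line. Note that, unlike the elseif version, a line containing both strings would get both replacements.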

Upvotes: 0

Mike Shepard

Reputation: 18166

-Like is not a regular expression operator, but a "wildcard" operator (think * and ?).

You want to use -Match instead.
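
A small sketch of the difference, using a made-up sample line modeled on the HTML in the question (the variable names are illustrative only):

```powershell
# Hypothetical sample line, modeled on the question's HTML.
$line = "color:#002C5B'>241"
$pattern = [regex]::Escape('241')   # '241' has no special characters, so this is still '241'

# -like compares the WHOLE string against a wildcard pattern.
$likeExact    = $line -like $pattern        # False: the line is not exactly '241'
$likeWildcard = $line -like "*$pattern*"    # True: wildcards allow text on either side

# -match looks for the regex anywhere in the string.
$matchResult = $line -match $pattern        # True

# -replace is regex-based as well and returns the modified string.
$replaced = $line -replace $pattern, 'PO'   # color:#002C5B'>PO
```

So with -like and no wildcards, the condition is only true for a line that is exactly 241, which is why the first replacement never happens on the real HTML.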

Upvotes: 1
