Antoine

Reputation: 5255

Why does PowerShell Regex.Replace swallow newlines?

I have a script that does a regex replace on a file. What I don't understand is why the returned string has all its newlines removed.

Sample file content (UTF-8, with CR-LF after each line):

hello
hello
hello

The script:

$content = Get-Content "c:\spikes\regexnewline\regexnewline.txt"
Set-Content "c:\spikes\regexnewline\regexnewline-2.txt" $content # test

$content = [regex]::Replace($content, "ll", "yy") #basic replace

Set-Content "c:\spikes\regexnewline\regexnewline-3.txt" $content

Of course, file regexnewline-2.txt is an exact copy of the input file. But how come regexnewline-3.txt has its content on one line only, with a single CR-LF at the end?

heyyo heyyo heyyo\CR\LF

Obviously I'm missing something here. Can anyone spot it?

BTW, I've tried using the Regex.Replace overload with 4 arguments, specifying RegexOptions, as documented on MSDN, but the script fails saying there's no 4-argument overload for this method. Is PowerShell using a different version of the .NET Framework?

Upvotes: 1

Views: 1049

Answers (1)

Keith Hill

Reputation: 201832

You see this because $content is an array of strings when you first read it from the file. You can check the type of any variable like so:

$content.GetType().FullName

Get-Content by default returns an array of strings where each element represents a line. When you pass that array to .NET's regex replace method, PowerShell doesn't see a method overload that takes a string array but does see one that takes a string, so it coerces your string array into a string. You can see the same effect if you do this right after the Get-Content call:

"$content"

You can even change the separator PowerShell uses to join the individual elements when it does this:

$OFS = ", "
"$content"
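
To see the coercion in isolation, here is a minimal sketch that needs no file. The three-element array is a stand-in I made up for what Get-Content returns on the sample file, and it assumes $OFS has not been set in the session, so the default single-space separator applies:

```powershell
# Hypothetical stand-in for Get-Content output: one array element per line.
$content = @('hello', 'hello', 'hello')

# The variable holds an array, not a single string.
$typeBefore = $content.GetType().FullName

# The static Replace method has no overload that takes an array, so PowerShell
# joins the elements into one string before the call. The line breaks are
# already gone at that point.
$replaced = [regex]::Replace($content, 'll', 'yy')
$typeAfter = $replaced.GetType().FullName

$typeBefore   # System.Object[]
$typeAfter    # System.String
$replaced     # heyyo heyyo heyyo
```

Set-Content then receives a single string, which is why the output file contains only one line.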

Rather than using .NET's Regex.Replace, try PowerShell's -replace operator, which also handles regexes:

$content = $content -replace 'll','yy'
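
As a rough illustration of why this keeps the line breaks: when the left-hand side is an array, -replace works on each element and returns an array, so Set-Content still writes one line per element. The sketch below again uses a made-up array in place of the file. The second half is only an assumption about what you might want if you need RegexOptions: it calls the 4-argument static overload once per line instead of once on the whole array.

```powershell
# Hypothetical stand-in for Get-Content output: one array element per line.
$content = @('hello', 'hello', 'hello')

# -replace is applied to every element; the result is still a 3-element array.
$viaOperator = $content -replace 'll', 'yy'

# Alternative: keep [regex]::Replace, but pass it one line at a time.
# The fourth argument is a RegexOptions value; IgnoreCase is just an example.
$options = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase
$viaMethod = $content | ForEach-Object { [regex]::Replace($_, 'LL', 'yy', $options) }

$viaOperator.Count   # 3
$viaOperator[0]      # heyyo
$viaMethod.Count     # 3
$viaMethod[0]        # heyyo
```

Either result can be passed to Set-Content as-is, since both are arrays with one element per original line.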

Upvotes: 8
