Reputation: 5255
I have this script that does a regex replace on a file. What I don't understand is why the returned string has all its newlines removed.
Sample file content (UTF-8, with CR-LF after each line):
hello
hello
hello
The script:
$content = Get-Content "c:\spikes\regexnewline\regexnewline.txt"
Set-Content "c:\spikes\regexnewline\regexnewline-2.txt" $content # test
$content = [regex]::Replace($content, "ll", "yy") #basic replace
Set-Content "c:\spikes\regexnewline\regexnewline-3.txt" $content
Of course, file regexnewline-2.txt is an exact copy of the input file. But how come regexnewline-3.txt has its content on one line only, with a single CR-LF at the end?
heyyo heyyo heyyo\CR\LF
Obviously I'm missing something here. Can anyone spot it?
BTW, I've tried to use the regex.Replace overload with 4 arguments, specifying RegexOptions, as documented on MSDN, but the script fails saying there's no 4-argument overload for this method. Is PowerShell using a different version of the .NET Framework?
Upvotes: 1
Views: 1049
Reputation: 201832
You see this because $content is an array of strings when you originally read it from the file. You can see the type of any variable like so:
$content.GetType().FullName
Get-Content by default returns an array of strings where each element represents a line. When you pass that array to .NET's regex replace method, PowerShell doesn't see a method overload that takes a string array but does see one that takes a string, so it coerces your string array into a string. You can see the same effect if you do this right after the Get-Content call:
"$content"
You can even change the separator PowerShell uses when it joins the individual elements (a single space by default) by setting $OFS:
$OFS = ", "
"$content"
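To see the joining behaviour without touching any files, here is a minimal sketch that uses an in-memory array as a stand-in for what Get-Content returns. The variable names and sample data are only for illustration.

```powershell
# Stand-in for the Get-Content result: one string per line (illustrative data).
$lines = @('hello', 'hello', 'hello')

# Expanding the array inside a string joins the elements with $OFS,
# which behaves as a single space when the variable is not set.
$joinedDefault = "$lines"

# Passing the array where .NET expects a single string triggers the same
# kind of coercion, so the regex only ever sees one space-joined line.
$replaced = [regex]::Replace($lines, 'll', 'yy')

# Setting $OFS changes the separator used for string expansion.
$OFS = ', '
$joinedComma = "$lines"

# Remove the variable again so later string conversions use the default.
Remove-Variable OFS
```

Here $joinedDefault should be "hello hello hello" and $replaced should be "heyyo heyyo heyyo", which matches the single line that ends up in regexnewline-3.txt.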
Rather than use the .NET regex replace method, try PowerShell's -replace operator, which also handles regexes:
$content = $content -replace 'll','yy'
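As a rough end-to-end check, the sketch below runs the same replace against a throwaway file. The temp-file names are hypothetical and only stand in for the paths in the question.

```powershell
# Hypothetical temp paths, used only for this illustration.
$inPath  = Join-Path ([IO.Path]::GetTempPath()) 'regexnewline-demo.txt'
$outPath = Join-Path ([IO.Path]::GetTempPath()) 'regexnewline-demo-out.txt'

# Create a three-line input file like the one in the question.
Set-Content $inPath 'hello', 'hello', 'hello'

# Get-Content returns one array element per line.
$content = Get-Content $inPath

# -replace is applied to each element in turn, so the result is still an array.
$content = $content -replace 'll', 'yy'

# Set-Content writes each array element on its own line.
Set-Content $outPath $content

# Read the result back to confirm the line structure survived.
$result = Get-Content $outPath

# Clean up the temporary files.
Remove-Item $inPath, $outPath
```

Because the array is never collapsed into a single string, the output file keeps its three lines. Note that -replace is case-insensitive by default; -creplace is the case-sensitive variant.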
Upvotes: 8