whtthps

Reputation: 41

PowerShell mangling ASCII text

I'm getting extra characters and lines when trying to modify hosts files. For example, this Select-String call does not filter anything out, yet the input and output files are different:

get-content -Encoding ascii C:\Windows\system32\drivers\etc\hosts |
  select-string -Encoding ascii -notmatch "thereisnolinelikethis" |
  out-file -Encoding ascii c:\temp\testfile

PS C:\temp> (get-filehash C:\windows\system32\drivers\etc\hosts).hash
C54C246D2941F02083B85CE2774D271BD574F905BABE030CC1BB41A479A9420E

PS C:\temp> (Get-FileHash C:\temp\testfile).hash
AC6A1134C0892AD3C5530E58759A09C73D8E0E818EC867C9203B9B54E4B83566

Upvotes: 4

Views: 722

Answers (3)

Keith Hill

Reputation: 201612

I think this is more of an issue with PowerShell's F&O (formatting & output) engine. Keep in mind that Select-String outputs a rich object called MatchInfo. When that object reaches the end of the pipeline, it needs to be rendered to a string. I think it is that rendering/formatting that injects the extra lines. One of the properties on MatchInfo is Line, the line that was matched (or not matched). If you pass just the Line property down the pipe, it seems to work better (hashes match):

Get-Content C:\Windows\system32\drivers\etc\hosts |
    Select-String -notmatch "thereisnolinelikethis" |
    Foreach {$_.Line} |
    Out-File -Encoding ascii c:\temp\testfile

BTW you only need to specify ASCII encoding when outputting back to the file. Everywhere else in PowerShell, just let the string flow as Unicode.

All that said, I would use Where-Object instead of Select-String for this scenario. Where-Object is a filtering command which is what you want. Select-String takes input of one form (string) and converts it to a different object (MatchInfo).
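
To see the difference for yourself, here is a minimal sketch. The two sample lines are made up for illustration and stand in for the hosts file contents. It sends the same strings through Select-String and through Where-Object, so the formatted rendering can be compared with the plain strings:

```powershell
# Hypothetical sample input standing in for the hosts file contents.
$lines = '127.0.0.1 localhost', '::1 localhost'

# Select-String emits MatchInfo objects, not strings.
$matchInfos = $lines | Select-String -NotMatch 'thereisnolinelikethis'
$matchInfos[0].GetType().FullName   # Microsoft.PowerShell.Commands.MatchInfo

# Rendering the objects to text goes through the formatting engine.
# Inspect $formatted to see what would end up in the file.
$formatted = $matchInfos | Out-String

# Projecting the Line property yields the original strings again.
$plain = $matchInfos | ForEach-Object { $_.Line }

# Where-Object only filters; the strings never change type.
$filtered = $lines | Where-Object { $_ -notmatch 'thereisnolinelikethis' }
```

Comparing `$formatted` with `$plain -join "`r`n"` should show whatever the formatter adds on your PowerShell version.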

Upvotes: 2

Mathias R. Jessen

Reputation: 174445

Out-File adds a trailing newline ("`r`n") to testfile.

C:\Windows\System32\drivers\etc\hosts does not contain a trailing newline out of the box, which is why you get a different file hash.

If you open the files with a StreamReader, you'll see that the underlying stream differs in length (due to the trailing newline in the new file):

PS C:\> $Hosts = [System.IO.StreamReader]"C:\Windows\System32\drivers\etc\hosts"
PS C:\> $Tests = [System.IO.StreamReader]"C:\temp\testfile"
PS C:\> $Hosts.BaseStream.Length
822
PS C:\> $Tests.BaseStream.Length
824
PS C:\> $Tests.BaseStream.Position = 822; $Tests.Read(); $Tests.Read()
13
10

ASCII characters 13 (0x0D) and 10 (0x0A) are CR+LF, which is the value of [System.Environment]::NewLine on Windows.
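
If the goal is a byte-identical copy, one possible approach is to write the text yourself so that a trailing newline is added only when the source had one. The sketch below is an illustration, not a drop-in fix: the function names `Test-TrailingNewline` and `Copy-FilteredLines` are hypothetical, and it assumes the file uses CR+LF line endings and plain ASCII content.

```powershell
# Hypothetical helper: $true if the file's last two bytes are CR (13), LF (10).
function Test-TrailingNewline {
    param([Parameter(Mandatory)][string]$Path)
    $bytes = [System.IO.File]::ReadAllBytes($Path)
    return ($bytes.Length -ge 2 -and $bytes[-2] -eq 13 -and $bytes[-1] -eq 10)
}

# Hypothetical helper: copies the lines that do NOT match $Pattern and keeps
# the source file's trailing-newline state instead of always appending one.
function Copy-FilteredLines {
    param(
        [Parameter(Mandatory)][string]$Source,
        [Parameter(Mandatory)][string]$Target,
        [Parameter(Mandatory)][string]$Pattern
    )
    $kept = @(Get-Content -Path $Source | Where-Object { $_ -notmatch $Pattern })
    # Join manually so nothing is appended after the last line.
    $text = $kept -join "`r`n"
    if (Test-TrailingNewline -Path $Source) { $text += "`r`n" }
    [System.IO.File]::WriteAllText($Target, $text, [System.Text.Encoding]::ASCII)
}

# Example call (use full paths, since the .NET file methods resolve relative
# paths against the process working directory, not the PowerShell location):
# Copy-FilteredLines -Source 'C:\Windows\System32\drivers\etc\hosts' `
#     -Target 'C:\temp\testfile' -Pattern 'thereisnolinelikethis'
```

When no line matches the pattern and the assumptions above hold, Get-FileHash should then report the same hash for both files.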

Upvotes: 0

Ashigore

Reputation: 4678

I can confirm that your commands do inexplicably result in extra line breaks in the output file, at the start and at the end. PowerShell also converts the tabs in the original file into four spaces.

While I cannot explain why, these commands do the same thing without those issues:

Get-Content -Path C:\Windows\System32\drivers\etc\hosts -Encoding Ascii | 
  Where-Object { -not $_.Contains("thereisnolinelikethis")  } |
  Out-File -FilePath "c:\temp\testfile" -Encoding Ascii

Upvotes: 2
