Reputation: 11
I'm currently using VS Code to write a PowerShell script. As part of this script, a regex is used to replace/remove an atypical character that ends up in the data fairly often and causes trouble down the line. The character is ’ (U+2019), and when the script is opened in VS Code it is permanently replaced with � (U+FFFD).
Thus the line:
$user.Name = $user.Name -Replace "'|\’|\(|\)|\s+",""
permanently becomes:
$user.Name = $user.Name -Replace "'|\�|\(|\)|\s+",""
until it is manually changed. Since I can paste the U+2019 character back in once the file is open and then run the code, I assume that VS Code can interpret it okay and that the problem is with loading the file. Is there some option I can set to stop the character being replaced when I open the file?
Upvotes: 1
Views: 8359
Reputation: 101
In my case, turning on the VS Code setting "Files: Auto Guess Encoding" ("files.autoGuessEncoding": true in settings.json) fixed the problem, both for reading and saving.
Upvotes: 3
Reputation: 27433
If I save in VS Code with the Windows 1252 encoding, I see the character "’" change to "�" on the next opening. I think the problem is that VS Code doesn't detect Windows 1252 and opens the file as UTF-8 instead. If you reopen the file with the Windows 1252 encoding, it displays correctly. The other encodings, including UTF-8 without a BOM, work fine and display the character.
Even PowerShell 5 doesn't have this problem with Windows 1252, only VS Code. Set-Content and Get-Content in PowerShell 5 default to Windows 1252:
"’" | set-content file
get-content file
’
PowerShell 7, which defaults to UTF-8, would actually have the same problem reading that file:
get-content file
�
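The mismatch can also be reproduced at the byte level, without any files. This is a minimal sketch, assuming the standard Windows-1252 mapping of U+2019 to the single byte 0x92 and assuming code page 1252 is available in the session (it is in Windows PowerShell); the variable names are illustrative only.

```powershell
# Windows-1252 stores U+2019 (right single quotation mark) as the single byte 0x92.
[byte[]]$bytes = 0x92

# A lone 0x92 is not a valid UTF-8 sequence, so a UTF-8 decoder
# substitutes the replacement character U+FFFD.
$asUtf8 = [System.Text.Encoding]::UTF8.GetString($bytes)
'U+{0:X4}' -f [int][char]$asUtf8      # U+FFFD

# Decoding the same byte as Windows-1252 recovers the original character.
# Assumption: code page 1252 is available in this session.
$asCp1252 = [System.Text.Encoding]::GetEncoding(1252).GetString($bytes)
'U+{0:X4}' -f [int][char]$asCp1252    # U+2019
```

The bytes on disk are the same in both cases; only the encoding assumed by the reader differs.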
Upvotes: 0
Reputation: 13453
This looks like it all comes down to encoding. Visual Studio Code by default uses UTF-8 and can in general handle saving/viewing Unicode properly.
If the issue is on opening the file, then Visual Studio Code is misinterpreting the file's encoding when it reads it. You can change the encoding VS Code uses (see "Configuring VS Code encoding") to a specific one (e.g. UTF-8, UTF-8 with BOM, UTF-16 LE, etc.) by changing the "files.encoding" setting:
"files.encoding": "utf8bom"
If the issue is on saving the file, then it is being saved in an ANSI code page such as Windows-1252 (often loosely called "ASCII") and not as proper UTF-8 or equivalent. The character is then written as a byte that is not valid UTF-8, so the next time the file is opened as UTF-8 it is displayed as the Replacement Character (U+FFFD).
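If a script has already been saved as Windows-1252, one possible repair is to re-encode it once as UTF-8 with a BOM. The sketch below is only an illustration: the function name Convert-Cp1252ToUtf8Bom is made up here, and it assumes the file really is Windows-1252 and that code page 1252 is available in the session.

```powershell
# Hypothetical helper: re-encode a Windows-1252 file as UTF-8 with a BOM, in place.
function Convert-Cp1252ToUtf8Bom {
    param([Parameter(Mandatory)][string]$Path)

    # .NET file APIs resolve relative paths against the process directory,
    # not the PowerShell location, so resolve to a full path first.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    # Decode the existing bytes as Windows-1252 (assumed current encoding).
    $cp1252 = [System.Text.Encoding]::GetEncoding(1252)
    $text = [System.IO.File]::ReadAllText($fullPath, $cp1252)

    # Write the same text back as UTF-8 with a byte order mark.
    $utf8Bom = New-Object System.Text.UTF8Encoding $true
    [System.IO.File]::WriteAllText($fullPath, $text, $utf8Bom)
}
```

Usage would look like Convert-Cp1252ToUtf8Bom -Path .\MyScript.ps1, where the file name is a placeholder. Keep a backup and run it only once per file: applying it to a file that is already UTF-8 without a BOM would garble any non-ASCII characters.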
Note: The default encoding used by Windows PowerShell v5.1 is the system's ANSI code page (Windows-1252 on many Western-locale systems), which may be why saving scripts with special characters does not work. PowerShell Core v6+ uses UTF-8 by default.
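Because the default differs between PowerShell versions, another option is to sidestep the encoding question by keeping the script ASCII-only. The sketch below writes the character from the question as the .NET regex escape \u2019 instead of the literal glyph; the sample value in $name is made up for illustration.

```powershell
# Hypothetical sample value containing U+2019, built without typing the character.
$name = "O" + [char]0x2019 + "Brien (test)  user"

# \u2019 is interpreted by the .NET regex engine, not by PowerShell,
# so the script file itself contains only ASCII characters.
$clean = $name -replace "'|\u2019|\(|\)|\s+", ""
$clean    # OBrientestuser
```

With no non-ASCII bytes in the file, UTF-8 and Windows-1252 readers both see the same script text, so the pattern survives whichever encoding the editor or PowerShell assumes.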
Upvotes: 3