Reputation: 6863
I have a .Net Core (C#) project with the following line in one of the classes:
var input = "£";
But when I do a git clone in a Docker container (microsoft/dotnet:2.2-sdk
) it messes it up and displays it as �
(in bash
using cat
).
And when I run it, its Utf-8
bytes are [239, 191, 189] = [EF, BF, BD]
which seem to be a so-called Unicode replacement character.
Windows editor that I use is VS 2017, but character is displayed properly on other windows machines and parsed properly by dotnet run/test
command, so I don't think this is a problem of failing to save the character incorrectly.
Any ideas why I am seeing such a mess and how to solve it?
Some details
Encoding.UTF8.GetBytes("£");
Windows 10
machineDebian GNU/Linux 9 (stretch)
from the cat /etc/os-release
locale -a
returns C
C.UTF-8
POSIX
Running fgrep 'var input' file.cs | od -tx1 -c
0000100 76 61 72 20 69 6e 70 75 74 20 3d 20 22 a3 22 3b
v a r i n p u t = " 243 " ;
Upvotes: 0
Views: 314
Reputation: 9895
Your file contains a single byte a3
which corresponds to the Windows-1252 encoding for the character £
. Your Linux system displays �
because it is not a valid UTF-8 encoding.
You should configure Visual Studio to use UTF-8 instead of Windows-1252.
Upvotes: 1