Henrik Heimbuerger

Reputation: 10213

Setting the source file character encoding for Mono's xbuild

I'm generating C# source code which is being built by both VS2010 and Mono's xbuild (2.10.2.0). This generally works very well. I've only had a single compatibility issue so far, and in that case I was using a 'feature' that is clearly specified as undefined behaviour (so mea culpa).

Now I'm running into an issue where I have special characters in a string literal in the C# source code. I'm generating the source files in UTF-8, and the character I'm testing with is a German sharp s (ß), which is 0xC39F in the source file. The compiled program writes this string to a file in latin1, where it ends up as 0xDF when the executable is built with VS (that's the one I want) and as 0xC33F when built with xbuild.

It does not seem to matter whether I run the executable with the .NET or with the Mono CLR, as far as I can see.

My current suspicion is that xbuild is not reading the source code as UTF-8, so the compiled code already has the wrong character in the string literal. Is there a way to explicitly tell it to? I couldn't find anything in the output of xbuild /? and the xbuild documentation isn't particularly comprehensive. If I just missed the page where this is documented, a link is sufficient, of course.

All experiments have been performed on Win7 x64.

EDIT 1: To clarify, I've used a hex editor to confirm that the character in the source code file is really 0xC39F, the character written when compiled with VS2010 is 0xDF and the character written when compiled with xbuild is 0xC33F.

Upvotes: 1

Views: 2634

Answers (1)

jstedfast

Reputation: 38653

You'll need to modify the .csproj file(s) and add a <CodePage> element to the <PropertyGroup> section.

You should be able to use Visual Studio or MonoDevelop to do this for you, as well.

In MonoDevelop, if you right-click on a project and select the "Options" menu item, you can then go to the Build/General section and there will be a "Compiler Code Page" field which you can use to select "UTF-8".

FWIW, this is what MD outputs when I select UTF-8:

<CodePage>65001</CodePage>

So you can just copy/paste that into the <PropertyGroup> section of your .csproj file.
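As a rough sketch of where the element ends up, assuming a standard VS2010-style project file (the ToolsVersion and DefaultTargets attributes below are the usual VS2010 defaults and are an assumption about your project, not something taken from it), the relevant part of the .csproj would look something like this:

```xml
<Project ToolsVersion="4.0" DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <!-- 65001 is the Windows code page identifier for UTF-8 -->
    <CodePage>65001</CodePage>
  </PropertyGroup>
  <!-- the rest of the project (other properties, references, Compile items) stays unchanged -->
</Project>
```

Putting it in a <PropertyGroup> without a Condition attribute should make it apply to every build configuration. If you only add it to a conditional group (e.g. the Debug one), other configurations would presumably still fall back to the compiler's default encoding.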

Upvotes: 2
