Reputation: 874
Following the UTF-8 Everywhere manifesto, mostly its section on how to do text on Windows, I created this simple example with wxWidgets. I wanted wxWidgets to interpret string literals as UTF-8 strings, but it seems that the library misinterprets them.
Single source file, main.cpp, encoded as UTF-8 without signature (in MSVC terminology):
#include <wx/wx.h>

class Mainw: public wxFrame
{
public:
    Mainw(wxWindow * parent, wxWindowID wxId, const wxString & label)
        : wxFrame(parent, wxId, label)
    {
        wxBoxSizer * sizer = new wxBoxSizer(wxHORIZONTAL);
        sizer->Add(new wxTextCtrl(this, wxID_ANY, wxT("Кириллица")), 1, wxEXPAND | wxALL, 5);
        this->SetSizer(sizer);
    }
};

class MyApp: public wxApp
{
public:
    bool OnInit()
    {
        Mainw *f = new Mainw(NULL, wxID_ANY, wxT("Frame"));
        f->Show();
        return true;
    }
};

IMPLEMENT_APP(MyApp)
Preprocessor Definitions:
UNICODE
_UNICODE
WIN32
__WXMSW__
_WINDOWS
_DEBUG
__WXDEBUG__
wxUSE_UNICODE=1
WXUSINGDLL=1
Linked with the wxWidgets library, version 3.0.2
Headers - http://sourceforge.net/projects/wxwindows/files/3.0.2/wxWidgets-3.0.2_headers.7z/download
Binaries - http://sourceforge.net/projects/wxwindows/files/3.0.2/binaries/wxMSW-3.0.2_vc90_Dev.7z/download
When run, this example produces a window with the text Кириллица instead of Кириллица (there was something similar, but it changed to this when I tried to select it to copy here). This means that wxWidgets fails to interpret my string literal as UTF-8 and interprets it as something else, perhaps as text in the system encoding, which is windows-1251.
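The windows-1251 guess can be checked without wxWidgets. Below is a minimal, self-contained sketch that takes the UTF-8 bytes of the literal and decodes every byte as a windows-1251 character. The names ReadAsCp1251 and AppendUtf8 are made up for this illustration and are not part of any library; the table covers only the standard windows-1251 layout.

```cpp
#include <cstdint>
#include <string>

// Windows-1251 code points for bytes 0x80..0xBF (0x98 is unassigned, mapped to U+FFFD).
// Bytes 0xC0..0xFF map linearly to U+0410..U+044F and are handled in code.
static const std::uint16_t kCp1251High[64] = {
    0x0402, 0x0403, 0x201A, 0x0453, 0x201E, 0x2026, 0x2020, 0x2021,
    0x20AC, 0x2030, 0x0409, 0x2039, 0x040A, 0x040C, 0x040B, 0x040F,
    0x0452, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0xFFFD, 0x2122, 0x0459, 0x203A, 0x045A, 0x045C, 0x045B, 0x045F,
    0x00A0, 0x040E, 0x045E, 0x0408, 0x00A4, 0x0490, 0x00A6, 0x00A7,
    0x0401, 0x00A9, 0x0404, 0x00AB, 0x00AC, 0x00AD, 0x00AE, 0x0407,
    0x00B0, 0x00B1, 0x0406, 0x0456, 0x0491, 0x00B5, 0x00B6, 0x00B7,
    0x0451, 0x2116, 0x0454, 0x00BB, 0x0458, 0x0405, 0x0455, 0x0457
};

// Appends one code point from the Basic Multilingual Plane to `out` as UTF-8.
static void AppendUtf8(std::uint32_t cp, std::string& out)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Treats every input byte as one windows-1251 character and returns the
// resulting text re-encoded as UTF-8, i.e. what a cp1251 reader would "see".
std::string ReadAsCp1251(const std::string& bytes)
{
    std::string out;
    for (unsigned char b : bytes) {
        std::uint32_t cp;
        if (b < 0x80) {
            cp = b;                        // ASCII is identical in both encodings
        } else if (b < 0xC0) {
            cp = kCp1251High[b - 0x80];    // irregular part of the code page
        } else {
            cp = 0x0410 + (b - 0xC0);      // Cyrillic letters from U+0410 to U+044F
        }
        AppendUtf8(cp, out);
    }
    return out;
}
```

Feeding it the 18 UTF-8 bytes of the literal, written as "\xD0\x9A\xD0\xB8\xD1\x80\xD0\xB8\xD0\xBB\xD0\xBB\xD0\xB8\xD1\x86\xD0\xB0", returns the same Кириллица text that shows up in the window, which is consistent with the bytes being read as windows-1251 somewhere between the source file and the control.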
Is there any way to change this behavior of the library to match the UTF-8 Everywhere manifesto?
I gave up. I managed to build the library with MSVC and the wxUSE_UNICODE_UTF8 flag, but it would not help without some complex changes to the library's configuration headers. It seems that this option is POSIX-only.
Upvotes: 1
Views: 618
Reputation: 22688
Is there any way to change this behavior of the library to match the UTF-8 Everywhere manifesto?
No, not under Windows, because Windows doesn't support UTF-8 locales (in principle, they could be emulated by the CRT, but AFAIK no compiler does it), and the wxString(const char*) ctor interprets the string in the current locale encoding by default.
There are two simple solutions, however:
1. Use wxString::FromUTF8() explicitly.
2. Use the wxString(const wchar_t*) ctor with an L"..." wide char argument.
Just for completeness, you could also force the library into accepting UTF-8 narrow text by rebuilding it with wxUSE_UTF8_LOCALE_ONLY=1, but I'm far from sure this is going to work: the CRT locale will still be different, so using non-ASCII characters with any CRT functions will most likely not work as expected. I definitely do not recommend doing this unless you're just curious to see what happens.
Upvotes: 2