Reputation: 71
Reduced problem code:
#include <algorithm>
#include <iostream>
#include <string>
#include <string_view>
#include <vector>
#include <boost/locale.hpp>
std::string fold_case_nfkc(std::string_view str)
{
    return boost::locale::normalize(boost::locale::fold_case(std::string(str)), boost::locale::norm_nfkc);
}
std::string normalize_nfkc(std::string_view str)
{
    return boost::locale::normalize(std::string(str), boost::locale::norm_nfkc);
}
std::string fold_case_nfc(std::string_view str)
{
    return boost::locale::normalize(boost::locale::fold_case(std::string(str)), boost::locale::norm_nfc);
}
std::string normalize_nfc(std::string_view str)
{
    return boost::locale::normalize(std::string(str), boost::locale::norm_nfc);
}
bool same_text(std::string_view left_, std::string_view right_)
{
    auto left{ fold_case_nfkc(left_) };
    auto right{ fold_case_nfkc(right_) };
    return left.compare(right) == 0;
}
int main()
{
    // List the available backends on one line, then select icu.
    auto lbm = boost::locale::localization_backend_manager::global();
    auto s = lbm.get_all_backends();
    std::cout << "backends:";
    std::for_each(s.begin(), s.end(), [](const std::string& x){ std::cout << ' ' << x; });
    std::cout << std::endl;
    lbm.select("icu");
    boost::locale::localization_backend_manager::global(lbm);
    // Generate the system default locale and install it globally.
    boost::locale::generator g;
    std::locale::global(g(""));
    auto test = u8"#접시가숟가락으로도망쳤다";
    std::cout << "input: " << test << std::endl;
    std::cout << "fold_case_nfkc: " << fold_case_nfkc(test) << std::endl;
    std::cout << "normalize_nfc: " << normalize_nfc(test) << std::endl;
    return 0;
}
The expected output is:
backends: icu posix std
input: #접시가숟가락으로도망쳤다
fold_case_nfkc: #접시가숟가락으로도망쳤다
normalize_nfc: #접시가숟가락으로도망쳤다
The output I actually get when icu is the locale backend:
rmorales2005@tillie:~ % clang++ -o test locale_test.cpp -std=c++17 -I/usr/local/include -L/usr/local/lib -lboost_locale
rmorales2005@tillie:~ % ./test
backends: icu posix std
input: #접시가숟가락으로도망쳤다
fold_case_nfkc: #
normalize_nfc: #
(This system is FreeBSD 12.1 with clang version 8.0.1; boost-libs is installed via ports.)
If I use posix, or run the program on Windows, I get the expected output. But for posix, this is only because it doesn't even support normalization.
How do I get this code to work with icu as the backend?
Upvotes: 1
Views: 251
Reputation: 393134
It could be your terminal emulation playing tricks. On my Linux box, I get similar output when running from inside Vim, but piping the output through e.g. od or xxd shows that the bytes are there, and when I redirect it to a file, it shows up correctly in an editor.
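To take the terminal out of the picture without leaving the program, the result can also be dumped as hex from C++ itself. This is only a minimal sketch: to_hex is a hypothetical helper name (not part of Boost.Locale or the standard library), and it assumes the string holds raw UTF-8 bytes.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (an assumption for illustration): renders every byte
// of the input as two lowercase hex digits, separated by single spaces.
std::string to_hex(std::string_view bytes)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(bytes.size() * 3);
    for (unsigned char c : bytes) {
        if (!out.empty()) {
            out += ' ';
        }
        out += digits[c >> 4];    // high nibble
        out += digits[c & 0x0f];  // low nibble
    }
    return out;
}
```

Printing to_hex(fold_case_nfkc(test)) next to to_hex(test) should then show whether anything follows the leading 23 (the # character). If the Hangul bytes are present in the dump, the problem is on the display side rather than in the normalization.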
Note though that on my C++20 compiler, the char8_t const* string (from u8"") cannot be streamed to std::cout, so I changed it to a regular "" literal after making sure that my source file is UTF-8 encoded. See the message: https://wandbox.org/permlink/BQnIsAzXMQVkE3Zn (compare with your compiler version or flags)
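If you would rather keep the u8 prefix, one option is to copy the literal's bytes into a std::string before streaming. This is a sketch under assumptions: to_narrow is a hypothetical helper name, and it relies only on the literal's element type being one byte wide, which holds both for char (C++17) and char8_t (C++20).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (an assumption for illustration): copies the bytes of
// a string literal into a std::string, dropping the terminating null.
// Works whether u8"" yields const char[N] (C++17) or const char8_t[N] (C++20).
template <typename Char, std::size_t N>
std::string to_narrow(const Char (&literal)[N])
{
    static_assert(sizeof(Char) == 1, "expects a one-byte character type");
    return std::string(reinterpret_cast<const char*>(literal), N - 1);
}
```

With this, auto test = to_narrow(u8"#접시가숟가락으로도망쳤다"); gives a std::string that can be streamed to std::cout and passed to the string_view-taking functions in the question under either language standard.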
Here's a demonstration:
For all backends
Upvotes: 1