Reputation: 21749
I'm on Windows and I'm constructing a std::filesystem::path from a std::string. According to the constructor reference (emphasis mine):
If the source character type is char, the encoding of the source is assumed to be the native narrow encoding (so no conversion takes place on POSIX systems)
If I understand correctly, this means the string content will be treated as ANSI-encoded on Windows. To have it treated as UTF-8, I need to use the std::filesystem::u8path() function. See the demo: http://rextester.com/PXRH65151
I want the path constructor to treat the contents of a narrow string as UTF-8 encoded. For boost::filesystem::path I could use the imbue() method to do this:
boost::filesystem::path::imbue(std::locale(std::locale(), new std::codecvt_utf8_utf16<wchar_t>()));
However, I do not see such a method in std::filesystem::path. Is there a way to achieve this behavior for std::filesystem::path? Or do I need to sprinkle u8path all over the place?
Upvotes: 5
Views: 2086
Reputation: 473232
For the sake of performance, path does not have a global way to define locale conversions. Since C++ before C++20 has no specific type for UTF-8 strings, the system assumes that any char string is in the native narrow encoding. So if you want to use UTF-8 strings, you have to spell it out explicitly, either by passing an appropriate conversion locale to the constructor or by using u8path.
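If the worry is having u8path calls scattered around, one option is to keep the explicit conversion in a single place. The following is only a sketch: from_utf8 and to_utf8 are hypothetical helper names, not part of the standard library. Note that u8path is deprecated as of C++20 but still available.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: build a path from a std::string that is known to
// hold UTF-8, regardless of the platform's native narrow encoding.
inline fs::path from_utf8(const std::string& utf8) {
    return fs::u8path(utf8);  // deprecated in C++20, still available
}

// Hypothetical helper: get the path back as UTF-8 bytes in a std::string.
// u8string() returns std::string in C++17 and std::u8string in C++20,
// so the bytes are copied to cover both cases.
inline std::string to_utf8(const fs::path& p) {
    const auto u8 = p.u8string();
    return std::string(u8.begin(), u8.end());
}
```

Call sites then use from_utf8(name) instead of fs::path(name) wherever the string is UTF-8, and to_utf8(p) instead of p.string() when UTF-8 is wanted back.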
C++20 gave us char8_t, which is always presumed to be UTF-8. So if you consistently use char8_t-based strings (like std::u8string), path's implicit conversion will pick up on it and work appropriately.
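As a rough illustration of the C++20 route, the sketch below passes a std::u8string straight to the ordinary path constructor. The helper name from_u8 is made up for this example, and the block is guarded so that pre-C++20 compilers simply skip it.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

#if defined(__cpp_char8_t)  // char8_t needs C++20
// Hypothetical helper: because the source character type is char8_t, the
// constructor treats the input as UTF-8 and converts it to the native
// encoding; no u8path call is needed.
inline fs::path from_u8(const std::u8string& name) {
    return fs::path(name);
}
#endif
```

With this, from_u8(u8"caf\u00e9.txt") should name the same file on Windows and on POSIX systems, as long as the strings stay char8_t-based all the way through.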
Upvotes: 0
Reputation: 15207
My solution to this problem is to mirror std::filesystem in a separate namespace named std::u8filesystem, whose classes and functions treat std::string as UTF-8 encoded. Each class inherits from its std::filesystem counterpart of the same name, without adding any fields or virtual methods, so that it stays fully API/ABI interoperable. Full proof of concept code here; it has been tested only on Windows so far and is far from complete. The following snippet shows the core of the helper:
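#include <filesystem>
#include <string>

// Converts a UTF-8 encoded string to a wide string (defined elsewhere).
std::wstring U8ToW(const std::string &string);

namespace std
{
namespace u8filesystem
{
#ifdef _WIN32
class path : public filesystem::path
{
public:
    // Interpret the narrow string as UTF-8 instead of the ANSI code page.
    path(const std::string &string)
        : filesystem::path(U8ToW(string))
    {
    }
    // Hide string() so that it returns UTF-8 too
    // (C++17: u8string() returns std::string).
    inline std::string string() const
    {
        return filesystem::path::u8string();
    }
};
#else
using namespace filesystem;
#endif
}
}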
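The snippet only declares U8ToW. Below is one possible definition, purely as a sketch and not necessarily what the proof of concept does (on Windows, MultiByteToWideChar with CP_UTF8 would be the usual alternative). It assumes wchar_t holds UTF-16 when it is 16 bits wide and UTF-32 otherwise, and it rejects malformed sequences but does not check for overlong encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for U8ToW: decodes UTF-8 by hand and emits UTF-16
// (16-bit wchar_t, e.g. Windows) or UTF-32 (32-bit wchar_t).
std::wstring U8ToW(const std::string &utf8)
{
    std::wstring out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;
        // The lead byte determines the sequence length and the first bits.
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > utf8.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        // Each continuation byte contributes six more bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");
        i += len;
        if constexpr (sizeof(wchar_t) == 2) {
            if (cp > 0xFFFF) {
                // Outside the BMP: encode as a UTF-16 surrogate pair.
                cp -= 0x10000;
                out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
                out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
                continue;
            }
        }
        out.push_back(static_cast<wchar_t>(cp));
    }
    return out;
}
```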
Upvotes: 2