PauLEffect

Reputation: 423

How to list UTF encoded filenames in a given directory in CPP?

I'm trying to list all the files in a given directory under Windows 10, in a CMake-based C++ project (Visual Studio compiler). I can't use Boost or other libraries. I'm using the following code:

        string search_path = "D:\\*.*";
        WIN32_FIND_DATA fd;
        HANDLE hFind = ::FindFirstFile(search_path.c_str(), &fd);
        if(hFind != INVALID_HANDLE_VALUE)
        {
            do {
                // list only real files in the current folder;
                // remove the '!' to list directories instead (including the two default entries . and ..)
                if(! (fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) )
                {
                    printf("%s - ", fd.cFileName);
                    
                    for (int i = 0; i < 30; ++i)
                    {
                        printf("%02x", fd.cFileName[i]);
                    }
                    printf("\n");
                }
            } while(::FindNextFile(hFind, &fd));
            ::FindClose(hFind);
        }

It works fine for ASCII filenames, but a file with an Arabic name shows up as

???? ???? ?????.jpg - 3f3f3f3f203f3f3f3f203f3f3f3f3f2e6a706700746d6c0000696e646f77

I welcome any kind of pointer.

Upvotes: 0

Views: 237

Answers (2)

darune

Reputation: 10982

If you have C++17 available, you could also try the standard library solution:

for (const auto& p : std::filesystem::directory_iterator("D:\\")) {
  // directory_entry has no wstring(); go through path() to get the name
  std::wstring file_name = p.path().filename().wstring();
}

Upvotes: 1

Marek R

Reputation: 37927

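The problem is the encoding configured on your system. The narrow-character (ANSI) functions return names in the active code page, so to make this work your system has to be configured with a single-byte code page that can represent Arabic characters. Anything the code page cannot represent comes back as '?' (0x3f, which is what your hex dump shows). Windows does not use UTF-8 by default. Check your code page.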

The other way is to use the wide-character API (FindFirstFileW, FindNextFileW, WIN32_FIND_DATAW) with wchar_t. In this case Windows uses UTF-16 (UCS-2 on very old versions), and it should work out of the box.
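
A rough sketch of the wide-character route, in case it helps. The function names `utf16_to_utf8` and `list_files_utf8` are made up for illustration. The conversion is hand-written so the sketch stays portable; on Windows you would normally call `WideCharToMultiByte` with `CP_UTF8` instead. The listing part only compiles under `_WIN32` and assumes the same `D:\*.*` style pattern as in the question.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hand-written UTF-16 -> UTF-8 conversion, for illustration only.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine a high/low surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate -> replacement character
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

#ifdef _WIN32
#include <windows.h>

// Same loop as in the question, but with the explicit W functions and
// WIN32_FIND_DATAW, so cFileName is a UTF-16 wchar_t array.
std::vector<std::string> list_files_utf8(const std::wstring& pattern) {
    std::vector<std::string> names;
    WIN32_FIND_DATAW fd;
    HANDLE hFind = ::FindFirstFileW(pattern.c_str(), &fd);
    if (hFind == INVALID_HANDLE_VALUE) return names;
    do {
        if (!(fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)) {
            const std::wstring wide = fd.cFileName;
            names.push_back(
                utf16_to_utf8(std::u16string(wide.begin(), wide.end())));
        }
    } while (::FindNextFileW(hFind, &fd));
    ::FindClose(hFind);
    return names;
}
#endif
```

You would call it as `list_files_utf8(L"D:\\*.*")`. If you only need the names as `std::wstring`, skip the conversion and store `fd.cFileName` directly.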

Also must read.

Upvotes: 3
