FrankS101

Reputation: 2135

How to find out if there is any non-ASCII character in a string with a file path

I have a UTF-8 encoded string that stores a file path, for instance C:\Users\myUser\Downloads\ü.pdf. I have already checked that the string holds a valid path in the local file system, but since I'm sending it to a different process that supports only ASCII, I need to figure out whether it contains any non-ASCII characters.

How can I do that?

Upvotes: 8

Views: 15835

Answers (2)

FrankS101

Reputation: 2135

As suggested in several comments and highlighted in @CrisLuengo's answer, we can iterate over the characters looking for any with the upper bit set:

#include <iostream>
#include <string>
#include <algorithm>

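// Returns true when every byte of s is in the ASCII range (0-127).
bool isASCII(const std::string& s)
{
    return !std::any_of(s.begin(), s.end(), [](char c) {
        // char may be signed, so compare as unsigned char.
        return static_cast<unsigned char>(c) > 127;
    });
}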

int main()
{
    std::string s1 { "C:\\Users\\myUser\\Downloads\\Hello my friend.pdf" };   
    std::string s2 { "C:\\Users\\myUser\\Downloads\\ü.pdf" };

    std::cout << std::boolalpha << isASCII(s1) << "\n";
    std::cout << std::boolalpha << isASCII(s2) << "\n";
}

Output:

true
false
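
If the caller also wants to know where the offending byte is (for example, to report it before handing the path to the ASCII-only process), the same predicate can be used with std::find_if. This is only a sketch: firstNonASCII is a hypothetical helper name, not part of either answer, and the index it returns is a byte offset into the UTF-8 string, not a character count.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: byte index of the first non-ASCII byte,
// or std::string::npos when every byte is ASCII.
std::size_t firstNonASCII(const std::string& s)
{
    auto it = std::find_if(s.begin(), s.end(), [](char c) {
        // Same test as in isASCII: compare as unsigned char.
        return static_cast<unsigned char>(c) > 127;
    });
    return it == s.end() ? std::string::npos
                         : static_cast<std::size_t>(it - s.begin());
}
```

A result of std::string::npos means the path is safe to pass on as-is.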

Upvotes: 9

Cris Luengo

Reputation: 60444

An ASCII character uses only the lower 7 bits of a char (values 0-127). Every char element of a non-ASCII Unicode character encoded in UTF-8 has the upper bit set. So you can simply iterate over the char elements and check whether any of them has a value above 127, e.g.:

bool containsOnlyASCII(const std::string& filePath) {
  for (auto c: filePath) {
    if (static_cast<unsigned char>(c) > 127) {
      return false;
    }
  }
  return true;
}

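A note on the cast: std::string contains char elements, and the standard leaves it implementation-defined whether char is signed or unsigned. If it's signed, bytes with the upper bit set are negative, so comparing them against 127 without the cast would never match. Converting to unsigned char is well-defined: the standard specifies that the value is reduced modulo 2^N (256 for 8-bit chars), so those bytes end up in the range 128-255.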
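
To make the cast concrete, here is a small sketch that tests the upper bit with a mask instead of comparing against 127. The names hasHighBit and onlyAsciiBytes are made up for illustration, and the static_assert assumes 8-bit chars; 0xC3 is the first of the two UTF-8 bytes that encode "ü".

```cpp
#include <string>

// Hypothetical helper: true if the byte has its upper bit (0x80) set.
// Equivalent to the "> 127" comparison after the cast.
bool hasHighBit(char c)
{
    return (static_cast<unsigned char>(c) & 0x80u) != 0;
}

// Same result as containsOnlyASCII, written with the bit test.
bool onlyAsciiBytes(const std::string& s)
{
    for (char c : s) {
        if (hasHighBit(c)) {
            return false;
        }
    }
    return true;
}

// Whether plain char is signed or unsigned, converting the byte 0xC3
// to unsigned char yields 195 (assuming 8-bit chars).
static_assert(static_cast<unsigned char>('\xC3') == 195,
              "conversion to unsigned char is modulo 256");
```

Writing the test bytes as escapes such as "\xC3\xBC" keeps the example independent of the encoding the source file happens to be saved in.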

Upvotes: 9
