Sriram

Reputation: 10568

Using isalnum with signed character inputs - Visual C++

I have a very simple program where I am using the isalnum function to check if a string contains alpha-numeric characters. The code is:

#include "stdafx.h"
#include <iostream>
#include <string>
#include <locale>
using namespace std;

int _tmain(int argc, _TCHAR* argv[]) {

    string test = "(…….";

    for (unsigned int i = 0; i < test.length(); i++) {
        if (isalnum(test[i])) {
            cout << "True: " << test[i] << " " << (int)test[i] << endl;
        }
        else {
            cout << "False: " << isalnum(test[i]) << test[i] << " " << (int)test[i] << endl;
        }
    }

    return 0;
}

I am using Visual Studio Desktop Edition 2013 for this snippet. The issue: when this program is run in Debug mode, it fails with a debug assertion that says: "Expression c >= -1 && c <= 255".
Printing the character at the ith position gives a negative integer (-123). Changing all calls to isalnum so that they take an unsigned char as input makes the error disappear.

I checked the documentation for isalnum and it accepts arguments of type char. Then why does this code snippet fail? I am sure I am missing something trivial here but any help is welcome.

Upvotes: 3

Views: 2098

Answers (3)

Here is a solution that seems to work: https://en.cppreference.com/w/cpp/string/byte/isalnum

I tried to follow Keith's answer. However, both the static_cast to unsigned char and passing locale() to isalnum as the second parameter still ended up with a negative integer in the function/macro call to isalpha or isalnum (in the Visual C++ debugger), and that broke program execution.

My solution works in my context, which builds a string allowing for alphanumerics and ' ' (space character) only.

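char c;                   // the character being examined (assigned elsewhere)
string s {"somethings"};  // the string being built
if (c < 0)                // negative when plain char is signed and the byte is above 0x7f
  s += ' ';               // replace it with a space
else
  s += c;
return s;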

Upvotes: 0

Keith Thompson

Reputation: 263617

The isalnum function is declared in <cctype> (the C++ version of <ctype.h>) -- which means you really should have #include <cctype> at the top of your source file. You're getting away with calling it without the #include directive because either "stdafx.h" or one of the standard headers (likely <locale>) includes it -- but it's a bad idea to depend on that.

isalnum and friends come from C. The isalnum function takes an argument of type int, which must be either within the range of unsigned char or equal to EOF (which is typically -1). If the argument has any other value, the behavior is undefined.

Annoyingly, this means that if plain char happens to be signed, passing a char value to isalnum causes undefined behavior if the value happens to be negative and not equal to EOF. The signedness of plain char is implementation-defined; it seems to be signed on most modern systems.
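To make the range requirement concrete, here is a minimal sketch of what the `<cctype>` functions receive in each case. The helper names (`promoted_directly`, `promoted_via_unsigned`, `is_valid_ctype_argument`) are hypothetical and exist only for illustration; they are not part of any library.

```cpp
#include <climits>  // UCHAR_MAX
#include <cstdio>   // EOF

// Hypothetical helper: the int value produced when a plain char is passed
// straight to a function taking int. A negative char stays negative.
int promoted_directly(char c) {
    return c;
}

// Hypothetical helper: the int value produced when the char is converted to
// unsigned char first. The result is always in the range 0..UCHAR_MAX.
int promoted_via_unsigned(char c) {
    return static_cast<unsigned char>(c);
}

// Hypothetical helper: true if the value is a permitted argument for
// isalnum and the other <cctype> classification functions.
bool is_valid_ctype_argument(int value) {
    return value == EOF || (value >= 0 && value <= UCHAR_MAX);
}
```

On an implementation where plain char is signed, a byte such as 0x85 would reach `isalnum` as a negative value like the -123 reported in the question, which is outside the permitted range. After the conversion to unsigned char the same byte arrives as 133, which is inside it.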

C++ adds a template function isalnum that takes an argument of any character type and a second argument of type std::locale. Its declaration is:

template <class charT> bool isalnum (charT c, const locale& loc);

I'm fairly sure that this version of isalnum doesn't suffer from the same problem as the one in <cctype>. You can pass it a char value and it will handle it correctly. You can also pass it an argument of some wide character type like wchar_t. But it requires two arguments. Since you're only passing one argument to isalnum(), you're not using this version; you're using the isalnum declared in <cctype>.

If you want to use this version, you can pass the default locale as the second argument:

std::isalnum(test[i], std::locale())

Or, if you're sure you're only working with narrow characters (type char), you can cast the argument to unsigned char:

std::isalnum(static_cast<unsigned char>(test[i]))
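As a rough sketch, the two options can be wrapped in small helpers so the conversion is written only once. The names `is_alnum_locale`, `is_alnum_cctype` and `count_alnum` are made up for this example, and the results for bytes above 0x7f depend on the locale in effect.

```cpp
#include <cctype>
#include <cstddef>
#include <locale>
#include <string>

// Option 1: the two-argument template from <locale>, which takes a char directly.
bool is_alnum_locale(char c) {
    return std::isalnum(c, std::locale());
}

// Option 2: the <cctype> function, with the argument converted to
// unsigned char so that it is never negative.
bool is_alnum_cctype(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0;
}

// Counts the alphanumeric characters in a narrow string using option 2.
std::size_t count_alnum(const std::string& text) {
    std::size_t count = 0;
    for (char c : text) {
        if (is_alnum_cctype(c)) {
            ++count;
        }
    }
    return count;
}
```

In the loop from the question, `isalnum(test[i])` would then become `is_alnum_cctype(test[i])`, so no negative value is ever passed to the `<cctype>` function.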

Upvotes: 4

Mark Ransom

Reputation: 308520

The problem is that plain char is signed by default in Visual C++, so any character value over 0x7f is treated as a negative number when passed to isalnum. Make this simple change:

        if (isalnum((unsigned char)test[i])) {

Microsoft's documentation clearly states that the parameter is int, not char. I believe you're getting confused with a different version of isalnum that comes from the locale header. I don't know why the function doesn't accept sign-extended negative numbers, but suspect that it's based on wording in the standard.

Upvotes: 1
