Reputation: 20306

printf, wprintf, %s, %S, %ls, char* and wchar*: Errors not announced by a compiler warning?

I have tried the following code:

wprintf(L"1 %s\n","some string"); //Good
wprintf(L"2 %s\n",L"some string"); //Not good -> print only first character of the string
printf("3 %s\n","some string"); //Good
//printf("4 %s\n",L"some string"); //Doesn't compile
printf("\n");
wprintf(L"1 %S\n","some string"); //Not good -> print some funny stuff
wprintf(L"2 %S\n",L"some string"); //Good
//printf("3 %S\n","some string"); //Doesn't compile
printf("4 %S\n",L"some string");  //Good

And I get the following output:

1 some string
2 s
3 some string

1 g1 %s

2 some string
4 some string

So it seems that both wprintf and printf can correctly print both a char* and a wchar_t*, but only if the exact specifier is used. If the wrong specifier is used, you might get no compile error (nor even a warning!) and end up with wrong behaviour. Do you experience the same behaviour?

Note: This was tested under Windows, compiled with MinGW and g++ 4.7.2 (I will check gcc later).

Edit: I also tried %ls (result is in the comments)

printf("\n");
wprintf(L"1 %ls\n","some string"); //Not good -> print funny stuff
wprintf(L"2 %ls\n",L"some string"); //Good
// printf("3 %ls\n","some string"); //Doesn't compile
printf("4 %ls\n",L"some string");  //Good

Upvotes: 35

Answers (8)

transience_iszquasia

Reputation: 1

David Foerster's answer was very instructive; thanks for it. I apparently don't have the rep to comment (I think I forgot my original account), but I wanted to add that the locale chosen must exist, and the given string must be valid within that locale.

E.g. I do not have en_US.UTF-8, so my output was:

content-type:text/html; charset:utf-8
?EURo Dikaiopolis en agro estin


?EURo Dikaiopolis en agro estin

4 5 6

When I changed it to a proper locale for the system, I got the expected:


content-type:text/html; charset:utf-8
🕽€ο Δικαιοπολις εν αγρω εστιν
🕽€ο Δικαιοπολις εν αγρω εστιν

4 5 6

Upvotes: 0

71GA

Reputation: 1401

Answer A

None of the answers above point out why you might not see some of your prints. The reason is that you are dealing with streams (I didn't know this), and a stream has something called orientation. Let me cite something from this source:

Narrow and wide orientation

A newly opened stream has no orientation. The first call to any I/O function establishes the orientation.

A wide I/O function makes the stream wide-oriented, a narrow I/O function makes the stream narrow-oriented. Once set, the orientation can only be changed with freopen.

Narrow I/O functions cannot be called on a wide-oriented stream; wide I/O functions cannot be called on a narrow-oriented stream. Wide I/O functions convert between wide and multibyte characters as if by calling mbrtowc and wcrtomb. Unlike the multibyte character strings that are valid in a program, multibyte character sequences in the file may contain embedded nulls and do not have to begin or end in the initial shift state.

So once you use printf(), the stream's orientation becomes narrow, and from that point on you can't get anything out of wprintf() on that stream (and in practice you really don't), unless you use freopen(), which is intended to be used on files.

Answer B

As it turns out you can use freopen() like this:

freopen(NULL, "w", stdout);

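This makes the stream unoriented ("not defined") again. Note that which mode changes freopen() permits with a NULL filename is implementation-defined, so this may not work everywhere. Try this example: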

#include <stdio.h>
#include <wchar.h>
#include <locale.h>

int main(void)
{
    // We set locale which is the same as the environmental variable "LANG=en_US.UTF-8".
    setlocale(LC_ALL, "en_US.UTF-8");

    // We define array of wide characters. We indicate this on both sides of equal sign
    // with "wchar_t" on the left and "L" on the right.
    wchar_t y[100] = L"🕽€ο Δικαιοπολις εν αγρω εστιν\n";

    // We print header in ASCII characters
    wprintf(L"content-type:text/html; charset:utf-8\n\n");

    // A newly opened stream has no orientation. The first call to any I/O function
    // establishes the orientation: a wide I/O function makes the stream wide-oriented,
    // a narrow I/O function makes the stream narrow-oriented. Once set, we must respect
    // this, so for the time being we are stuck with either printf() or wprintf().

    wprintf(L"%S\n", y);    // Conversion specifier %S is not part of ISO C (!)
    wprintf(L"%ls\n", y);   // Conversion specifier %s with length modifier l is
                            // standardized (!)

    // At this point current orientation of the stream is wide and this is why following
    // narrow function won't print anything! Whether we should use wprintf() or printf()
    // is primarily a question of how we want output to be encoded.

    printf("1\n");          // Print narrow string of characters with a narrow function
    printf("%s\n", "2");    // Print narrow string of characters with a narrow function
    printf("%ls\n",L"3");   // Print wide string of characters with a narrow function

    // Now we reset the stream to no orientation.
    freopen(NULL, "w", stdout);

    printf("4\n");          // Print narrow string of characters with a narrow function
    printf("%s\n", "5");    // Print narrow string of characters with a narrow function
    printf("%ls\n",L"6");   // Print wide string of characters with a narrow function

    return 0;
}

Upvotes: 1

Mats Petersson

Reputation: 129474

The format specifiers matter: %s says that the next string is a narrow string ("ASCII", typically 8 bits per character), while %S means a wide-character string. Mixing the two gives undefined behaviour, which includes printing garbage, just one character, or nothing.

One character is printed because wide chars are, for example, 16 bits wide: the first byte is non-zero and is followed by a zero byte, which a narrow-string reader treats as the end of the string. This depends on byte order. On a big-endian machine you'd get no output at all, because the first byte is zero and the second byte holds the non-zero value.

Upvotes: 23

Hypoano

Reputation: 21

From man fprintf

   C      (Not in C99 or C11, but in SUSv2, SUSv3, and SUSv4.)  Synonym for lc.  Don't use.

   S      (Not in C99 or C11, but in SUSv2, SUSv3, and SUSv4.)  Synonym for ls.  Don't use.

Thus, don't use %C or %S; always use %lc or %ls instead.

Upvotes: 2

user3581075

Reputation: 71

For s: When used with printf functions, specifies a single-byte or multi-byte character string; when used with wprintf functions, specifies a wide-character string. Characters are displayed up to the first null character or until the precision value is reached.

For S: When used with printf functions, specifies a wide-character string; when used with wprintf functions, specifies a single-byte or multi-byte character string. Characters are displayed up to the first null character or until the precision value is reached.

On Unix-like platforms this does not carry over: there, %s always takes a narrow string and %S (a synonym for %ls) always takes a wide string, in both printf and wprintf.

Reference: https://msdn.microsoft.com/en-us/library/hf4y5e3w.aspx

Upvotes: 6

David Foerster

Reputation: 1540

%S seems to conform to The Single Unix Specification v2 and is also part of the current (2008) POSIX specification.

Equivalent C99 conforming format specifiers would be %s and %ls.

Upvotes: 2

Steve R

Reputation: 51

At least in Visual C++:

printf (and other ASCII functions):
    %s is an ASCII string
    %S is a Unicode string

wprintf (and other Unicode functions):
    %s is a Unicode string
    %S is an ASCII string

As for the lack of compiler warnings: printf uses a variable argument list, so only the first argument can be type checked by the language itself. The compiler is not designed to parse the format string and type check the parameters that match. With functions like printf, that is up to the programmer.

Upvotes: 5

R.. GitHub STOP HELPING ICE

Reputation: 215457

I suspect GCC (mingw) has custom code to disable the checks for the wide printf functions on Windows. This is because Microsoft's own implementation (MSVCRT) is badly wrong and has %s and %ls backwards for the wide printf functions; since GCC can't be sure whether you will be linking with MS's broken implementation or some corrected one, the least-obtrusive thing it can do is just shut off the warning.

Upvotes: 28
