arturvruffer

Reputation: 21

Order of operations when adding dereferenced pointers

I'm given these lines of code in C and asked to explain their output:

char  str [50] = "hello \0 worl\bd";
printf("\n %s ",str);
printf("%s \n",str+str[4]-*str);

Output:

 hello   word

From my understanding, the second line prints the string up to the '\0' character, which marks the end. The third line then prints the rest of the string (from the space before the 'w'), skipping the 'l' because it is overwritten by the backspace.

What I don't understand is what exactly happens in the expression:

str+str[4]-*str

Doesn't the expression get executed like this:

  1. Add pointers to first element of char array: str+str
  2. Increment the resulting pointer by 4 and dereference it (Which should return a character?)
  3. Subtract the character that *str returns from the character from step 2?

It seems like that's not what happens. Can somebody explain to me what exactly happens here? Thanks a lot!

Upvotes: 1

Views: 148

Answers (3)

Andrew Henle

Reputation: 1

Given

char  str [50] = "hello \0 worl\bd";

the code

str+str[4]-*str

evaluates to

str + 'o' - 'h'

because str[4] is the character 'o' and *str is the character 'h'.

Note that the result of adding an integer-type value to a pointer (or subtracting one from it) is another pointer value.

But we don't know what the values of 'o' and 'h' are because the question doesn't specify the character set.

If we make the unstated assumption that the character set is ASCII, the character 'o' is the integer value 111, and 'h' has the integer value 104, so the value is

str + 111 - 104

(But see @dbush's answer: str + 111 - 104 is evaluated as (str + 111) - 104, and str + 111 invokes undefined behavior because it points well past the end of the str array.)

or

str + 7

so it's the address of the eighth character in str (index 7), &str[7].

printf("\n %s ",str);

will emit

"\n hello "

without the quotes (the quotes are used to display the trailing space), and

printf("%s \n",str+str[4]-*str);

is

printf( "%s \n", str + 7 );

and will emit

" worl\bd \n"

again without the quotes. But the '\b' is the backspace character, so you'll likely see

" word \n"

printed on your screen - probably.

But there are assumptions in the question that prevent anyone from definitively stating what will be output:

  • the character set may not be ASCII, in which case the program likely invokes undefined behavior
  • the display media may not process the backspace character as expected

Without that information, the question cannot be answered.

Precision matters in C. Questions like this that do not precisely specify the conditions and environment necessary to be able to answer the question are horrible and poorly thought out.

Upvotes: 2

dbush

Reputation: 224417

In this expression:

str+str[4]-*str

The array subscript operator [] has the highest precedence, followed by the dereference operator *, followed by + and - which have the same precedence and group left to right. So the above is the same as:

(str+(str[4]))-(*str)

So what happens is that the array str is converted to a pointer to its first element, then the character code for str[4] is added to that pointer, then the character code for *str (or equivalently str[0]) is subtracted from that.

Substituting in the characters in question, the above is the same as:

(str + 'o') - 'h'

Which, assuming ASCII encoding, is the same as:

(str + 111) - 104

But now we have a problem.

The subexpression str + 111 creates a pointer that is well past the end of the array, and doing so invokes undefined behavior, so your program is not well formed. It doesn't matter that the next operation would seem to give you a valid pointer. Just creating the pointer value str + 111 is invalid.

This is described in section 6.5.6p8 of the C standard regarding pointer addition/subtraction:

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. ... If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

So you're "lucky" that the program is generating output that, from a quick look, would appear to be expected.

Had it been written like this:

str+(str[4]-*str)

The program would be well formed.

Upvotes: 4

Nicolae Natea

Reputation: 1194

str[4] = 'o'   // 111 in ASCII
*str   = 'h'   // 104 in ASCII

str + str[4] - *str = str + 111 - 104 = &str[7] => " worl\bd"

Upvotes: 3
