Dakota West

Reputation: 461

I cannot understand this behavior of struct pointers and XOR

I'm working with struct pointers for the first time, and I can't make sense of what's happening here. My test relies on the basic XOR property x ^ y ^ y = x, but that doesn't seem to hold in my C code.

The code below is in my main program and correctly restores all of the letters of "test" (which I then print on screen; I've omitted a lot of junk to keep this question shorter). The struct "aes" refers to this definition:

typedef uint32_t word;

struct aes {

word iv[4];
word key[8];
word state[4];
word schedule[56];

};

As the context might suggest, the encapsulating project is an AES implementation (I'm trying to speed up my current one by trying new techniques).

In my testing, make_string and make_state work reliably, even in the functions in question, but for reference's sake:

void make_string (word in[], char out[]) {

for (int i = 0; i < 4; i++) {

    out[(i * 4) + 0] = (char) (in[i] >> 24);
    out[(i * 4) + 1] = (char) (in[i] >> 16);
    out[(i * 4) + 2] = (char) (in[i] >>  8);
    out[(i * 4) + 3] = (char) (in[i]      );

}

}

void make_state(word out[], char in[]) {

for (int i = 0; i < 4; i++) {

    out[i] =    (word) (in[(i * 4) + 0] << 24) ^
                (word) (in[(i * 4) + 1] << 16) ^
                (word) (in[(i * 4) + 2] <<  8) ^
                (word) (in[(i * 4) + 3]      );

}

}

Anyway, here is the block that DOES work. It's this functionality that I'm trying to modularize by stowing it away in a function:

char test[16] = {
    'a', 'b', 'c', 'd',
    'e', 'f', 'g', 'h',
    'i', 'j', 'k', 'l',
    'm', 'n', 'o', 'p'
};

struct aes cipher;

struct aes * work;

work = &cipher;

make_state(work->state, test);

work->state[0] ^= 0xbc6378cd;
work->state[0] ^= 0xbc6378cd;

make_string(work->state, test);

And while this code works, doing the same thing by passing it to a function does not:

void encipher_block (struct aes * work, char in[]) {

    make_state(work->state, in);

    work->state[0] ^= 0xff00cd00;

    make_string(work->state, in);

}

void decipher_block (struct aes * work, char in[]) {

    make_state(work->state, in);

    work->state[0] ^= 0xff00cd00;

    make_string(work->state, in);

}

Yet if I remove the make_state and make_string calls from both encipher_block and decipher_block and call them once around the pair instead, it works as expected!

make_state(work->state, test);

encipher_block(&cipher, test);
decipher_block(&cipher, test);

make_string(work->state, test);

So to clarify, I do not have a problem! I just want to understand this behavior.

Upvotes: 4

Views: 264

Answers (2)

Eric Postpischil

Reputation: 223663

Change char to unsigned char. char may be signed, and likely is on your system, which causes problems when converting to other integer types and when shifting.

In the expression (char) (in[i] >> 24) in make_string, an unsigned 32-bit integer is converted to a signed 8-bit integer (in your C implementation). Some of the values being converted, notably those from 128 to 255, are not representable in a signed 8-bit char. According to C 2011 6.3.1.3 3, the result is then implementation-defined or an implementation-defined signal is raised.

In the expression (word) (in[(i * 4) + 3] ) in make_state, in[…] is a char, which is a signed 8-bit integer (in your C implementation). This char is converted to an int, per the integer promotions defined in C 2011 6.3.1.1 2. If the char is negative, then the resulting int is negative. Then, when it is converted to a word, which is unsigned, the effect is that the sign bit is replicated in the high 24 bits. For example, if the char has value -112 (bit pattern 0x90), the result will be 0xffffff90, but you want 0x00000090.

Change char to unsigned char throughout this code.

Additionally, in make_state, in[(i * 4) + 0] should be cast to word before the left shift; as written, the parentheses apply the cast to the result of the shift. Once changed as above, the operand is an unsigned char, which is promoted to int before the shift. If it has some value with the high bit set, such as 0x80, then shifting it left 24 bits produces a value that cannot be represented in an int, such as 0x80000000. Per C 2011 6.5.7 4, the behavior is then undefined.

This will not be a problem in most C implementations; two’s complement is commonly used for signed integers, and the result will wrap as desired. Additionally, I expect this is a model situation that the compiler developers design for, since it is a very common code structure. However, to improve portability, casting to word will avoid the possibility of overflow.

Upvotes: 2

neirbowj

Reputation: 645

The make_state() function overwrites the array passed in the first argument. If you put the encipher_block() and decipher_block() bodies inline, you get this:

/* encipher_block inline */
make_state(work->state, in);
work->state[0] ^= 0xff00cd00;
make_string(work->state, in);

/* decipher_block inline */
make_state(work->state, in);    /* <-- Here's the problem */
work->state[0] ^= 0xff00cd00;
make_string(work->state, in);

Upvotes: 0
