Sam Symmes

Reputation: 43

How are ints in C stored in memory?

I thought that ints in C were stored with the most significant bit first, for example, the number 5 would be 0...0101. I thought I could manipulate specific bits by coercing C to let me pretend a specific memory address was an int and adding to the bits there like they were an int.

I set up 64 bits of 0s in memory, then tried adding 255 at different memory addresses. It seems to work as though the least significant digit is stored in memory before the most significant digit: when I added 1 to my memory address and changed the bits there, I got a larger number instead of a smaller one. If the most significant bit were stored earlier in memory, adding 255 at the memory address 1 byte higher shouldn't affect the number at the original address at all, because the last 8 bits would be the beginning of the next int. Am I interpreting this correctly, and are ints stored with the least significant bit first?

#include "stdio.h"
#include "string.h"
#include "stdlib.h"

int main() {
    int *x = malloc(8); //getting 4 memory addresses
    int *y = malloc(8);
    int *z = malloc(8);
    int *a = malloc(8);

    x[0] = 0; //setting 64 bits past memory addresses to 0s
    x[1] = 0;
    y[0] = 0;
    y[1] = 0;
    z[0] = 0;
    z[1] = 0;
    a[0] = 0;
    a[1] = 0;

    *((int*)((int)x)) = 255; //adding to x's memory address
    *((int*)((int)y + 1)) = 255; //adding 1 byte over from y
    *((int*)((int)z + 2)) = 255; //adding 2 bytes over from z
    *((int*)((int)a + 3)) = 255; //adding 3 bytes over from a

    printf("%d\n", sizeof(int));
    printf("%d,%d\n", x[0], x[1]);
    printf("%d,%d\n", y[0], y[1]);
    printf("%d,%d\n", z[0], z[1]);
    printf("%d,%d\n", a[0], a[1]);

    printf("%d\n", x);
    printf("%d\n", &x[1]);
    return 0;
}

Expected output:

4
255,0
0,-16777216
0,16711680
0,65280
12784560
12784564

Actual Output:

4
255,0
65280,0
16711680,0
-16777216,0
12784560
12784564

Upvotes: 4

Views: 1792

Answers (2)

chqrlie

Reputation: 145307

There are some problems in your code:

  • There seems to be some confusion between bits and bytes. Computer memory is addressed in bytes, each usually comprising 8 bits on current architectures.

  • You should not cast a pointer to int: int might not have enough range to hold a pointer's value. Convert the pointers to unsigned char * to patch individual bytes instead. Access through a character type is exempt from the aliasing rule, whereas storing an int through a misaligned address, as your code does, has undefined behavior:

    ((unsigned char *)x)[0] = 255; // set the byte at x's address
    ((unsigned char *)y)[1] = 255; // set the byte 1 byte over from y
    ((unsigned char *)z)[2] = 255; // set the byte 2 bytes over from z
    ((unsigned char *)a)[3] = 255; // set the byte 3 bytes over from a
    
  • Similarly, you should use %zu to print a size_t or convert the size_t to int.

  • Pointers should be cast to (void *) and printed with %p.
  • The effect of your changes would be more obvious if you printed the int values in hex.

Here is a modified version:

#include <stdio.h>
#include <stdlib.h>

int main() {
    // getting 4 memory addresses, each with enough space for 2 int, initialized to 0
    int *x = calloc(2, sizeof(int));
    int *y = calloc(2, sizeof(int));
    int *z = calloc(2, sizeof(int));
    int *a = calloc(2, sizeof(int));

    ((unsigned char *)x)[0] = 255; // set the byte at x's address
    ((unsigned char *)y)[1] = 255; // set the byte 1 byte over from y
    ((unsigned char *)z)[2] = 255; // set the byte 2 bytes over from z
    ((unsigned char *)a)[3] = 255; // set the byte 3 bytes over from a

    printf("%d\n", (int)sizeof(int));
    printf("%08x,%08x -- %d,%d\n", x[0], x[1], x[0], x[1]);
    printf("%08x,%08x -- %d,%d\n", y[0], y[1], y[0], y[1]);
    printf("%08x,%08x -- %d,%d\n", z[0], z[1], z[0], z[1]);
    printf("%08x,%08x -- %d,%d\n", a[0], a[1], a[0], a[1]);

    printf("%p\n", (void *)x);
    printf("%p\n", (void *)&x[1]);
    return 0;
}

Output:

4
000000ff,00000000 -- 255,0
0000ff00,00000000 -- 65280,0
00ff0000,00000000 -- 16711680,0
ff000000,00000000 -- -16777216,0
0x7fd42ec02630
0x7fd42ec02634

From the above output, we can see that:

  • type int has 4 bytes
  • pointers use 8 bytes (my environment is 64-bit, unlike yours)
  • ints are stored with the least significant byte first, same as on your system; this is called little-endian byte order.

You were expecting the opposite, big-endian byte order, which is rare on current desktop, laptop, and mobile hardware, but is still found on some embedded architectures and is the byte order used by most network protocols.
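To check which of the two byte orders a given machine uses, one option is to look at the byte stored at the lowest address of a known value. This is a minimal sketch, assuming a platform that is either little-endian or big-endian and provides uint32_t; the helper names is_little_endian and bytes_in_memory_order are made up for illustration and are not standard functions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the least significant byte of a
   multi-byte integer sits at the lowest address (little-endian), else 0. */
int is_little_endian(void) {
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* copy the byte at the lowest address */
    return first == 1;
}

/* Hypothetical helper: copies the 4 bytes of value into out exactly as
   they are laid out in memory, lowest address first. */
void bytes_in_memory_order(uint32_t value, unsigned char out[4]) {
    memcpy(out, &value, sizeof value);
}
```

On a little-endian machine, bytes_in_memory_order(0x11223344, out) fills out with 44 33 22 11 (hex); on a big-endian machine it fills it with 11 22 33 44. Using memcpy avoids both the pointer-to-int cast and the misaligned access from the question.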

Both approaches have advantages and disadvantages. C supports both transparently, so most programmers are not aware of these intricacies, yet understanding these implementation details is very useful in some situations:

  • systems programming, low-level programming, device driver development,
  • image processing,
  • reading and writing binary files and streams,
  • handling network transmission of binary data, especially to and from different devices,
  • interfacing with other programming languages, writing libraries, etc.
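For binary files and network data, a common way to stay independent of the host's byte order is to fix the order in the external format and convert with shifts. The sketch below assumes a format that stores 32-bit values most significant byte first (big-endian); put_u32_be and get_u32_be are illustrative names, not library functions.

```c
#include <stdint.h>

/* Write v into buf as 4 bytes, most significant byte first. */
void put_u32_be(unsigned char buf[4], uint32_t v) {
    buf[0] = (unsigned char)((v >> 24) & 0xFFu);
    buf[1] = (unsigned char)((v >> 16) & 0xFFu);
    buf[2] = (unsigned char)((v >> 8) & 0xFFu);
    buf[3] = (unsigned char)(v & 0xFFu);
}

/* Read 4 bytes, most significant byte first, back into a 32-bit value. */
uint32_t get_u32_be(const unsigned char buf[4]) {
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

Because the shifts operate on the value rather than on its in-memory representation, the same bytes are produced on little-endian and big-endian hosts.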

Upvotes: 0

Lightness Races in Orbit

Reputation: 385385

"I thought that ints in c were stored with the most significant bit first, for example, the number 5 would be 0...0101"

No, this depends on your platform & toolchain, not on C.

The scheme you describe (almost) is called big-endian.

Many commodity PCs nowadays are little-endian, which is the opposite (least significant byte first). This may be the case for you.

Note that endianness talks about bytes, not bits.

It would be better not to try to manipulate data like this. Work with the language, using logical operations that don't care about endianness.
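As a rough illustration of what such endianness-independent operations can look like, the sketch below reads and replaces one byte of a value using shifts and masks. The function names get_byte and set_byte are hypothetical, byte 0 is taken to mean the least significant byte, and n is assumed to be in the range 0 to 3.

```c
#include <stdint.h>

/* Return byte n of value, where byte 0 is the least significant byte. */
uint32_t get_byte(uint32_t value, unsigned n) {
    return (value >> (8u * n)) & 0xFFu;
}

/* Return value with byte n replaced by the low 8 bits of b. */
uint32_t set_byte(uint32_t value, unsigned n, uint32_t b) {
    unsigned shift = 8u * n;
    uint32_t mask = (uint32_t)0xFFu << shift;   /* selects byte n */
    return (value & ~mask) | ((b & 0xFFu) << shift);
}
```

Here set_byte(0, 1, 255) gives 65280 (hex 0000ff00) and set_byte(0, 2, 255) gives 16711680 (hex 00ff0000) on any platform, because the shifts are defined on the value and not on its layout in memory.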

Upvotes: 5
