jbf81tb

Reputation: 865

How big is too big for the heap

My program is in C and I'm compiling with gcc. I'm reading a file and storing its contents in a buffer, so the buffer needs to be as large as the file. I'm using malloc() to allocate the buffer. Unfortunately, I ran into a file that's 277 MB. Is that too much for the heap? I'm getting a segfault at run time, with no more information than that. It has worked for files as large as 160 MB, but this single 277 MB outlier breaks it.

EDIT (in reply to @0xC0000022L): valgrind gives me

==6380== Warning: set address range perms: large range [0x8851028, 0x190e6102) (undefined)
==6380== Warning: set address range perms: large range [0x8851028, 0x190e6028) (defined)
==6380== Warning: set address range perms: large range [0x190e7028, 0x2997c108) (undefined)
==6380== Warning: set address range perms: large range [0x190e7028, 0x2997c028) (defined)
==6380== Warning: silly arg (-1737565464) to malloc()
==6380== Invalid write of size 4
==6380==    at 0x8048A49: main (newanalyze.c:85)
==6380==  Address 0x4a00 is not stack'd, malloc'd or (recently) free'd
==6380==
==6380==
==6380== Process terminating with default action of signal 11 (SIGSEGV)
==6380==  Access not within mapped region at address 0x4A00
==6380==    at 0x8048A49: main (newanalyze.c:85)

but line 85 only touches a small variable that shouldn't be affected by the size of the file.

Upvotes: 3

Views: 5133

Answers (3)

Derui Si

Reputation: 1105

Look closely at this line of the Valgrind output:

==6380== Warning: silly arg (-1737565464) to malloc()

-1737565464 is the argument printed as a signed int; taken as an unsigned 32-bit int it is 2557401832 (more than 2 GB). So you are asking malloc for more than 2 GB, not 277 MB.
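
To see the arithmetic for yourself, here is a minimal sketch. The function names are made up for illustration, and it assumes a 32-bit size_t, as on the system in the question:

```c
#include <stdint.h>
#include <stdio.h>

/* Reinterpret a negative 32-bit value as unsigned. This is effectively
   what happens when a negative int is passed where malloc() expects a
   size_t on a 32-bit system. The conversion is well-defined in C: the
   value wraps modulo 2^32. */
uint32_t as_unsigned32(int32_t n)
{
    return (uint32_t)n;
}

/* Hypothetical helper: print the value from the Valgrind warning both ways. */
void show_silly_arg(void)
{
    int32_t n = -1737565464;
    printf("%ld as unsigned 32-bit is %lu\n",
           (long)n, (unsigned long)as_unsigned32(n));
}
```

Calling as_unsigned32(-1737565464) gives 2557401832, which is 4294967296 - 1737565464.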

The lines below show that you then write to address 0x4a00, which is not a valid address, so a SEGV is exactly what you would expect. A small address like that is typical of indexing off a NULL pointer, which is what malloc returns when a request fails. Check newanalyze.c:85 to see what is being written there.

==6380== Invalid write of size 4
==6380==    at 0x8048A49: main (newanalyze.c:85)
==6380==  Address 0x4a00 is not stack'd, malloc'd or (recently) free'd
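
Since I can't see your code, here is only a rough sketch of how I'd read a whole file into a buffer defensively. read_whole_file is a name I made up; the idea is to reject a negative size from ftell(), keep the length in a size_t afterwards, and check what malloc returns before writing anything:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read all of `path` into a malloc'd buffer.
   Returns NULL on any failure; on success stores the length in *out_len.
   The caller is responsible for free()ing the result. */
char *read_whole_file(const char *path, size_t *out_len)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    /* Find the file size; ftell() returns a long, and -1 on error. */
    if (fseek(fp, 0, SEEK_END) != 0) {
        fclose(fp);
        return NULL;
    }
    long size = ftell(fp);
    if (size < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    /* From here on the size lives in a size_t, so it cannot go negative. */
    size_t len = (size_t)size;
    char *buf = malloc(len + 1);          /* +1 for a terminating NUL */
    if (buf == NULL) {                    /* never write through NULL */
        fclose(fp);
        return NULL;
    }

    if (fread(buf, 1, len, fp) != len) {
        free(buf);
        fclose(fp);
        return NULL;
    }
    buf[len] = '\0';
    fclose(fp);

    *out_len = len;
    return buf;
}
```

With checks like these, a bad size shows up as a NULL return you can report, instead of a segfault somewhere later.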

Upvotes: 4

Norman Gray

Reputation: 12514

Following up on one of the comments, here's how you'd open a file using mmap(2). This assumes you're on a unix.

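#include <fcntl.h>      /* open, O_RDONLY */
#include <stdio.h>      /* fprintf */
#include <sys/mman.h>   /* mmap, MAP_FAILED */
#include <sys/stat.h>   /* fstat, struct stat */
#include <unistd.h>     /* close */

/* Inside a function that is given filename and returns a pointer: */
int fd;
struct stat S;
const char* file_base;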

if ((fd = open(filename, O_RDONLY)) < 0) {
    fprintf(stderr, "Can't open file %s to read\n", filename);
    return NULL;
}
if (fstat(fd, &S) != 0) {
    fprintf(stderr, "Can't stat file %s!\n", filename);
    close(fd);
    return NULL;
}
if ((file_base = mmap(NULL, S.st_size, PROT_READ, MAP_FILE|MAP_PRIVATE, fd, 0)) == MAP_FAILED) {
    fprintf(stderr, "Unable to map file %s\n", filename);
    close(fd);
    return NULL;
}

After this, file_base points to a bit of memory which contains the entire contents of the file.

The advantages of this way are:

  • It takes up essentially no memory (sort-of). What's actually happening here is that the file itself becomes the backing store (the "swap space") for that region of memory.
  • It takes no time up front (sort-of). The newly mapped memory starts off swapped out, so when it is accessed, all that has to happen is that the relevant part of the file is paged in, which is as quick as paging in memory that was allocated with malloc(3).

The mapping above is read-only. You can also do this trick read-write (PROT_READ|PROT_WRITE); with MAP_SHARED, writing to the memory simultaneously changes the file, whereas with MAP_PRIVATE your writes stay in a private copy.

If you're not on a unix, there might still be an mmap function available to you. If not, there'll be some Windows-native way of doing the same thing (CreateFileMapping and MapViewOfFile).
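
For completeness, here is a small self-contained sketch of the same approach on a POSIX system. It maps a file, scans it, and cleans up afterwards. count_newlines_mmap is a name invented for the example, and counting newlines just stands in for whatever analysis you actually do:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical example: count '\n' bytes in a file by mapping it
   read-only instead of copying it into a malloc'd buffer.
   Returns -1 on error. */
long count_newlines_mmap(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) != 0) {
        perror("fstat");
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {        /* mmap() rejects a zero-length mapping */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const char *base = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                    /* the mapping stays valid after close */
    if (base == MAP_FAILED) {
        perror("mmap");
        return -1;
    }

    long count = 0;
    for (size_t i = 0; i < len; i++) {
        if (base[i] == '\n')
            count++;
    }

    munmap((void *)base, len);    /* release the mapping when done */
    return count;
}
```

Note that the length handed to mmap comes straight from st_size as a size_t, so there is no signed int in the path to overflow.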

Upvotes: 1

Sam

Reputation: 2969

Unfortunately I can't give you a solid "why," but mmap2, which appears to be what malloc is calling on your system, simply reports that it's out of memory. In that case malloc returns NULL, and writing through that NULL pointer causes the segfault.

munmap(0xb7706000, 4096)                = 0
mmap2(NULL, 2557403136, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1  ENOMEM (Cannot allocate memory)

As a counterexample, I have a toy program that succeeds with:

mmap(NULL, 283652096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2d00994000

I would check how much memory is available on the system and how much the program is already using. Maybe it's leaking memory badly?
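
Whatever the cause turns out to be, checking the result of malloc makes the failure visible. A minimal sketch, assuming the size comes out of signed arithmetic somewhere (alloc_checked is a hypothetical wrapper, not something from your code):

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: allocate `n` bytes where `n` comes from signed
   arithmetic. It reports the problem on stderr and returns NULL instead
   of silently handing malloc() a huge wrapped-around size. */
void *alloc_checked(long n)
{
    if (n <= 0) {
        fprintf(stderr, "refusing to allocate %ld bytes\n", n);
        return NULL;
    }

    void *p = malloc((size_t)n);
    if (p == NULL)
        fprintf(stderr, "malloc(%ld) failed: %s\n", n, strerror(errno));
    return p;
}
```

The caller still has to test the return value for NULL, but the message at least tells you which size was requested.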

Upvotes: 0
