Reputation: 111
I have to write C code for reading large files. The code is below:
int read_from_file_open(char *filename,long size)
{
    long read1=0;
    int result=1;
    int fd;
    int check=0;
    long *buffer=(long*) malloc(size * sizeof(int));
    fd = open(filename, O_RDONLY|O_LARGEFILE);
    if (fd == -1)
    {
        printf("\nFile Open Unsuccessful\n");
        exit (0);;
    }
    long chunk=0;
    lseek(fd,0,SEEK_SET);
    printf("\nCurrent Position%d\n",lseek(fd,size,SEEK_SET));
    while ( chunk < size )
    {
        printf ("the size of chunk read is %d\n",chunk);
        if ( read(fd,buffer,1048576) == -1 )
        {
            result=0;
        }
        if (result == 0)
        {
            printf("\nRead Unsuccessful\n");
            close(fd);
            return(result);
        }
        chunk=chunk+1048576;
        lseek(fd,chunk,SEEK_SET);
        free(buffer);
    }
    printf("\nRead Successful\n");
    close(fd);
    return(result);
}
The issue I am facing is that as long as the size argument is less than 264000000 bytes, the function seems able to read, and I see the value of chunk increase with each cycle.
When I pass 264000000 bytes or more, the read fails: according to the check used, read() returns -1.
Can anyone point me to why this is happening? I am compiling using cc in normal mode, not using DD64.
Upvotes: 11
Views: 28238
Reputation: 378
In the first place, why do you need lseek() in your cycle? read() will advance the cursor in the file by the number of bytes read.
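To illustrate the point about read() moving the file offset by itself, here is a minimal sketch. The helper name offset_after_two_reads and the 4096-byte scratch buffer are assumptions made for this example only; lseek() appears solely to report the current offset, not to reposition.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: performs up to two read() calls of `step` bytes each,
   with no repositioning in between, and returns the resulting file offset
   (or -1 on error). */
off_t offset_after_two_reads(const char *path, size_t step)
{
    char buf[4096];
    if (step > sizeof buf)
        step = sizeof buf;          /* never read past the scratch buffer */

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return (off_t)-1;

    ssize_t first = read(fd, buf, step);
    ssize_t second = (first > 0) ? read(fd, buf, step) : 0;

    off_t pos = (off_t)-1;
    if (first >= 0 && second >= 0)
        pos = lseek(fd, 0, SEEK_CUR);   /* query only: moves by 0 bytes */

    close(fd);
    return pos;
}
```

For a regular file of at least 20 bytes, calling offset_after_two_reads(path, 10) should report an offset of 20, which is the sum of the two reads.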
And, to the topic: long, and therefore chunk, has a maximum value of 2147483647 on a platform where long is 32 bits; any number greater than that will overflow and typically become negative.
You want to declare chunk as off_t (off_t chunk) and size as size_t.
That's the main reason why lseek() fails.
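The wrap-around can be seen with a small sketch. It assumes a 32-bit signed type stands in for a 32-bit long, and the helper name narrow_offset is hypothetical. Strictly speaking, the result of an out-of-range conversion to a signed type is implementation-defined in C; mainstream compilers such as gcc and clang wrap modulo 2^32.

```c
#include <stdint.h>

/* Hypothetical helper: stores a 64-bit file offset in a 32-bit signed
   variable, mimicking what happens when the offset is kept in a 32-bit long.
   The out-of-range conversion is implementation-defined; common compilers
   wrap modulo 2^32. */
int32_t narrow_offset(int64_t offset)
{
    return (int32_t)offset;
}
```

With such a compiler, narrow_offset(2147483647) keeps its value, while an offset of 3000000000 comes back negative, so a loop condition like chunk < size or a later lseek() call would receive a nonsensical position.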
And, then again, as other people have noticed, you do not want to free() your buffer inside the cycle.
Note also that you will overwrite the data you have already read, because every read() writes to the start of the buffer.
Additionally, read() will not necessarily read as much as you have asked it to, so it is better to advance chunk by the number of bytes actually read, rather than the number of bytes you want to read.
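The short-read point is often handled with a small loop around read(). The following is a sketch, not part of the original code: the name read_full is hypothetical, and retrying on EINTR is an extra choice made here rather than something discussed above.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling read() until `count` bytes have arrived,
   end of file is reached, or an error occurs. Returns the number of bytes
   stored in buf (less than count only at end of file), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;                  /* end of file */
        done += (size_t)n;          /* advance by what was actually read */
    }
    return (ssize_t)done;
}
```

The caller advances its own position by the return value, so a short read never leaves a gap and never skips data.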
Taking all of this into account, the correct code should probably look something like this:
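// Edited: note comments after the code
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

// The includes come first so that the guard sees the system's own definition
#ifndef O_LARGEFILE
#define O_LARGEFILE 0
#endif

int read_from_file_open(char *filename,size_t size)
{
    int fd;
    // read() counts bytes, so the buffer needs exactly size bytes
    char *buffer=malloc(size);
    if (buffer == NULL)
    {
        printf("\nAllocation Unsuccessful\n");
        return 0;
    }
    fd = open(filename, O_RDONLY|O_LARGEFILE);
    if (fd == -1)
    {
        printf("\nFile Open Unsuccessful\n");
        free(buffer);
        exit(EXIT_FAILURE);
    }
    // A newly opened file starts at offset 0, so no lseek() is needed
    off_t chunk=0;
    while ( (size_t)chunk < size )
    {
        printf ("the size of chunk read is %lld\n",(long long)chunk);
        // Never ask for more than the space left in the buffer
        size_t want=size-(size_t)chunk;
        if (want > 1048576)
            want=1048576;
        // ssize_t is signed, so the error value -1 can be detected
        ssize_t readnow=read(fd,buffer+chunk,want);
        if (readnow < 0 )
        {
            printf("\nRead Unsuccessful\n");
            free (buffer);
            close (fd);
            return 0;
        }
        if (readnow == 0)
            break; // end of file reached before size bytes were read
        chunk=chunk+readnow;
    }
    printf("\nRead Successful\n");
    free(buffer);
    close(fd);
    return 1;
}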
I also took the liberty of removing the result variable and all related logic since, I believe, it can be simplified.
Edit: I have noted that some systems (most notably, BSD) do not have O_LARGEFILE, since it is not needed there. So, I have added an #ifndef guard at the beginning, which makes the code more portable.
Upvotes: 14
Reputation: 13504
If it is a 32-bit machine, reading a file larger than 2 GB will cause problems, because off_t is then a signed 32-bit type by default. So if you are using the gcc compiler, try passing the options -D_LARGEFILE_SOURCE=1 and -D_FILE_OFFSET_BITS=64.
If you are using any other compiler, check for a similar compiler option.
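As a sketch of the same idea done in the source file instead of on the command line, the macro can be defined before the first system header is included. The helper name off_t_bits is hypothetical, and the compile-time check assumes a C11 compiler.

```c
/* Equivalent to passing -D_FILE_OFFSET_BITS=64; it only takes effect if it
   appears before the first system header is included. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif

#include <limits.h>
#include <sys/types.h>

/* C11 compile-time check: the build fails if off_t is narrower than 64 bits. */
_Static_assert(sizeof(off_t) * CHAR_BIT >= 64, "off_t is narrower than 64 bits");

/* Hypothetical helper: reports the width of off_t in bits. */
int off_t_bits(void)
{
    return (int)(sizeof(off_t) * CHAR_BIT);
}
```

If the static assertion fails on a given toolchain, that is a hint that the macro is either unsupported there or was defined too late in the translation unit.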
Upvotes: 0