Reputation: 804
How can I read the contents of a file if I have to use the following parameters:
Overall, I am trying to compute the MD5 value of these parts (you can also call them chunks).
The start-value and length of the chunks have been computed and stored in a file.
I tried to use fread() as follows, but it does not give me logical results:
char *chunk_buffer;
//chunk_buffer is a pointer to a memory block
while(cur_poly != NULL) {
//cur_poly is a structure which is used to store the start and length of chunks
chunk_buffer = (char*) malloc ((cur_poly->length)*8);
//here I am trying to allocate memory based on the size of each chunk
int x=fread (chunk_buffer,1, cur_poly->length, c_file);
//c_file is the file to be read according to the offsets
char hash[32];
hash=md5(chunk_buffer);
//md5() is a function which can generate the md5 hash values for the chunks
}
Upvotes: 1
Views: 5727
Reputation: 32502
I want to note some more issues with that code. You might need to add some more details on these points.
If you want to read consecutive chunks from your file, you usually don't need to reposition the file at all: just read a chunk, and then read the next one. If you need to read the chunks in random order, you need to use fseek, which sets the start position of the next file operation to an offset measured from the beginning of the file, from its end, or from the current position.
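As a rough sketch of the random-order case, assuming offsets are counted in bytes from the start of the file: read_chunk_at is a made-up helper name, not something from the question's code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads `length` bytes starting at byte `offset`
 * into `buf`. Returns the number of bytes actually read, which is
 * smaller than `length` at end of file or on error. */
size_t read_chunk_at(FILE *fp, long offset, size_t length, char *buf)
{
    assert(fp != NULL && buf != NULL);

    /* SEEK_SET measures the offset from the beginning of the file. */
    if (fseek(fp, offset, SEEK_SET) != 0)
        return 0;

    return fread(buf, 1, length, fp);
}
```

The caller should compare the return value with the requested length before hashing, since a short read leaves the rest of the buffer untouched.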
You have a char pointer chunk_buffer that you evidently use to store the data from your file temporarily. That is, it's only valid for the current loop iteration.
If this is the case, I would suggest doing the malloc once before you enter the loop:
char * chunk_buffer = malloc (MAXIMUM_CHUNK_SIZE);
In the loop you may clear this buffer using memset or just overwrite the data. Also note that malloc()ed memory is not initialized with '\0' values (I don't know if this is one assumption you rely on ...).
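One way the allocate-once idea could look, as a sketch only: struct chunk is a guessed stand-in for whatever cur_poly points to (the field names start and next are assumptions), and the buffer is sized to the largest chunk in the list.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the list node behind cur_poly. */
struct chunk {
    long start;          /* byte offset of the chunk in the file */
    size_t length;       /* chunk length in bytes */
    struct chunk *next;
};

/* Returns the largest chunk length in the list (0 for an empty list). */
size_t max_chunk_length(const struct chunk *head)
{
    size_t max = 0;
    for (const struct chunk *c = head; c != NULL; c = c->next)
        if (c->length > max)
            max = c->length;
    return max;
}

/* Reads every chunk into one reusable buffer. Returns the number of
 * chunks read completely, or -1 if the buffer cannot be allocated. */
int read_all_chunks(FILE *fp, const struct chunk *head)
{
    assert(fp != NULL);

    size_t cap = max_chunk_length(head);
    char *buf = malloc(cap > 0 ? cap : 1);   /* allocated once */
    if (buf == NULL)
        return -1;

    int complete = 0;
    for (const struct chunk *c = head; c != NULL; c = c->next) {
        if (fseek(fp, c->start, SEEK_SET) != 0)
            break;
        if (fread(buf, 1, c->length, fp) != c->length)
            break;
        /* buf[0 .. c->length - 1] is valid here; hash it at this point. */
        complete++;
    }

    free(buf);                               /* released once */
    return complete;
}
```

Only the first c->length bytes of the buffer are meaningful in each iteration; anything after that is left over from an earlier, longer chunk.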
I am not sure why you allocate a buffer of size length*8 and then read only length bytes into it. Probably
int x = fread (chunk_buffer, SIZE_OF_ITEM, THIS_CHUNK_SIZE, c_file);
would fit your needs better if your items are indeed larger than a byte.
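To make the difference between the two size arguments concrete, here is a small sketch; read_items is a hypothetical wrapper added only for illustration. Note that fread counts complete items, not bytes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper: reads up to `count` items of `item_size` bytes
 * each and returns the number of COMPLETE items read (not bytes). */
size_t read_items(FILE *fp, void *buf, size_t item_size, size_t count)
{
    assert(fp != NULL && buf != NULL);
    return fread(buf, item_size, count, fp);
}
```

With a 20-byte file, asking for three 8-byte items yields 2, while asking for 24 one-byte items yields 20, so the item size has to match the unit in which the chunk length is stored.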
It is unclear what the md5() function actually does. What value does it return? A pointer to a buffer that is allocated dynamically? A pointer to a local array? Either way, you assign the return value to a local array of chars, which C does not allow. You might not need to allocate 32 bytes for this, but just
char * hash = md5 (chunk_buffer);
Make sure that you keep the pointer to that array somewhere you can find it when the loop moves on to the next iteration. A non-static array that is local to that function can, of course, not be passed back this way.
Your md5() function: how does it know what the size of a chunk is? It is passed a pointer, but not the size of the valid data (as far as I can see). You might need to adapt this function to take the length of the input array as an additional parameter.
What does the md5() function produce: a C-style string (hexadecimal digits, null-terminated) or an array of byte-sized unsigned integers (uint8_t)?
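The distinction matters for the buffer size: an MD5 digest is 16 raw bytes, and its usual text form is 32 hex digits plus a terminating '\0', so a char[32] is too small for the string. A sketch of the conversion, assuming md5() yields the raw bytes (digest_to_hex and MD5_DIGEST_BYTES are names invented here):

```c
#include <assert.h>
#include <stdio.h>

#define MD5_DIGEST_BYTES 16   /* an MD5 digest is 128 bits */

/* Writes the digest as 32 lowercase hex digits plus a terminating '\0',
 * so `hex` must have room for at least 33 chars. */
void digest_to_hex(const unsigned char digest[MD5_DIGEST_BYTES], char hex[33])
{
    assert(digest != NULL && hex != NULL);

    /* Each call writes two hex digits and a '\0' that the next call
     * overwrites; the last '\0' lands at hex[32]. */
    for (int i = 0; i < MD5_DIGEST_BYTES; i++)
        snprintf(hex + 2 * i, 3, "%02x", digest[i]);
}
```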
Make sure that you free() the memory you allocate dynamically. If you want to keep the malloc() inside the loop, make sure the loop always ends with
free (chunk_buffer);
For us to help you any further, you need to define a) what logical results are for you and b) what results you actually get.
Upvotes: 1
Reputation: 35530
I see two potential issues.
What units does cur_poly->length represent? You are mallocing memory as if it is a count of 64-bit words, yet reading the file as if it is bytes. If the field represents a length in bytes, then you are reading correctly, but allocating too much memory. However, if the field is a length in 64-bit words, then you are allocating the right amount of memory, but reading only one eighth of the data.
The code seems to be ignoring offsets (or assuming all chunks must be contiguous). If you want to read from an arbitrary offset, do a fseek(fp, offset, SEEK_SET); before the fread.
If the chunks are supposed to be contiguous, there may still be padding at the ends to force them all to start on an even boundary. You would have to seek over the padding whenever the byte count is odd (.WAV files do this, for example).
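A minimal sketch of that padding case, assuming a single pad byte after every odd-length chunk as in RIFF-style files; whether your file format actually pads this way is something you would have to check, and read_padded_chunk is a made-up name.

```c
#include <assert.h>
#include <stdio.h>

/* Reads `length` bytes at the current position, then skips one pad byte
 * when `length` is odd so that the next chunk starts on an even offset.
 * Returns the number of bytes read. */
size_t read_padded_chunk(FILE *fp, char *buf, size_t length)
{
    assert(fp != NULL && buf != NULL);

    size_t got = fread(buf, 1, length, fp);
    if (got == length && (length % 2) != 0)
        fseek(fp, 1L, SEEK_CUR);   /* step over the pad byte */
    return got;
}
```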
Upvotes: 3