Reputation: 295
I am looking for some platform-independent C code to determine the size of a given file. First, I read the following answer: https://stackoverflow.com/a/238607
The answer uses fseek with SEEK_END and ftell. Now, my problem is that I found the following C standard quotes.
Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the initial shift state.
and
A binary stream need not meaningfully support fseek calls with a whence value of SEEK_END.
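For reference, the idiom the linked answer relies on looks roughly like this (my own sketch, not the answer's exact code); it works on the common platforms, but it is exactly what the quotes above decline to guarantee:

```c
#include <stdio.h>

/* Sketch of the fseek/ftell idiom from the linked answer.
   Returns the size in bytes, or -1L on failure. Per the standard
   quotes above, this is not guaranteed for binary streams. */
long file_size_via_fseek(const char *path) {
    FILE *fp = fopen(path, "rb");
    long size = -1L;
    if (!fp)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) == 0)
        size = ftell(fp);  /* may not equal the byte count everywhere */
    fclose(fp);
    return size;
}
```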
So, it looks like I have a problem. Possibly, the following code, which counts the number of read bytes, is a workaround.
file = fopen(file_path, "rb");
/* ... */
while (EOF != fgetc(file)) {
    ret = size_t_inc(&file_size_); /* essentially, this does ++file_size_ */
    /* ... */
}
ret = feof(file);
/* ... */
if (!ret) {
    return 1; /* Error! */
}
(entire function here: https://github.com/630R6/bytelev/blob/8e3d0dd14042f16086f3ca4e9a33d49a0629630e/main.c#L138)
Still, I am looking for a better solution.
Thanks so much for your time!
Upvotes: 4
Views: 938
Reputation: 118118
First off, let me point out that the question, as stated, is a bit of a fool's errand: on any platform that actually has files, there is a well-defined, efficient way to query a file's size. Given the limited number of environments one generally has to cater to, abstracting the functionality behind simple platform-specific functions is the way to go.
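As a hedged sketch of that platform-specific route (the function name and the two-branch split are my choice, not something from the question):

```c
#include <stdint.h>

#if defined(_WIN32)
#include <windows.h>
/* Query the size via file attributes; returns 0 on success, -1 on failure. */
int platform_file_size(const char *path, uint64_t *size) {
    WIN32_FILE_ATTRIBUTE_DATA fad;
    if (!GetFileAttributesExA(path, GetFileExInfoStandard, &fad))
        return -1;
    *size = ((uint64_t) fad.nFileSizeHigh << 32) | fad.nFileSizeLow;
    return 0;
}
#else /* assume POSIX */
#include <sys/stat.h>
/* Query the size via stat(); returns 0 on success, -1 on failure. */
int platform_file_size(const char *path, uint64_t *size) {
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    *size = (uint64_t) st.st_size;
    return 0;
}
#endif
```

Both branches read only metadata, so the cost is independent of the file's size.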
So, basically, reading the entire file and counting bytes along the way is the only fully portable approach, as long as the "file" is of finite size.
One can tweak around the edges a bit: for a 10 GB file, do we really want 10 billion fgetc calls?
For example, below I use fread to read the file in chunks of up to 64 KiB:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

static const size_t CHUNK_SIZE = 64 * 1024;

int
calc_file_size_slowly(const char *const filename, size_t *size) {
    FILE *fp;
    int ret = -1;
    size_t bytes_read = 0;
    unsigned char *buffer = malloc(CHUNK_SIZE);

    if (!buffer) {
        return ENOMEM;
    }
    errno = 0;
    fp = fopen(filename, "rb");
    if (!fp) {
        goto FAIL_FOPEN;
    }
    errno = 0;
    *size = 0;
    while ((bytes_read = fread(buffer, 1, CHUNK_SIZE, fp)) > 0) {
        if (ferror(fp)) {
            goto FAIL_FERROR;
        }
        printf("Bytes read=%zu\n", bytes_read);
        if ((*size + bytes_read) > *size) {
            (*size) += bytes_read;
        }
        else {
            goto FAIL_OVERFLOW;
        }
        errno = 0;
    }
    if (feof(fp)) {
        ret = 0;
        fclose(fp); /* close on the success path, too */
        goto DONE;
    }
    /* fread returned 0 without setting the EOF flag: report an error. */
FAIL_FERROR:
    ret = errno ? errno : -1;
    fclose(fp);
    goto DONE;
FAIL_OVERFLOW:
    ret = EOVERFLOW;
    fclose(fp);
    goto DONE;
FAIL_FOPEN:
    ret = errno;
DONE:
    free(buffer);
    return ret;
}

int main(int argc, char *argv[]) {
    int i;
    for (i = 1; i < argc; i += 1) {
        size_t size;
        if (calc_file_size_slowly(argv[i], &size) == 0) {
            printf("%s: %zu bytes\n", argv[i], size);
        }
    }
}
Output:
C:\...\Temp> dir wpi.msi
...
2015-05-15  12:12 PM         1,859,584 wpi.msi
...
C:\...\Temp> mysizer wpi.msi
Bytes read=65536
... (the same line, 27 more times) ...
Bytes read=24576
wpi.msi: 1859584 bytes
Upvotes: 1