Reputation: 1904
I'm confused about how the OS actually "opens" a file in C when you call fopen
in the code. To elaborate, suppose I had 100 binary files (say, 1 MB each)
which I open in C:
FILE **fptr;
fptr = calloc(100, sizeof(FILE *));
for (ii = 0; ii < 100; ii++)
    fptr[ii] = fopen(filename[ii], "rb+");
Assume that filename and ii were already defined appropriately.
Will the OS load 100 MB into memory, or does the above code just tell the program to keep these files ready for access?
Upvotes: 1
Views: 1614
Reputation: 3767
There are a few things we need to look at first.
First, FILE itself: it is a typedef, declared as below.
stdio.h:
typedef struct _IO_FILE FILE;
Next, _IO_FILE, which looks like this:
libio.h:
struct _IO_FILE {
int _flags; /* High-order word is _IO_MAGIC; rest is flags. */
#define _IO_file_flags _flags
/* The following pointers correspond to the C++ streambuf protocol. */
/* Note: Tk uses the _IO_read_ptr and _IO_read_end fields directly. */
char* _IO_read_ptr; /* Current read pointer */
char* _IO_read_end; /* End of get area. */
char* _IO_read_base; /* Start of putback+get area. */
char* _IO_write_base; /* Start of put area. */
char* _IO_write_ptr; /* Current put pointer. */
char* _IO_write_end; /* End of put area. */
char* _IO_buf_base; /* Start of reserve area. */
char* _IO_buf_end; /* End of reserve area. */
/* The following fields are used to support backing up and undo. */
char *_IO_save_base; /* Pointer to start of non-current get area. */
char *_IO_backup_base; /* Pointer to first valid character of backup area */
char *_IO_save_end; /* Pointer to end of non-current get area. */
struct _IO_marker *_markers;
struct _IO_FILE *_chain;
int _fileno;
#if 0
int _blksize;
#else
int _flags2;
#endif
_IO_off_t _old_offset; /* This used to be _offset but it's too small. */
#define __HAVE_COLUMN /* temporary */
/* 1+column number of pbase(); 0 is unknown. */
unsigned short _cur_column;
signed char _vtable_offset;
char _shortbuf[1];
/* char* _save_gptr; char* _save_egptr; */
_IO_lock_t *_lock;
#ifdef _IO_USE_OLD_IO_FILE
};
The chain of internal pointers goes on. However, this structure by no means holds the file's data itself; it maintains bookkeeping state (buffer pointers, flags, the file descriptor) and presents it to user space. When the user requests data from the file, the same FILE * is used, but the data is retrieved by a system call (read, in the Linux case). The current offset is also tracked here, so the stream can move forward and backward. Most importantly, this provides a very good abstraction over the underlying system calls.
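To make the "FILE is bookkeeping, not data" point concrete, here is a minimal sketch assuming a POSIX system (fileno() is POSIX, not ISO C). The helper name inspect_stream is hypothetical. It reports the descriptor stored behind the stream and the position the stream tracks after a small fread().

```c
#define _POSIX_C_SOURCE 200809L   /* assumption: POSIX system, needed for fileno() */
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: opens `path`, reads at most `n` bytes (capped at 64),
 * and reports the descriptor behind the stream plus the stream position.
 * Returns 0 on success, -1 if the file cannot be opened. */
int inspect_stream(const char *path, size_t n, int *fd_out, long *pos_out)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    *fd_out = fileno(fp);               /* descriptor kept inside the FILE object */

    char buf[64];
    if (n > sizeof buf)
        n = sizeof buf;
    size_t got = fread(buf, 1, n, fp);  /* the library fetches the data from the OS */

    *pos_out = ftell(fp);               /* offset is tracked by the stream */
    assert(*pos_out == (long)got);      /* binary stream opened at offset 0 */

    fclose(fp);
    return 0;
}
```

Calling inspect_stream("some.bin", 5, &fd, &pos) on an existing file of at least 5 bytes should leave a non-negative descriptor in fd and 5 in pos; "some.bin" is a placeholder name.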
Upvotes: 0
Reputation: 6606
The file is not loaded into memory upon opening it. Instead, parts are loaded in for each read. A call to fopen should not cause the file content to be read from the media; a later fread will cause a partial read (or a complete read for small files). The size of that partial read is usually decided by the buffering and caching layers (for example, a buffer or cache block at a time), not by the size of the file.
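One way to observe this, sketched under the assumption of a POSIX system: compare the offset of the underlying descriptor (via lseek()) with the position the stream reports. The helper name probe_read_sizes and the 4096-byte buffer are illustrative choices, not anything prescribed by the standard.

```c
#define _POSIX_C_SOURCE 200809L   /* assumption: POSIX system (fileno, lseek) */
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: records the descriptor offset right after fopen(),
 * the descriptor offset after reading a single byte, and the stream's own
 * position. Returns 0 on success, -1 on failure (including an empty file). */
int probe_read_sizes(const char *path, long *after_open,
                     long *after_getc, long *stream_pos)
{
    assert(path && after_open && after_getc && stream_pos);

    char buf[4096];                      /* illustrative buffer size */
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    if (setvbuf(fp, buf, _IOFBF, sizeof buf) != 0) {
        fclose(fp);
        return -1;
    }

    /* Nothing has been requested yet, so the descriptor should still be at 0. */
    *after_open = (long)lseek(fileno(fp), 0, SEEK_CUR);

    if (fgetc(fp) == EOF) {              /* ask for exactly one byte */
        fclose(fp);
        return -1;
    }

    /* The library may have fetched a whole buffer to satisfy that one byte. */
    *after_getc = (long)lseek(fileno(fp), 0, SEEK_CUR);
    *stream_pos = ftell(fp);             /* logical position: one byte consumed */

    fclose(fp);                          /* buf must outlive the stream; it does */
    return 0;
}
```

On a file larger than the buffer, after_open is expected to be 0 and stream_pos to be 1, while after_getc will often be a buffer-sized value rather than 1. The exact figure depends on the C library.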
Upvotes: 1
Reputation: 399753
The latter: no data is read from the file until needed, i.e. when you call fread() or some other I/O function.
Of course the underlying operating system might decide to speculatively read data when the file is opened, to save time later, but that's outside your control so in effect it doesn't matter. I mean that it doesn't matter because any memory used by such speculative buffering will need to be immediately made available to applications on demand.
That said, no practical system will let fopen() spend the time needed to read 100 MB; that would be very bad engineering.
Also note that there might be limits on how many files a single process can open in parallel. 100 should be fine for most modern systems, though.
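A rough sketch of how one might check that limit and open the files defensively, assuming a POSIX system where getrlimit() with RLIMIT_NOFILE is available. The helper names open_file_limit and open_all are hypothetical; the loop mirrors the one in the question but checks every fopen() result.

```c
#define _POSIX_C_SOURCE 200809L   /* assumption: POSIX system (getrlimit) */
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Hypothetical helper: soft limit on open descriptors for this process,
 * or -1 if it cannot be queried. Very large limits are clamped to LONG_MAX. */
long open_file_limit(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_cur > (rlim_t)LONG_MAX)  /* also covers RLIM_INFINITY */
        return LONG_MAX;
    return (long)rl.rlim_cur;
}

/* Hypothetical helper: opens `count` files in "rb+" mode. On any failure it
 * closes the streams opened so far and returns NULL. */
FILE **open_all(const char *const *names, size_t count)
{
    assert(names != NULL);

    FILE **fptr = calloc(count, sizeof *fptr);
    if (fptr == NULL)
        return NULL;

    for (size_t ii = 0; ii < count; ii++) {
        fptr[ii] = fopen(names[ii], "rb+");
        if (fptr[ii] == NULL) {
            perror(names[ii]);           /* e.g. missing file or too many open files */
            while (ii > 0)
                fclose(fptr[--ii]);
            free(fptr);
            return NULL;
        }
    }
    return fptr;
}
```

The caller remains responsible for calling fclose() on each stream and free() on the array. ISO C also defines FOPEN_MAX in stdio.h as the minimum number of streams the implementation guarantees can be open simultaneously.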
Upvotes: 3
Reputation: 647
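It's implementation defined. The implementation is not required to read the entire file into memory on fopen(). Some implementations can use a form of mmap() to bring the data into memory (glibc, for example, accepts an "m" mode flag as an extension), although plain buffered reads are the common default.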
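For comparison, this is roughly what memory-mapped access looks like when done directly with POSIX calls. It is a sketch of the mmap() technique itself, not a claim about what any particular fopen() does internally, and the helper name first_byte_via_mmap is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* assumption: POSIX system (open, mmap) */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps the whole file read-only and returns its first
 * byte (0..255), or -1 on failure or if the file is empty. */
int first_byte_via_mmap(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror(path);
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    size_t len = (size_t)st.st_size;
    /* Creating the mapping does not by itself copy the file contents. */
    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                           /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;

    int c = p[0];                        /* touching the page makes the kernel supply it */

    munmap(p, len);
    return c;
}
```

Even with a mapping covering the full file, pages are generally supplied on first access rather than when the mapping is created, so the answer to the original question stays the same.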
Upvotes: -2