Chris Bunch
Chris Bunch

Reputation: 89823

How to read the content of a file to a string in C?

What is the simplest way (least error-prone, least lines of code, however you want to interpret it) to open a file in C and read its contents into a string (char*, char[], whatever)?

Upvotes: 149

Views: 236050

Answers (13)

user15651507
user15651507

Reputation: 7

I just ran a bunch of tests comparing using seek, lseek, stat, and fstat also comparing using file streams and file descriptors to see what seems to be the fastest. For the test I create a 100M file.

TL;DR - using file descriptors, fstat and read was the fastest and using file streams and seek was the slowest.

For the test I ran this on a small Linux box I have running a headless ArchLinux server. I ran the test: checking the file size, malloc a buffer, read the entire file into the buffer, close the file, free the buffer.

I ran the test 3 times with 1000 cycles each time and using clock_gettime to calculate the elapsed time.

Just simply comparing JUST the time it takes to get the file size using stat or fstat were at least 30% faster than using seek or lseek.

Comparing just the speed of using file streams vs file descriptors, they were pretty nearly the same - descriptors were about 1-3% faster.

In comparing getting the file size, opening the file, malloc a buffer, read the entire 100M, close the file and free the buffer -- using file descriptors and fstat were 6-8% faster than using seek or lseek. Probably because the bulk of the time is spent in the file read vs the getting the file size, which dilutes the overall performance benefit.

BTW - do not use fgetc and read the file 1 character at a time. This is crazy inefficient and really really slow! Like 1700% slower!!!!

Upvotes: -2

dmityugov
dmityugov

Reputation: 4478

If "read its contents into a string" means that the file does not contain characters with code 0, you can also use getdelim() function, that either accepts a block of memory and reallocates it if necessary, or just allocates the entire buffer for you, and reads the file into it until it encounters a specified delimiter or end of file. Just pass '\0' as the delimiter to read the entire file.

This function is available in the GNU C Library.

The sample code might look as simple as:

char* buffer = NULL;
size_t len;
ssize_t bytes_read = getdelim(&buffer, &len, '\0', fp);

if (bytes_read != -1) {
    /* Success, now the entire file is in the buffer */

Upvotes: 23

SleepyCal
SleepyCal

Reputation: 5993

If you're using glib, then you can use g_file_get_contents;

gchar *contents;
GError *err = NULL;

g_file_get_contents ("foo.txt", &contents, NULL, &err);
g_assert ((contents == NULL && err != NULL) || (contents != NULL && err == NULL));
if (err != NULL)
  {
    // Report error to user, and free error
    g_assert (contents == NULL);
    fprintf (stderr, "Unable to read file: %s\n", err->message);
    g_error_free (err);
  }
else
  {
    // Use file contents
    g_assert (contents != NULL);
  }
}

Upvotes: 3

chux
chux

Reputation: 153456

What is the simplest way (least error-prone, least lines of code, however you want to interpret it) to open a file in C and read its contents into a string ...?

Sadly, even after years, answers are error prone and many lack proper string formation and error checking.

#include <stdio.h>
#include <stdlib.h>

// Read the file into allocated memory.
// Return NULL on error.
char* readfile(FILE *f) {
  // f invalid? fseek() fail?
  if (f == NULL || fseek(f, 0, SEEK_END)) {
    return NULL;
  }

  long length = ftell(f);
  rewind(f);
  // Did ftell() fail?  Is the length too long?
  if (length == -1 || (unsigned long) length >= SIZE_MAX) {
    return NULL;
  }

  // Convert from long to size_t
  size_t ulength = (size_t) length;
  char *buffer = malloc(ulength + 1);
  // Allocation failed? Read incomplete?
  if (buffer == NULL || fread(buffer, 1, ulength, f) != ulength) {
    free(buffer);
    return NULL;
  }
  buffer[ulength] = '\0'; // Now buffer points to a string

  return buffer;
}

Note that if the text file contains null characters, the allocated data will contain all the file data, yet the string will appear to be short. Better code would also return the length information so the caller can handle that.

char* readfile(FILE *f, size_t *ulength_ptr) {
  ...
  if (ulength_ptr) *ulength_ptr == *ulength;
  ...
} 

As the string is allocated, be sure to free the returned pointer when done.

Upvotes: 6

Joe Cool
Joe Cool

Reputation: 225

Note: This is a modification of the accepted answer above.

Here's a way to do it, complete with error checking.

I've added a size checker to quit when file was bigger than 1 GiB. I did this because the program puts the whole file into a string which may use too much ram and crash a computer. However, if you don't care about that you could just remove it from the code.

#include <stdio.h>
#include <stdlib.h>

#define FILE_OK 0
#define FILE_NOT_EXIST 1
#define FILE_TOO_LARGE 2
#define FILE_READ_ERROR 3

char * c_read_file(const char * f_name, int * err, size_t * f_size) {
    char * buffer;
    size_t length;
    FILE * f = fopen(f_name, "rb");
    size_t read_length;
    
    if (f) {
        fseek(f, 0, SEEK_END);
        length = ftell(f);
        fseek(f, 0, SEEK_SET);
        
        // 1 GiB; best not to load a whole large file in one string
        if (length > 1073741824) {
            *err = FILE_TOO_LARGE;
            
            return NULL;
        }
        
        buffer = (char *)malloc(length + 1);
        
        if (length) {
            read_length = fread(buffer, 1, length, f);
            
            if (length != read_length) {
                 free(buffer);
                 *err = FILE_READ_ERROR;

                 return NULL;
            }
        }
        
        fclose(f);
        
        *err = FILE_OK;
        buffer[length] = '\0';
        *f_size = length;
    }
    else {
        *err = FILE_NOT_EXIST;
        
        return NULL;
    }
    
    return buffer;
}

And to check for errors:

int err;
size_t f_size;
char * f_data;

f_data = c_read_file("test.txt", &err, &f_size);

if (err) {
    // process error
}
else {
    // process data
    free(f_data);
}

Upvotes: 6

easy and neat(assuming contents in the file are less than 10000):

void read_whole_file(char fileName[1000], char buffer[10000])
{
    FILE * file = fopen(fileName, "r");
    if(file == NULL)
    {
        puts("File not found");
        exit(1);
    }
    char  c;
    int idx=0;
    while (fscanf(file , "%c" ,&c) == 1)
    {
        buffer[idx] = c;
        idx++;
    }
    buffer[idx] = 0;
}

Upvotes: -3

Erik Campobadal
Erik Campobadal

Reputation: 917

I will add my own version, based on the answers here, just for reference. My code takes into consideration sizeof(char) and adds a few comments to it.

// Open the file in read mode.
FILE *file = fopen(file_name, "r");
// Check if there was an error.
if (file == NULL) {
    fprintf(stderr, "Error: Can't open file '%s'.", file_name);
    exit(EXIT_FAILURE);
}
// Get the file length
fseek(file, 0, SEEK_END);
long length = ftell(file);
fseek(file, 0, SEEK_SET);
// Create the string for the file contents.
char *buffer = malloc(sizeof(char) * (length + 1));
buffer[length] = '\0';
// Set the contents of the string.
fread(buffer, sizeof(char), length, file);
// Close the file.
fclose(file);
// Do something with the data.
// ...
// Free the allocated string space.
free(buffer);

Upvotes: 0

Jeff Mc
Jeff Mc

Reputation: 3793

Another, unfortunately highly OS-dependent, solution is memory mapping the file. The benefits generally include performance of the read, and reduced memory use as the applications view and operating systems file cache can actually share the physical memory.

POSIX code would look like this:

int fd = open("filename", O_RDONLY);
int len = lseek(fd, 0, SEEK_END);
void *data = mmap(0, len, PROT_READ, MAP_PRIVATE, fd, 0);

Windows on the other hand is little more tricky, and unfortunately I don't have a compiler in front of me to test, but the functionality is provided by CreateFileMapping() and MapViewOfFile().

Upvotes: 43

BaiJiFeiLong
BaiJiFeiLong

Reputation: 4615

Just modified from the accepted answer above.

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

char *readFile(char *filename) {
    FILE *f = fopen(filename, "rt");
    assert(f);
    fseek(f, 0, SEEK_END);
    long length = ftell(f);
    fseek(f, 0, SEEK_SET);
    char *buffer = (char *) malloc(length + 1);
    buffer[length] = '\0';
    fread(buffer, 1, length, f);
    fclose(f);
    return buffer;
}

int main() {
    char *content = readFile("../hello.txt");
    printf("%s", content);
}

Upvotes: 3

Entalpi
Entalpi

Reputation: 2033

// Assumes the file exists and will seg. fault otherwise.
const GLchar *load_shader_source(char *filename) {
  FILE *file = fopen(filename, "r");             // open 
  fseek(file, 0L, SEEK_END);                     // find the end
  size_t size = ftell(file);                     // get the size in bytes
  GLchar *shaderSource = calloc(1, size);        // allocate enough bytes
  rewind(file);                                  // go back to file beginning
  fread(shaderSource, size, sizeof(char), file); // read each char into ourblock
  fclose(file);                                  // close the stream
  return shaderSource;
}

This is a pretty crude solution because nothing is checked against null.

Upvotes: 1

Jake
Jake

Reputation: 2236

If you are reading special files like stdin or a pipe, you are not going to be able to use fstat to get the file size beforehand. Also, if you are reading a binary file fgets is going to lose the string size information because of embedded '\0' characters. Best way to read a file then is to use read and realloc:

#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>

int main () {
    char buf[4096];
    ssize_t n;
    char *str = NULL;
    size_t len = 0;
    while (n = read(STDIN_FILENO, buf, sizeof buf)) {
        if (n < 0) {
            if (errno == EAGAIN)
                continue;
            perror("read");
            break;
        }
        str = realloc(str, len + n + 1);
        memcpy(str + len, buf, n);
        len += n;
        str[len] = '\0';
    }
    printf("%.*s\n", len, str);
    return 0;
}

Upvotes: 10

selwyn
selwyn

Reputation: 1204

If the file is text, and you want to get the text line by line, the easiest way is to use fgets().

char buffer[100];
FILE *fp = fopen("filename", "r");                 // do not use "rb"
while (fgets(buffer, sizeof(buffer), fp)) {
... do something
}
fclose(fp);

Upvotes: 4

Nils Pipenbrinck
Nils Pipenbrinck

Reputation: 86343

I tend to just load the entire buffer as a raw memory chunk into memory and do the parsing on my own. That way I have best control over what the standard lib does on multiple platforms.

This is a stub I use for this. you may also want to check the error-codes for fseek, ftell and fread. (omitted for clarity).

char * buffer = 0;
long length;
FILE * f = fopen (filename, "rb");

if (f)
{
  fseek (f, 0, SEEK_END);
  length = ftell (f);
  fseek (f, 0, SEEK_SET);
  buffer = malloc (length);
  if (buffer)
  {
    fread (buffer, 1, length, f);
  }
  fclose (f);
}

if (buffer)
{
  // start to process your data / extract strings here...
}

Upvotes: 201

Related Questions