Gary
Gary

Reputation: 11

Storing a string in char* in C

In the code below, I hope you can see that I have a char* variable and that I want to read in a string from a file. I then want to pass this string back from the function. I'm rather confused by pointers so I'm not too sure what I'm supposed to do really.

The purpose of this is to then pass the array to another function to be searched for a name.

Unfortunately the program crashes as a result and I've no idea why.

char* ObtainName(FILE *fp)
{
    char* temp;
    int i = 0;

    temp = fgetc(fp);
    while(temp != '\n')
    {
        temp = fgetc(fp);
        i++;
    }
    printf("%s", temp);
    return temp;
}

Any help would be vastly appreciated.

Upvotes: 0

Views: 1202

Answers (4)

John Bode
John Bode

Reputation: 123458

So, there are several problems here:

  1. You're not setting aside any storage for the string contents;
  2. You're not storing the string contents correctly;
  3. You're attempting to read memory that doesn't belong to you;
  4. The way you're attempting to return the string is going to give you heartburn.

1. You're not setting aside storage for the string contents

The line

char *temp;

declares temp as a pointer to char; its value will be the address of a single character value. Since it's declared at local scope without the static keyword, its initial value will be indeterminate, and that value may not correspond to a valid memory address.

It does not set aside any storage for the string contents read from fp; that would have to be done as a separate step, which I'll get to below.

2. You're not storing the string contents correctly

The line

temp = fgetc(fp);

reads the next character from fp and assigns it to temp. First of all, this means you're only storing the last character read from the stream, not the whole string. Secondly, and more importantly, you're assigning the result of fgetc() (which returns a value of type int) to an object of type char * (which is treated as an address). You're basically saying "I want to treat the letter 'a' as an address into memory." This brings us to...

3. You're attempting to read memory that doesn't belong to you

In the line

printf("%s", temp);

you're attempting to print out the string beginning at the address stored in temp. Since the last thing you wrote to temp was most likely a character whose value is < 127, you're telling printf to start at a very low and most likely not accessible address, hence the crash.

4. The way you're attempting to return the string is guaranteed to give you heartburn

Since you've defined the function to return a char *, you're going to need to do one of the following:

  • Allocate memory dynamically to store the string contents, and then pass the responsibility of freeing that memory on to the function calling this one;
  • Declare an array with the static keyword so that the array doesn't "go away" after the function exits; however, this approach has serious drawbacks;
  • Change the function definition;

Allocate memory dynamically

You could use dynamic memory allocation routines to set aside a region of storage for the string contents, like so:

char *temp = malloc( MAX_STRING_LENGTH * sizeof *temp );

or

char *temp = calloc( MAX_STRING_LENGTH, sizeof *temp );

and then return temp as you've written.

Both malloc and calloc set aside the number of bytes you specify; calloc will initialize all those bytes to 0, which takes a little more time, but can save your bacon, especially when dealing with text.

The problem is that somebody has to deallocate this memory when its no longer needed; since you return the pointer, whoever calls this function now has the responsibility to call free() when it's done with that string, something like:

void Caller( FILE *fp )
{
  ...
  char *name = ObtainName( fo );
  ...
  free( name );
  ...
}

This spreads the responsibility for memory management around the program, increasing the chances that somebody will forget to release that memory, leading to memory leaks. Ideally, you'd like to have the same function that allocates the memory free it.

Use a static array

You could declare temp as an array of char and use the static keyword:

static char temp[MAX_STRING_SIZE];

This will set aside MAX_STRING_SIZE characters in the array when the program starts up, and it will be preserved between calls to ObtainName. No need to call free when you're done.

The problem with this approach is that by creating a static buffer, the code is not re-entrant; if ObtainName called another function which in turn called ObtainName again, that new call will clobber whatever was in the buffer before.

Why not just declare temp as

char temp[MAX_STRING_SIZE];

without the static keyword? The problem is that when ObtainName exits, the temp array ceases to exist (or rather, the memory it was using is available for someone else to use). That pointer you return is no longer valid, and the contents of the array may be overwritten before you can access it again.

Change the function definition

Ideally, you'd like for ObtainName to not have to worry about the memory it has to write to. The best way to achieve that is for the caller to pass target buffer as a parameter, along with the buffer's size:

int ObtainName( FILE *fp, char *buffer, size_t bufferSize )
{
  ...
}

This way, ObtainName writes data into the location that the caller specifies (useful if you want to obtain multiple names for different purposes). The function will return an integer value, which can be a simple success or failure, or an error code indicating why the function failed, etc.

Note that if you're reading text, you don't have to read character by character; you can use functions like fgets() or fscanf() to read an entire string at a time.

Use fscanf if you want to read whitespace-delimited strings (i.e., if the input file contains "This is a test", fscanf( fp, "%s", temp); will only read "This"). If you want to read an entire line (delimited by a newline character), use fgets().

Assuming you want to read an individual string at a time, you'd use something like the following (assumes C99):

#define FMT_SIZE 20
...
int ObtainName( FILE *fp, char *buffer, size_t bufsize )
{
  int result = 1;  // assume success
  int scanfResult = 0;

  char fmt[FMT_SIZE];    
  sprintf( fmt, "%%%zus", bufsize - 1 ); 

  scanfResult = fscanf( fp, fmt, buffer );
  if ( scanfResult == EOF )
  {
    // hit end-of-file before reading any text
    result = 0;
  }
  else if ( scanfResult == 0 )
  {
    // did not read anything from input stream
    result = 0;
  }
  else
  {
    result = 1;
  }

  return result;
}

So what's this noise

char fmt[FMT_SIZE];    
sprintf( fmt, "%%%zus", bufsize - 1 ); 

about? There is a very nasty security hole in fscanf() when you use the %s or %[ conversion specifiers without a maximum length specifier. The %s conversion specifier tells fscanf to read characters until it sees a whitespace character; if there are more non-whitespace characters in the stream than the buffer is sized to hold, fscanf will store those extra characters past the end of the buffer, clobbering whatever memory is following it. This is a common malware exploit. So we want to specify a maximum length for the input; for example, %20s says to read no more than 20 characters from the stream and store them to the buffer.

Unfortunately, since the buffer length is passed in as an argument, we can't write something like %20s, and fscanf doesn't give us a way to specify the length as an argument the way fprintf does. So we have to create a separate format string, which we store in fmt. If the input buffer length is 10, then the format string will be %10s. If the input buffer length is 1000, then the format string will be %1000s.

Upvotes: 1

Brad Budlong
Brad Budlong

Reputation: 1785

I've taken some liberties with what you wanted to accomplish. Rather that deal with pointers, you can just use a fixed sized array as long as you can set a maximum length. I've also included several checks so that you don't run off the end of the buffer or the end of the file. Also important is to make sure that you have a null termination '\0' at the end of the string.

#define MAX_LEN 100
char* ObtainName(FILE *fp)
{
    static char temp[MAX_LEN];
    int i = 0;

    while(i < MAX_LEN-1)
    {
        if (feof(fp)) 
        {
            break;
        }
        temp[i] = fgetc(fp);
        if (temp[i] == '\n')
        {
            break;
        }
        i++;
    }
    temp[i] = '\0';
    printf("%s", temp);
    return temp;
}

Upvotes: 1

Mahonri Moriancumer
Mahonri Moriancumer

Reputation: 6003

The following code expands on that in your question, and returns the string in allocated storage:

char* ObtainName(FILE *fp)
   {
   int temp;
   int i = 1;
   char *string = malloc(i);
   if(NULL == string)
      {
      fprintf(stderr, "malloc() failed\n");
      goto CLEANUP;
      }
   *string = '\0';       

   temp = fgetc(fp);
   while(temp != '\n')
      {
      char *newMem;
      ++i;

      newMem=realloc(string, i);
      if(NULL==newMem)
         {
         fprintf(stderr, "realloc() failed.\n");
         goto CLEANUP;
         }
      string=newMem;
      string[i-1] = temp;
      string[i] = '\0';

      temp = fgetc(fp);
      }

CLEANUP:

   printf("%s", string);
   return(string);
   }

Take care to 'free()' the string returned by this function, or a memory leak will occur.

Upvotes: 0

Ed Swangren
Ed Swangren

Reputation: 124642

fgetc returns an int, not a char*. This int is a character from the stream, or EOF if you reach the end of the file.

You're implicitly casting the int to a char*, i.e., interpreting it as an address (turn your warnings on.) When you call printf it reads that address and continues to read a character at a time looking for the null terminator which ends the string, but that address is almost certainly invalid. This is undefined behavior.

Upvotes: 1

Related Questions