user1799795

Reputation: 279

How do I parse a string in C?

I am a beginner learning C; so, please go easy on me. :)

I am trying to write a very simple program that puts each word of a string into a "Hi (input)!" sentence (it assumes you type in names). Also, I am using arrays because I need to practice them.

My problem is that some garbage gets put into the arrays somewhere, and it messes up the program. I tried to figure out the problem, but to no avail, so it is time to ask for expert help. Where have I made mistakes?

p.s.: It also has an infinite loop somewhere, but it is probably the result of the garbage that is put into the array.

#include <stdio.h>
#define MAX 500 //Maximum Array size.

int main(int argc, const char * argv[])
{
    int stringArray [MAX];
    int wordArray [MAX];
    int counter = 0;
    int wordCounter = 0;

    printf("Please type in a list of names then hit ENTER:\n");  
    // Fill up the stringArray with user input.
    stringArray[counter] = getchar();
    while (stringArray[counter] != '\n') {
        stringArray[++counter] = getchar();
    }

    // Main function.
    counter = 0;
    while (stringArray[wordCounter] != '\n') {     
        // Puts first word into temporary wordArray.
        while ((stringArray[wordCounter] != ' ') && (stringArray[wordCounter] != '\n')) {
            wordArray[counter++] = stringArray[wordCounter++];
        }
        wordArray[counter] = '\0';

        //Prints out the content of wordArray.
        counter = 0;
        printf("Hi ");
        while (wordArray[counter] != '\0') {
            putchar(wordArray[counter]);
            counter++;
        }
        printf("!\n");

        //Clears temporary wordArray for new use.
        for (counter = 0; counter == MAX; counter++) {
            wordArray[counter] = '\0';
        } 
        wordCounter++;
        counter = 0; 
    }
    return 0;
}

Solved it! I needed to add the following if statement at the end, where I increment the wordCounter. :)

    if (stringArray[wordCounter] != '\n') {
            wordCounter++;
    }

Upvotes: 4

Views: 11409

Answers (3)

corlettk

Reputation: 13574

user1799795,

For what it's worth (now that you've solved your problem) I took the liberty of showing you how I'd do this given the restriction "use arrays", and explaining a bit about why I'd do it that way... Just beware that while I am an experienced programmer I'm no C guru... I've worked with guys who absolutely blew me into the C-weeds (pun intended).

#include <stdio.h>
#include <string.h>

#define LINE_SIZE 500
#define MAX_WORDS 50
#define WORD_SIZE 20

// Main function.
int main(int argc, const char * argv[])
{
    int counter = 0;

    // ----------------------------------
    // Read a line of input from the user (ie stdin)
    // ----------------------------------
    char line[LINE_SIZE];
    printf("Please type in a list of names then hit ENTER:\n");
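    if ( fgets(line, LINE_SIZE, stdin) == NULL ) {
        // fgets returns NULL at end-of-input or on error; asking again in a
        // loop would spin forever, so give up instead.
        fprintf(stderr, "You must enter something. Pretty please!\n");
        return 1;
    }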

    // A note on that LINE_SIZE parameter to the fgets function:
    // wherever possible it's a good idea to use the version of the standard
    // library function that allows you to specify the maximum length of the
    // string (or indeed any array) because that dramatically reduces the
    // incidence of "string overruns", which are a major source of bugs in C
    // programs.
    // Also note that fgets includes the end-of-line character/sequence in
    // the returned string, so you have to ensure there's room for it in the
    // destination string, and remember to handle it in your string processing.

    // -------------------------
    // split the line into words
    // -------------------------

    // the current word
    char word[WORD_SIZE];
    int wordLength = 0;

    // the list of words
    char words[MAX_WORDS][WORD_SIZE]; // an array of up to 50 words of
                                      // up to 20 characters each
    int wordCount = 0;                // the number of words in the array.


    // The loop syntax below is a bit cryptic.
    // The "char *c=line;" initialises the char-pointer "c" to the start of "line".
    // The " *c;" is ultra-shorthand for: "is the-char-at-c not equal to zero".
    //   All strings in C end with a "null terminator" character, which has the
    //   integer value of zero, and is commonly expressed as '\0' or 0 (not
    //   NULL, which is a macro for a null *pointer*). In the C language any
    //   integer may be evaluated as a boolean (true|false) expression, where 0
    //   is false, and (pretty obviously) everything-else is true. So: If the
    //   character at the address-c is not zero (the null terminator) then
    //   go-round the loop again. Capiche?
    // The "++c" moves the char-pointer to the next character in the line. I use
    // the pre-increment "++c" in preference to the more common post-increment
    // "c++" out of habit; for a plain pointer like this a modern compiler
    // generates the same code for both, so either is fine here.
    //
    // Note that this syntax is commonly used by "low level programmers" to loop
    // through strings. There is an alternative which is less cryptic and is
    // therefore preferred by most programmers, even though it's not quite as
    // efficient. In this case the loop would be:
    //    int lineLength = strlen(line);
    //    for ( int i=0; i<lineLength; ++i)
    // and then to get the current character
    //        char ch = line[i];
    // We get the length of the line once, because the strlen function has to
    // loop through the characters in the array looking for the null-terminator
    // character at its end (guess what its implementation looks like ;-)...
    // which is inherently an "expensive" operation (totally dependent on the
    // length of the string) so we at least avoid repeating this operation.
    //
    // I know I might sound like I'm banging on about not-very-much but once you
    // start dealing with "real world" magnitude datasets then such habits,
    // formed early on, pay huge dividends in the ability to write performant
    // code the first time round. Premature optimisation is evil, but my code
    // doesn't hardly ever NEED optimising, because it was "fairly efficient"
    // to start with. Yeah?

    for ( char *c=line; *c; ++c ) {    // foreach char in line.

        char ch = *c;  // "ch" is the character value-at the-char-pointer "c".

        if ( ch==' '               // if this char is a space,
          || ch=='\n'              // or we've reached the EOL char
        ) {
            // 1. add the word to the end of the words list (skipping empty
            //    words, and stopping once the list is full). Note that we
            //    copy only wordLength characters, because "word" has no
            //    null-terminator, and then terminate the copy ourselves so
            //    that printf("%s") knows where it ends.
            if ( wordLength>0 && wordCount<MAX_WORDS ) {
                strncpy(words[wordCount], word, wordLength);
                words[wordCount][wordLength] = '\0';
                ++wordCount;
            }
            // 2. and "clear" the word buffer.
            wordLength=0;
        } else if (wordLength==WORD_SIZE-1) { // this word is too long
            // so split this word into two words.
            if ( wordCount<MAX_WORDS ) {
                strncpy(words[wordCount], word, wordLength);
                words[wordCount][wordLength] = '\0';
                ++wordCount;
            }
            wordLength=0;
            word[wordLength++] = ch;
        } else {
            // otherwise: append this character to the end of the word.
            word[wordLength++] = ch;
        }
    }

    // -------------------------
    // print out the words
    // -------------------------

    for ( int w=0; w<wordCount; ++w ) {
        printf("Hi %s!\n", words[w]);
    }
    return 0;
}

In the real world one can't make such restrictive assumptions about the maximum length of words, or how many there will be, and if such restrictions are given they're almost always arbitrary and therefore proven wrong all too soon... so straight off the bat for this problem, I'd be inclined to use a linked list instead of the "words" array... wait till you get to "dynamic data structures"... You'll love em ;-)
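
A rough sketch of what such a linked list might look like, for the curious. The names `WordNode`, `word_list_append` and `word_list_free` are made up for this illustration (they are not from the code above), and this is only one of several reasonable layouts. Each node owns a copy of its word, allocated to exactly the size needed, so neither the word length nor the word count has a fixed limit:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type: one word per node, each sized to fit. */
struct WordNode {
    char *text;
    struct WordNode *next;
};

/* Appends a copy of the first `length` characters of `word` to the list.
   `head` and `tail` point at the caller's head and tail pointers, which
   must both start out as NULL for an empty list.
   Returns 1 on success, 0 if memory ran out. */
int word_list_append(struct WordNode **head, struct WordNode **tail,
                     const char *word, size_t length)
{
    struct WordNode *node = malloc(sizeof *node);
    if (node == NULL)
        return 0;

    node->text = malloc(length + 1);      /* +1 for the null terminator */
    if (node->text == NULL) {
        free(node);
        return 0;
    }
    memcpy(node->text, word, length);
    node->text[length] = '\0';
    node->next = NULL;

    if (*tail != NULL)
        (*tail)->next = node;             /* link after the current last node */
    else
        *head = node;                     /* first node in an empty list */
    *tail = node;
    return 1;
}

/* Frees every node and the text it owns. */
void word_list_free(struct WordNode *head)
{
    while (head != NULL) {
        struct WordNode *next = head->next;
        free(head->text);
        free(head);
        head = next;
    }
}
```

In the splitting loop, a call such as `word_list_append(&head, &tail, word, wordLength)` would take the place of the copy into the "words" array, and printing becomes a walk from `head` following `next` until it is NULL.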

Cheers. Keith.

PS: You're going pretty well... My advice is "just keep on truckin"... this gets a LOT easier with practice.

Upvotes: 2

ouah

Reputation: 145829

    for (counter = 0; counter == MAX; counter++) {
        wordArray[counter] = '\0';
    } 

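You never enter this loop: counter starts at 0, so the condition counter == MAX is false on the very first test and the body never runs. You probably meant counter < MAX.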
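
To make that concrete, here is a minimal sketch that counts how many elements each form of the loop actually clears. The helper names `clear_with_equals` and `clear_with_less_than` are invented for this illustration and do not appear in the question. A for loop tests its condition before the first pass, which is why the `==` version does nothing:

```c
#include <stddef.h>

/* Hypothetical helper: the loop condition exactly as in the question.
   For any max greater than 0 the test fails immediately (0 == max is
   false), so nothing is cleared. Returns the number of elements cleared. */
size_t clear_with_equals(int *array, size_t max)
{
    size_t cleared = 0;
    for (size_t counter = 0; counter == max; counter++) {
        array[counter] = '\0';
        cleared++;
    }
    return cleared;
}

/* Hypothetical helper: the presumably intended condition. The loop runs
   while counter is still a valid index, so all max elements are cleared. */
size_t clear_with_less_than(int *array, size_t max)
{
    size_t cleared = 0;
    for (size_t counter = 0; counter < max; counter++) {
        array[counter] = '\0';
        cleared++;
    }
    return cleared;
}
```

Since the question's code already writes a '\0' after every word it copies, the clearing loop could probably be dropped altogether rather than fixed.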

Upvotes: 3

unwind

Reputation: 399703

You are using int arrays to represent strings, probably because getchar() returns an int. However, strings are better represented as char arrays, since that's what they are in C. The fact that getchar() returns an int is certainly confusing; it does so because it needs to be able to return the special value EOF, which doesn't fit in a char. Therefore it uses int, which is a "larger" type (able to represent more different values), so it can fit all the char values and EOF.
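
If you do want to keep reading one character at a time, a minimal sketch of the usual pattern is below. The helper `read_line` is a made-up name for this illustration, not a standard function. The return value of fgetc() stays in an int for the EOF comparison and is only stored into the char array once it is known to be a real character:

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from `in` into `buf` (at most
   size-1 characters), drops the newline, and null-terminates the result.
   Returns the number of characters stored, or -1 if end-of-file was hit
   before anything was read (or if size is 0). */
int read_line(FILE *in, char *buf, size_t size)
{
    size_t len = 0;
    int c;                        /* int, not char: must be able to hold EOF */

    if (size == 0)
        return -1;

    while ((c = fgetc(in)) != EOF && c != '\n') {
        if (len < size - 1)
            buf[len++] = (char)c; /* c is a real character at this point */
        /* characters that do not fit are read and discarded */
    }
    buf[len] = '\0';

    if (c == EOF && len == 0)
        return -1;
    return (int)len;
}
```

With `char stringArray[MAX];`, a call would look like `read_line(stdin, stringArray, sizeof stringArray)`.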

With char arrays, you can use C's string functions directly:

char stringArray[MAX];

if(fgets(stringArray, sizeof stringArray, stdin) != NULL)
   printf("You entered %s", stringArray);

Note that fgets() will leave the end-of-line character(s) in the string, so you might want to strip them out. I suggest implementing an in-place function that trims off leading and trailing whitespace; it's a good exercise as well.
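
In case it helps to compare against your own attempt, here is one possible sketch of such a trimming function. The name `trim_in_place` is just an illustrative choice, and it relies on isspace() from <ctype.h> to decide what counts as whitespace, which includes the newline left in the buffer by fgets():

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: removes leading and trailing whitespace from s,
   modifying the buffer in place. Returns s for convenience. */
char *trim_in_place(char *s)
{
    size_t start = 0;
    size_t end = strlen(s);

    /* Skip leading whitespace. The cast to unsigned char avoids undefined
       behaviour in isspace() when char is signed and the value is negative. */
    while (start < end && isspace((unsigned char)s[start]))
        start++;

    /* Back up over trailing whitespace. */
    while (end > start && isspace((unsigned char)s[end - 1]))
        end--;

    /* Shift the kept part to the front (the regions may overlap, hence
       memmove rather than memcpy) and terminate it. */
    memmove(s, s + start, end - start);
    s[end - start] = '\0';
    return s;
}
```

After the fgets() call above, `trim_in_place(stringArray)` would leave just the typed text, without the trailing newline.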

Upvotes: 4
