Karan Desai

Reputation: 1

How to find number of times that a given word occurs in a sentence [C code]?

Here is my code. I need to find the number of times a given word (a short string) occurs in a sentence (a long string).

Sample Input:
the
the cat sat on the mat

Sample Output:
2

For some reason the string compare function is not working and my output comes out as zero. Kindly ignore the comments in the code; they were put there to debug it.

#include<stdio.h>
#include<stdlib.h>
#include<string.h>

int main(){
    char word[50];
    gets(word);
    int len = strlen(word);
    //printf("%d",len);
    char nword[len];
    char s[100];
    strcpy(nword,word);
    puts(nword);
    printf("\n");
    gets(s);
    //printf("%d",strlen(s));
    char a[50][50];
    int i,j,k;
    j = 0;
    for(i=0;i<strlen(s);i++)
    {
        a[i][j] = s[i];
        printf("%c",a[i][j]);
        if(s[i] == ' ')
        {
            j++;
            printf("\n");
        }
    }
    printf("%d",j);
    k = j;
    //printf("\nk assigned\n");
    j = 0;
    //printf("j equal to zero\n");
    int count = 0;
    int temp = 0;
    //printf("count initialized.\n");
    for(i=0;i<k;i++)
    {
        if(strcmp(a[i],nword) == 0)
            count++;
    }
    printf("\n%d",count);
    return 0;
}

Upvotes: 0

Views: 1968

Answers (4)

gmug

Reputation: 795

I would implement this using the functions strtok() and strcmp():

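#include <stdio.h>
#include <string.h>

int main(void)
{
    char  word[] = "the"; /* the word you want to count */
    char  sample[] = "the cat sat on the mat"; /* the string in which you want to count */

    char  delimiters[] = " ,;.";
    int   counter = 0; /* must start at zero; an uninitialised counter holds garbage */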
    char* currentWordPtr;   

    /* tokenize the string */
    currentWordPtr = strtok(sample, delimiters);  

    while(currentWordPtr != NULL)
    {
        if(strcmp(word, currentWordPtr) == 0)
        {
            counter++;
        }  
        /* get the next token (word) */              
        currentWordPtr = strtok(NULL, delimiters);
    }

    printf("Number of occurrences of \"%s\" is %i\n", word, counter);

    return 0;
}
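
Note that strtok() writes '\0' bytes into the string it scans, so the code above only works on a modifiable array such as sample[], not on a string literal. If you want the same idea as a reusable function, a minimal sketch might look like this. The name count_word is made up for illustration, and the delimiter set is the one assumed above:

```c
#include <stdlib.h>
#include <string.h>

/* count_word is a hypothetical helper name, not a library function.
   Returns how many tokens of `sentence` are equal to `word`,
   or -1 if the working copy cannot be allocated. */
int count_word(const char *sentence, const char *word)
{
    const char *delimiters = " ,;.";
    size_t len = strlen(sentence);
    char *copy = malloc(len + 1);   /* strtok() modifies its input, so work on a copy */
    char *token;
    int count = 0;

    if (copy == NULL)
        return -1;
    memcpy(copy, sentence, len + 1); /* len + 1 also copies the terminating '\0' */

    for (token = strtok(copy, delimiters);
         token != NULL;
         token = strtok(NULL, delimiters))
    {
        if (strcmp(token, word) == 0)
            count++;
    }

    free(copy);
    return count;
}
```

With this sketch, count_word("the cat sat on the mat", "the") gives 2, and "other" or "then" are not counted because whole tokens are compared.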

Upvotes: 0

bitflip-at

Reputation: 149

I think you are using your 2-dimensional array the wrong way round. a[0][j] should be the j-th character of the first word of s. But what you are doing is a[i][0] = s[i], which puts every character of the first word into a different row.

Best regards

Upvotes: 0

Chris Turner

Reputation: 8142

Your main problem is with this loop, for several reasons:

int i,j,k;
j = 0;
for(i=0;i<strlen(s);i++)
{
    a[i][j] = s[i];
    printf("%c",a[i][j]);
    if(s[i] == ' ')
    {
        j++;
        printf("\n");
    }
}

Firstly, you've got your indexes into a backwards: a[i][j] means the i-th string and the j-th character, but since you're incrementing j for each word, you want it the other way around: a[j][i].

Secondly, you can't use i to index into both s and a. Think about what happens when you are building the second string. In your example input the second word starts when i is 4, so the first character will be stored as a[1][4]=s[4], which leaves a[1][0] to a[1][3] uninitialised. So you have to use a third variable to track where you are in the current word.

Thirdly, when you hit a space, you don't want to add it to your word, as it won't match later on. You also need to add a null terminator to the end of each string, or else strcmp() won't know where the string ends.

Putting the above together gives you something like this:

int i,j,k;
k = j = 0;
for(i=0;i<strlen(s);i++)
{
    if(s[i] == ' ')
    {
        a[j][k] = '\0';
        j++;
        k=0;
        printf("\n");
    }
    else
    {
        a[j][k] = s[i];
        printf("%c",a[j][k]);
        k++;
    }
}
a[j][k]='\0';
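
One thing to watch when you plug this back into your program: after the loop the last word sits at index j, so there are j + 1 words, and the counting loop has to cover all of them. Your original code also uses k as the word count, which clashes with k being the character index here. A minimal sketch of the two steps as separate functions might look like this. The names split_words, count_matches, MAX_WORDS and MAX_WORD_LEN are made up for illustration, and the bounds checks and the skipping of repeated spaces are my additions:

```c
#include <string.h>

#define MAX_WORDS    50   /* assumed limits, matching char a[50][50] */
#define MAX_WORD_LEN 50

/* Splits `s` on spaces into the rows of `a`.
   Returns the number of words stored (not the index of the last word). */
int split_words(const char *s, char a[MAX_WORDS][MAX_WORD_LEN])
{
    size_t i, n = strlen(s);
    int j = 0;   /* current word (row) */
    int k = 0;   /* position inside the current word (column) */

    for (i = 0; i < n; i++) {
        if (s[i] == ' ') {
            if (k > 0) {              /* close the word; ignore repeated spaces */
                a[j][k] = '\0';
                j++;
                k = 0;
                if (j == MAX_WORDS)   /* no room for more words */
                    return j;
            }
        } else if (k < MAX_WORD_LEN - 1) {
            a[j][k++] = s[i];         /* characters beyond the limit are dropped */
        }
    }
    if (k > 0) {                      /* terminate the final word */
        a[j][k] = '\0';
        j++;
    }
    return j;
}

/* Counts how many of the first `words` rows of `a` equal `word`. */
int count_matches(char a[MAX_WORDS][MAX_WORD_LEN], int words, const char *word)
{
    int i, count = 0;
    for (i = 0; i < words; i++) {
        if (strcmp(a[i], word) == 0)
            count++;
    }
    return count;
}
```

For "the cat sat on the mat" split_words returns 6, and count_matches with "the" then gives 2. Separately, char nword[len] in your code is one byte too small for strcpy(), which also copies the terminating '\0'; comparing against word directly avoids the copy altogether. gets() cannot limit the input length and was removed in C11, so fgets() is the usual replacement.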

Upvotes: 1

NikosD

Reputation: 125

The problem is that a is a two-dimensional array and you reference it as if it were one-dimensional. Maybe you are using a two-dimensional array to represent i = line, j = character. If you keep this idea, then you'll have to do this:

j=0;
for(i=0;i<k;i++)
{
    if(strcmp(a[i][j],nword) == 0)
        count++;
    j++;
}

But then it will be difficult to detect words that are split in half. I'd recommend keeping a as a one-dimensional array. Copy the contents of s[i] serially, and when you want to distinguish lines, use the \r\n sequence.
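
One way to take the one-dimensional idea further is to skip the second array entirely and compare against s in place. This is only a sketch under that assumption: count_in_place is a made-up name, and it treats a single space as the only separator:

```c
#include <string.h>

/* count_in_place is a hypothetical helper name.
   Counts space-delimited occurrences of `word` in `s` by scanning `s`
   directly, without copying the words anywhere. */
int count_in_place(const char *s, const char *word)
{
    size_t wlen = strlen(word);
    int count = 0;

    while (*s != '\0') {
        size_t len;

        while (*s == ' ')          /* skip separators before the next word */
            s++;
        len = strcspn(s, " ");     /* length of the word starting at s */

        /* same length and same characters means the whole word matches */
        if (len > 0 && len == wlen && strncmp(s, word, len) == 0)
            count++;

        s += len;                  /* move past this word */
    }
    return count;
}
```

Because the lengths must match before the characters are compared, "then" and "other" are not counted as "the".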

Upvotes: 0
