204

Reputation: 473

counting the number of strings in a text file containing numbers as well

I want to count only the strings in a text file that contains numbers as well. But the code below counts even the numbers in the file as strings. How do I rectify the problem?

int count;
char *temp;
FILE *fp;

 fp = fopen("multiplexyz.txt" ,"r" );

 while(fscanf(fp,"%s",temp) != EOF )
 {
     count++;
 }

 printf("%d ",count);
 return 0;

}

Upvotes: 1

Views: 975

Answers (3)

paxdiablo

Reputation: 881553

Well, first up, using the temp pointer without having backing storage for it is going to cause you a world of pain.

I'd suggest, as a start, using something like char temp[1000] instead, keeping in mind that's still a bit risky if you have words more than a thousand or so characters long (that's a different issue to the one you're asking about so I'll mention it but not spend too much time on fixing it).
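To make the buffer suggestion concrete, here is a minimal sketch of a bounded read. The function name count_tokens is made up for illustration, and it assumes the caller has already opened the stream successfully. The %999s width stops fscanf from writing past the 1000-byte buffer.

```c
#include <stdio.h>

/* Hypothetical helper: counts whitespace-separated tokens in an open stream.
 * The 1000-byte buffer follows the suggestion above; the %999s width leaves
 * room for the terminating '\0'. */
size_t count_tokens(FILE *fp)
{
    char temp[1000];
    size_t count = 0;   /* initialised, unlike the counter in the question */

    /* fscanf returns 1 for each token it stores. A token longer than 999
     * characters is split and counted as several tokens. */
    while (fscanf(fp, "%999s", temp) == 1)
        count++;

    return count;
}
```

Comparing the return value against 1 rather than EOF also ends the loop if a conversion ever fails for a reason other than end of file.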

Secondly, it appears you want to count words with numbers (like alpha7 or pi/2). If that's the case, you simply need to check temp after reading the "word" and increment count only if it matches a "non-numeric" pattern.

That could be as simple as just not incrementing if the word consists only of digits, or it could be complicated if you want to handle decimals, exponential formats and so on.

But the bottom line remains the same:

while(fscanf(fp,"%s",temp) != EOF )
{
    if (! isANumber(temp))
        count++;
}

with a suitable definition of isANumber. For example, for unsigned integers only, something like this would be a good start:

int isANumber (char *str) {
    // Empty string is not a number.

    if (*str == '\0')
        return 0;

    // Check every character.

    while (*str != '\0') {
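        // If non-digit, it's not a number (the cast avoids undefined
        // behaviour in isdigit, from <ctype.h>, for negative char values).

        if (! isdigit ((unsigned char)*str))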
            return 0;
        str++;
    }

    // If all characters were digits, it was a number.

    return 1;
}
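Putting the loop and the digit check together might look like the sketch below. The names is_all_digits and count_non_numeric are hypothetical, and the stream is assumed to be open already. Note that with this simple check a token such as 3.14 still counts as a word, because it contains a non-digit character.

```c
#include <ctype.h>
#include <stdio.h>

/* Same idea as isANumber above, under a different (hypothetical) name:
 * returns 1 only for a non-empty string made up entirely of digits. */
int is_all_digits(const char *str)
{
    if (*str == '\0')
        return 0;
    for (; *str != '\0'; str++)
        if (!isdigit((unsigned char)*str))
            return 0;
    return 1;
}

/* Hypothetical helper: counts the tokens in fp that are not pure digits. */
size_t count_non_numeric(FILE *fp)
{
    char temp[1000];
    size_t count = 0;

    /* Bounded read so a long token cannot overflow temp. */
    while (fscanf(fp, "%999s", temp) == 1)
        if (!is_all_digits(temp))
            count++;

    return count;
}
```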

For more complex checking, you can use the strto* calls in C, giving them the temp buffer and using the endptr argument to ensure the entire string is scanned. Off the top of my head, so not well tested, that would go something like:

int isANumber (char *str) {
    // Empty string is not a number.

    if (*str == '\0')
        return 0;

    // Use strtold to get a long double.

    char *endPtr;
    long double d = strtold (str, &endPtr);

    // Characters unconsumed, not number (things like 42b).

    if (*endPtr != '\0')
        return 0;

    // Was a long double, so number.

    return 1;
}

The only thing you need to watch out for there is that certain strings like NaN or +Inf are considered a number by strtold so you may need extra checks for that.
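One possible form of that extra check, as a sketch only: test the converted value with isfinite from <math.h>. The name is_finite_number is made up for illustration. On implementations where an out-of-range literal converts to infinity, such a literal would be rejected too, which may or may not be what you want.

```c
#include <math.h>
#include <stdlib.h>

/* Hypothetical variant of isANumber: accepts only strings that strtold
 * consumes completely and that convert to a finite value. */
int is_finite_number(const char *str)
{
    char *endPtr;
    long double d;

    /* Empty string is not a number. */
    if (*str == '\0')
        return 0;

    d = strtold(str, &endPtr);

    /* Nothing converted, or characters left over (things like 42b). */
    if (endPtr == str || *endPtr != '\0')
        return 0;

    /* Rejects NaN and +/-Inf, which strtold parses successfully. */
    return isfinite(d) ? 1 : 0;
}
```

Keep in mind that strtold also accepts hexadecimal forms such as 0x1A, so those would still be treated as numbers here.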

Upvotes: 1

David C. Rankin

Reputation: 84561

I find strpbrk to be one of the most helpful functions for searching for several needles in a haystack. Here the set of needles is the numeric characters "0123456789"; any line read from your file that contains one of them is counted. I also prefer POSIX getline for reading lines: it sizes its buffer to fit each line, and it still returns a last line that lacks a POSIX line end ('\n'), whereas wc -l, which counts newline characters, leaves that line out of its count. That said, a small function that searches each line for any of the characters in a string trm passed as a parameter could be written as:

/** open and read each line in 'fn' returning the number of lines
 *  containing any of the characters in 'trm'.
 */
size_t nlines (char *fn, char *trm)
{
    if (!fn) return 0;

    size_t lines = 0, n = 0;
    char *buf = NULL;
    FILE *fp = fopen (fn, "r");

    if (!fp) return 0;

    while (getline (&buf, &n, fp) != -1)
        if (strpbrk (buf, trm))
            lines++;

    fclose (fp);
    free (buf);

    return lines;
}

Simply pass the filename of interest and the terms to search for in each line. A short test code with a default term of "0123456789" that takes the filename as the first parameter and the term as the second could be written as follows:

#include <stdio.h>      /* printf */
#include <stdlib.h>     /* free   */
#include <string.h>     /* strpbrk */

size_t nlines (char *fn, char *trm);

int main (int argc, char **argv) {

    char *fn   = argc > 1 ? argv[1] : NULL;
    char *srch = argc > 2 ? argv[2] : "0123456789";
    if (!fn) return 1;

    printf ("%zu %s\n", nlines (fn, srch), fn);

    return 0;
}

/** open and read each line in 'fn' returning the number of lines
 *  containing any of the characters in 'trm'.
 */
size_t nlines (char *fn, char *trm)
{
    if (!fn) return 0;

    size_t lines = 0, n = 0;
    char *buf = NULL;
    FILE *fp = fopen (fn, "r");

    if (!fp) return 0;

    while (getline (&buf, &n, fp) != -1)
        if (strpbrk (buf, trm))
            lines++;

    fclose (fp);
    free (buf);

    return lines;
}

Give it a try and see if this is what you are expecting. If not, just let me know and I am glad to help further.

Example Input File

$ cat dat/linewno.txt
The quick brown fox
jumps over 3 lazy dogs
who sleep in the sun
with a temp of 101

Example Use/Output

$ ./bin/getline_nlines_nums dat/linewno.txt
2 dat/linewno.txt

$ wc -l dat/linewno.txt
4 dat/linewno.txt
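To see why only two of the four lines are counted, it can help to look at what strpbrk returns for a single line: a pointer to the first character found in the search set, or NULL when there is none. The helper below is a hypothetical illustration, not part of the program above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: offset of the first character of 'line' that also
 * appears in 'trm', or -1 if no character of 'trm' occurs in 'line'. */
ptrdiff_t first_match_offset(const char *line, const char *trm)
{
    const char *hit = strpbrk(line, trm);

    return hit ? hit - line : -1;
}
```

For "jumps over 3 lazy dogs" and the default term "0123456789" this gives 11, the position of the 3, while "The quick brown fox" gives -1 and so is not counted.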

Upvotes: 0

Dane

Reputation: 9552

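Inside your while loop, loop through the string to check whether any of its characters is a digit. Something like:

int hasDigit = 0;
for (char *p = temp; *p != '\0'; p++) {
    if (isdigit((unsigned char)*p)) {   /* isdigit is declared in <ctype.h> */
        hasDigit = 1;
        break;
    }
}

(Adapt this to your own code rather than copying it verbatim.)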

Upvotes: 0
