Reputation: 81
Hi, I am very new to C and I am confused about how to parse a line of text into integers. What I have so far parses only the first number on the line, so if my input is 10 20 30, only 10 is converted to an integer. I am looking for an idea for a solution that reads the whole line with getline() and parses all of it into integer values.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
    char *line = NULL;
    size_t len = 0;
    int val = 0;
    int sum = 0;
    while (getline(&line, &len, stdin) != EOF) {
        printf("Line input : %s\n", line);
        //printf("Test %d", val);
        //parse char into integer
        val = atoi(line);
        printf("Parsed integer: %d\n", val);
    }
    free(line);
    return 0;
}
Upvotes: 2
Views: 1352
Reputation: 2165
There is a trade-off between correctness and comprehensiveness, so I wrote two versions of the program.
A sequence of integers in a C string can be parsed by calling the standard C function strtol from <stdlib.h> in a loop:
long int strtol (const char* str, char** endptr, int base);
It parses the C string str, interpreting its content as an integer in the specified base. strtol skips leading white space, converts the integer, and sets *endptr to point to the first character after the integer.
Since the author has a variable sum in his code, let's demonstrate parsing a sequence of integers by summing it. I took the function sum_ints_from_string() from the GNU C Library manual, section 20.11.1 "Parsing of Integers". The code in the manual assumes the positive scenario (well-formed input), so I changed it for the first version.
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int
sum_ints_from_string (char *string)
{
    int sum = 0;
    while (1) {
        char *tail;
        int next;
        /* Skip whitespace by hand, to detect the end. */
        while (string && isspace ((unsigned char) *string)) string++;
        if (!string || *string == 0)
            break;
        /* There is more nonwhitespace, */
        /* so it ought to be another number. */
        errno = 0;
        /* Parse it. */
        next = strtol (string, &tail, 0);
        /* Add it in, if possible. */
        if (string == tail)
        {
            /* Not a number: skip to the next whitespace or the end of the string. */
            while (*tail && !isspace ((unsigned char) *tail)) tail++;
            printf("error: %s\n", strerror(errno));
            printf ("does not have the expected form: %s\n", string);
        }
        else if (errno == 0)
        {
            printf("%d\n", next);
            sum += next;
        }
        else
        {
            printf("error: %s\n", strerror(errno));
            printf ("error: %s\n", string);
        }
        /* Advance past it. */
        string = tail;
    }
    return sum;
}
int main ()
{
    int sum = 0;
    size_t len = 0;
    char *line = NULL; /* getline() needs NULL (or a malloc'ed buffer) on the first call */
    FILE *f = fopen("file.txt", "w+");
    assert(f != NULL && "Error opening file");
    const char *text = "010 0x10 -10 1111111111111111111111111111 0 30 A 10 +5 + 10 30\n"
                       "20 20B 6 ABC - 20 10 0";
    /* Call fputs() outside assert() so it still runs when NDEBUG is defined. */
    int rc = fputs(text, f);
    assert(rc >= 0 && "error writing to file");
    rewind(f);
    errno = 0;
    while (getline(&line, &len, f) != -1)
    {
        sum += sum_ints_from_string(line);
        printf("%d\n", sum);
        free(line);
        line = NULL;
        len = 0;
    }
    free(line); /* getline() may have allocated a buffer even though it failed */
    fclose(f);
    assert(sum == 175);
    return 0;
}
Second version - positive scenario:
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int
sum_ints_from_string (char *string)
{
    int sum = 0;
    while (1) {
        char *tail;
        int next;
        /* Skip whitespace by hand, to detect the end. */
        while (isspace ((unsigned char) *string)) string++;
        if (*string == 0)
            break;
        /* There is more nonwhitespace, */
        /* so it ought to be another number. */
        errno = 0;
        /* Parse it. */
        next = strtol (string, &tail, 0);
        /* Not a number: stop, otherwise the loop would never advance. */
        if (tail == string)
            break;
        /* Add it in, if not overflow. */
        if (errno) // returned value is not tested in GNU original
            printf ("Overflow\n");
        else
            sum += next;
        /* Advance past it. */
        string = tail;
    }
    return sum;
}
int main ()
{
    int sum = 0;
    size_t len = 0;
    char *line = NULL; /* getline() needs NULL (or a malloc'ed buffer) on the first call */
    while (getline(&line, &len, stdin) != -1)
    {
        sum += sum_ints_from_string(line);
        /*
           If line is set to NULL and len is set 0 before the call, then
           getline() will allocate a buffer for storing the line. This buffer
           should be freed by the user program even if getline() failed.
        */
        free(line);
        line = NULL;
        len = 0;
    }
    free(line); /* getline() may have allocated a buffer even though it failed */
    printf("%d\n", sum);
    return 0;
}
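Error checking in the version from the GNU manual is almost entirely skipped. According to the strtol page on CppReference.com, the return value is the converted integer on success; if the value is out of range, a range error occurs (errno is set to ERANGE) and LONG_MAX or LONG_MIN is returned; if no conversion can be performed, 0 is returned.
So for our purpose of summation we only care whether we can add the next value or not; we don't need granular, complex error checking here. We print an error and add nothing when the value is out of range or when no conversion was performed. A return value of 0 alone does not tell these cases apart (it means either that the integer is 0 or that no conversion happened), so the code compares tail with string to detect a failed conversion. Otherwise we add next.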
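The three outcomes described above (a real value, no conversion, out of range) can be told apart with a small wrapper. This is only a sketch: the name parse_int and the enum parse_status are hypothetical and not part of any library, and it narrows the result to int because the summation code above uses int.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical status codes for this sketch. */
enum parse_status { PARSE_OK, PARSE_NO_DIGITS, PARSE_OUT_OF_RANGE };

/*
 * Convert the integer at the start of s (base auto-detected, as with base 0).
 * On PARSE_OK the value is stored in *out. In every case *rest is set to the
 * position where strtol() stopped, so the caller decides how to continue.
 */
enum parse_status parse_int(const char *s, char **rest, int *out)
{
    errno = 0;
    long v = strtol(s, rest, 0);

    if (*rest == s)
        return PARSE_NO_DIGITS;     /* nothing converted; the 0 returned is meaningless */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return PARSE_OUT_OF_RANGE;  /* does not fit in long, or does not fit in int */
    *out = (int)v;
    return PARSE_OK;                /* includes a genuine 0 */
}
```

A caller would add *out to the sum only on PARSE_OK, and on PARSE_NO_DIGITS would have to skip the offending token itself, since *rest still points at it.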
Upvotes: 0
Reputation: 144780
As you noticed, atoi() can only be used to parse the first value on the line read by getline(), and it has other shortcomings too: if the string does not convert to an integer, the return value will be 0, which is indistinguishable from the case where the string starts with a valid representation of 0.
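A quick way to see this ambiguity is to print what atoi() returns for a few strings. The helper name show_atoi is made up for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: print and return what atoi() makes of a string. */
int show_atoi(const char *s)
{
    int v = atoi(s);
    printf("atoi(\"%s\") = %d\n", s, v);
    return v;
}
```

Calling show_atoi("0") and show_atoi("abc") yields 0 in both cases, and show_atoi("10 20 30") yields 10 with no indication of where parsing stopped.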
There are more elaborate functions in <stdlib.h> to convert integers from their representation in different bases (from 2 to 36), detect conversion errors and provide a pointer to the rest of the string: strtol, strtoul, strtoll, strtoull, etc.
As noted in comments, getline() is specified as returning the number of bytes read from the file, or -1 on error or at end of file. Do not compare its return value to EOF.
Here is a modified version of your code using the function strtol():
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
    char *line = NULL;
    size_t len = 0;
    while (getline(&line, &len, stdin) >= 0) {
        char *p, *end;
        long sum = 0;
        printf("Line input: %s\n", line);
        printf("Parsed integers:");
        for (p = line; *p != '\0'; p = end) {
            long val = strtol(p, &end, 10);
            if (end == p)
                break;
            printf(" %ld", val);
            sum += val;
        }
        printf("\nSum: %ld\n", sum);
        /* check if loop stopped on conversion error or end of string */
        p += strspn(p, " \t\r\n");  /* skip white space */
        if (*p) {
            printf("Invalid input: %s", p);
        }
    }
    free(line);
    return 0;
}
Notes:
getline is not part of the C Standard; it is a POSIX extension, so it might not be available on all systems or might have different semantics.
strtol() performs range checking: if the converted value exceeds the range of type long, the value returned is either LONG_MIN or LONG_MAX depending on the direction of the overflow, and errno is set to ERANGE.
sum += val; can also cause an arithmetic overflow.
Upvotes: 0
Reputation: 754110
As I noted in comments, it is probably best to use strtol() (or one of the other members of the strtoX() family of functions) to convert the string to integers. Here is code that pays attention to the Correct usage of strtol().
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    char *line = NULL;
    size_t len = 0;
    while (getline(&line, &len, stdin) != -1)
    {
        printf("Line input : [%s]\n", line);
        int val = atoi(line);
        printf("Parsed integer: %d\n", val);
        char *start = line;
        char *eon;
        long value;
        errno = 0;
        while ((value = strtol(start, &eon, 0)),
               eon != start &&
               !((errno == EINVAL && value == 0) ||
                 (errno == ERANGE && (value == LONG_MIN || value == LONG_MAX))))
        {
            printf("%ld\n", value);
            start = eon;
            errno = 0;
        }
        putchar('\n');
    }
    free(line);
    return 0;
}
The code in the question to read lines using POSIX getline() is almost correct; it is legitimate to pass a pointer to a null pointer to the function, and to pass a pointer to 0. However, technically, getline() returns -1 rather than EOF, though there are very few (if any) systems where there is a difference. Nevertheless, standard C allows EOF to be any negative value; it is not required to be -1.
For the extreme nitpickers, although the Linux and macOS man pages for strtol() state "returns 0 and sets errno to EINVAL" when it fails to convert the string, the C standard doesn't require errno to be set in that case. However, when the conversion fails, eon will be set to start; that is guaranteed by the standard. So, there is room to argue that the part of the test for EINVAL is superfluous.
The while loop uses a comma operator to call strtol() for its side-effects (assigning to value and eon), and ignores the result; ignoring it is necessary because all possible return values are valid. The other three lines of the condition (the RHS of the comma operator) evaluate whether the conversion was successful. This avoids writing the call to strtol() twice. It's possibly an extreme case of DRY (don't repeat yourself) programming.
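For readers who find the comma operator hard to follow, the same loop can be sketched with for (;;) and break, which also calls strtol() only once per iteration. This is an alternative formulation, not the code above: the function name print_longs is hypothetical, the EINVAL test is dropped in favour of the eon == start check, and the ERANGE test is simplified to errno alone.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Print each integer found at the start of line, one per output line, and
 * return how many were converted. Stops at the first token that is not a
 * number or that does not fit in a long.
 */
int print_longs(const char *line)
{
    const char *start = line;
    int count = 0;

    for (;;) {
        char *eon;              /* end of number */
        errno = 0;
        long value = strtol(start, &eon, 0);

        if (eon == start)       /* no digits: end of input or junk */
            break;
        if (errno == ERANGE)    /* value does not fit in a long */
            break;
        printf("%ld\n", value);
        count++;
        start = eon;
    }
    return count;
}
```

Passing each buffer filled by getline() to print_longs() should print the same numbers as the inner while loop above for the sample inputs shown below.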
Small sample of running the code (program name rn89):
$ rn89
1 2 4 5 5 6
Line input : [ 1 2 4 5 5 6
]
Parsed integer: 1
1
2
4
5
5
6
232443 432435423 12312 1232413r2
Line input : [232443 432435423 12312 1232413r2
]
Parsed integer: 232443
232443
432435423
12312
1232413
324d
Line input : [324d
]
Parsed integer: 324
324
$
Upvotes: 2