Windings-Lab
Windings-Lab

Reputation: 101

Have some bugs when implementing my own Atoi()

I can't understand. While my function returning, from char in main, random number. Original atoi() returning -1. I'm currently using C11 version. I heard from someone, that's because of int overflow and i need return int from my function, but i'm currently returning long. How can i detect intOverflow if that's not a 2147483647

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

bool mx_isdigit(int c) {
    return c >= 48 && c <= 57;
}


bool mx_isspace(char c) {
    return (c >= 9 && c <= 13) || c == 32;
}


int mx_atoi(const char *str) {
    long num = 0;
    int sign = 1;

    for (; mx_isspace(*str); str++);

    if (*str == '-' || *str == '+') {
        sign = *str == '-' ? -sign : sign;
        str++;
    }

    for (; *str; str++) {
        if (!mx_isdigit(*str)) {
            break;
        }
        num = (num * 10) + (*str - '0');
    }
    return sign == -1 ? -num : 0 + num;
}

int main(void) {

    char str[100] = "12327123061232712306";
    printf("R: %d\n", atoi(str));
    printf("M: %d", mx_atoi(str));
}

Upvotes: 0

Views: 332

Answers (4)

KamilCuk
KamilCuk

Reputation: 140970

c >= 48 && c <= 57

Do not use magic numbers in the code. Instead of 48 use '0' which is way more readable and provides what intention your do.

How can i detect intOverflow

Overflow happens when the result is greater then the maximum a type can represent. So having numbers a and b we can write:

a + b > MAX

But such condition could not be checked, because a + b... will overflow. But if we flip the expression:

b > MAX - a

Can be easily checked with a simple if. MAX is the maximum value for a type, for int that is INT_MAX from limits.h.

int mx_atoi(const char *str) {    
    for (; mx_isspace(*str); str++);

    bool negative = false;
    if (*str == '-' || *str == '+') {
        negative = *str == '-';
        str++;
    }

    int num = 0;
    for (; mx_isdigit(*str); str++) {
        if (INT_MAX / 10 < num) {
            goto ERR_OVERFLOW;
        }
        num *= 10;
        const unsigned char c = *str - '0';
        if (INT_MAX - c < num) {
            goto ERR_OVERFLOW;
        }
        num += c;

    }
    return negative ? -num : num;
    ERR_OVERFLOW:
    return negative ? INT_MIN : INT_MAX;
}

Upvotes: 1

Windings-Lab
Windings-Lab

Reputation: 101

This is the easiest way, that i guessed. atoi() original using LLONG_MAX check instead of LONG_MAX or INT_MAX. So, experimenting with those limits i discovered. That if (num * 10) + (*str - '0') will reach over the limit of long long type, it will transform number to negative value of LLONG_MIN. So, i have created if statement, that check if next calculation will be less than previous. And if it's true, returning 0 or -1.

#include <stdio.h>
#include <stdbool.h>
#include <string.h>
#include <stdlib.h>

bool mx_isdigit(int c);
bool mx_isspace(char c);

int mx_atoi(const char* str) {
    long long num = 0;
    int sign = 1;

    for (; mx_isspace(*str); str++);

    if (*str == '-' || *str == '+') {
        sign = *str == '-' ? -sign : sign;
        str++;
    }

    for (; *str; str++) {
        if (!mx_isdigit(*str)) {
            break;
        }
      if ((num * 10) + (*str - '0') < num) {
          return sign == -1 ? 0 : -1;
      }
        num = (num * 10) + (*str - '0');
    }

    return sign == -1 ? -num : num;
}

int main(void) {

    char str[100] = "-9223372036854775809";
    printf("R: %d\n", atoi(str));
    printf("M: %d\n", mx_atoi(str));
}

Upvotes: 0

chux
chux

Reputation: 153368

int overflow potential

num = (num * 10) + (*str - '0'); encounters int overflow, which is undefined behavior (UB) when:

1) input string should represent INT_MIN and int/long have the same range OR
2) input string encodes a value outside the int range.

Various ways to avoid that.

Does not detect a string of no digits

Returning 0 in that case is reasonable, yet code may want to set some error condition.

Does not complain about trailing non-digits

Simply ignoring trailing characters is reasonable, yet code may want to set some error condition.


A way to avoid int overflow (and not rely on long wider than int) is to test before (num * 10) + (*str - '0') and since there is more negative ints than positive ones, accumulate on the negative side.

bool digit_found = false;
int val = 0;
for (; mx_isdigit(*str); str++) {
    digit_found = true;
    int digit = *str - '\0';
    if (val <= INT_MIN/10 && (val < INT_MIN/10 || digit > -(INT_MIN%10))) { // C99
      return sign == 1 ? INT_MAX : INT_MIN;
    }
    val = val * 10 - digit;  // note subtraction here
}

if (!digit_found) {
    return 0; // Or handle in some other fashion
}

if (sign == 1) {
  // If val is too negative to negate ...
  if (val < -INT_MAX) {
    return INT_MAX;  // overflow
  }
  return -val;
}
return val;

Upvotes: 0

Stephan Lechner
Stephan Lechner

Reputation: 35154

Inside your function int mx_atoi(const char *str) {..., you are calculating a result of type long, yet the function returns an int; so if the result stored in num of type long does not fit in an int, something will get lost (actually , since signed integral values are converted, the behaviour is "implementation-defined", i.e. compiler-dependant). The result could be truncated bitwise, yielding a number that "looks" rather different that the decimal number you entered. Cf., for example, this online C11 draft. The bold paragraph applies:

6.3.1.3 Signed and unsigned integers

1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.60)

3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

Make int mx_atoi(const char *str) to long mx_atoi(const char *str), use a long-variable to store the result, and don't forget to use format specifier %ld instead of %d in your printf then.

Otherwise, if you need to stick to int and you want to safely react on overflows, you could do something like

if (num > INT_MAX) {
  return -1;
}

inside your loop. INT_MAX is defined in limits.h

Upvotes: 1

Related Questions