PEREZje
PEREZje

Reputation: 2492

Parsing mathematical expression string

So this is my issue, I have to create a program that implements the shunting yard algorithm in C. For this I first need to parse the given mathematical expression into it's component parts. So I figured a simple for loop over a given string would do the trick.

char *math_exp = "2 + 4";
for (int i = 0; (unsigned) i < strlen(math_expr); i++) {
    printf("%c", math_expr[i]);
}

> 2 + 4

But this quickly runs into an issue when a string contains a number with more digits than 1.

for (int i = 0; (unsigned) i < strlen(math_expr); i++) {
    // skip if char is a space
    if ((int) math_expr[i] == 32) {
        continue;
    }
    printf("%c ", math_expr[i]);
}
> 2 2 + 4

The 22 is now treated as two separate numbers. So I tried to loop over all tokens in the string by splitting the string using the delimiter SPACE.

char math_expr[] = "22 + 4";
char *token = strtok(math_expr, " ");

while (token != NULL) {
    printf("%s ", token);
    token = strtok(NULL, " ");
}
> 22 + 4

This seemed to work but quickly ran into the issue of what to do when there's a bracket in the expression. e.g.:

char math_expr[] = "(22 + 2)";
char *token = strtok(math_expr, " ");

while (token != NULL) {
    printf("%s\n", token);
    token = strtok(NULL, " ");
}
> (22
> +
> 2)

And now I'm stuck, is there a way to circumvent this issue? I need to be able to extract every operator and number (with all possible digits) furthermore I also need to be able to distinguish brackets from numbers they are attached to. I hoped there was some easy way to do this. Any help is appreaciated.

Upvotes: 1

Views: 3018

Answers (1)

Cătălin Fr&#226;ncu
Cătălin Fr&#226;ncu

Reputation: 1199

Tokenization is the first step towards syntactic analysis. As such, it is probably a good idea to encapsulate it into a layer of abstraction, a function that yields one token at a time. This function also discards white space and combines consecutive digits into numbers. It is hard to delegate these steps to strtok(). Finally, the function may rewrap single characters into more structured information.

#include <ctype.h>
#include <stdio.h>

#define T_NUMBER 0
#define T_OPERATOR 1
#define T_BRACKET 2

typedef struct {
  int type;
  int value;
} token;

/* Reads the next token from stdin. Returns 1 on success, 0 on EOL. */
int next_token(token *t) {
  char c;

  /* discard spaces silently */
  do {
    c = getchar();
  } while (c == ' ');

  if (isdigit(c)) {
    t->type = T_NUMBER;
    t->value = 0;
    do {
      t->value = t->value * 10 + (c - '0');
      c = getchar();
    } while (isdigit(c));
    ungetc(c, stdin); /* save the non-digit for next time */

  } else if (c == '+' || c == '-' || c == '*' || c == '/') {
    t->type = T_OPERATOR;
    t->value = c;

  } else if (c == '(' || c == ')') {
    t->type = T_BRACKET;
    t->value = c;

  } else if (c == '\n') {
    ungetc(c, stdin); /* make sure we always return 0 from now on */
    return 0;

  } else {
    /* handle illegal character */

  }

  return 1;
}

int main() {

  token t;

  while (next_token(&t)) {
    switch (t.type) {
      case T_NUMBER:   printf("number   %d\n", t.value); break;
      case T_OPERATOR: printf("operator %c\n", t.value); break;
      case T_BRACKET:  printf("bracket  %c\n", t.value); break;
    }
  }

  return 0;
}

Sample run:

(22+   37 * ( 1534-9)) + 18
bracket  (
number   22
operator +
number   37
operator *
bracket  (
number   1534
operator -
number   9
bracket  )
bracket  )
operator +
number   18

Upvotes: 3

Related Questions