Reputation: 2492
So this is my issue, I have to create a program that implements the shunting yard algorithm in C. For this I first need to parse the given mathematical expression into it's component parts. So I figured a simple for loop over a given string would do the trick.
char *math_exp = "2 + 4";
for (int i = 0; (unsigned) i < strlen(math_expr); i++) {
printf("%c", math_expr[i]);
}
> 2 + 4
But this quickly runs into an issue when a string contains a number with more digits than 1.
for (int i = 0; (unsigned) i < strlen(math_expr); i++) {
// skip if char is a space
if ((int) math_expr[i] == 32) {
continue;
}
printf("%c ", math_expr[i]);
}
> 2 2 + 4
The 22 is now treated as two separate numbers. So I tried to loop over all tokens in the string by splitting the string using the delimiter SPACE.
char math_expr[] = "22 + 4";
char *token = strtok(math_expr, " ");
while (token != NULL) {
printf("%s ", token);
token = strtok(NULL, " ");
}
> 22 + 4
This seemed to work but quickly ran into the issue of what to do when there's a bracket in the expression. e.g.:
char math_expr[] = "(22 + 2)";
char *token = strtok(math_expr, " ");
while (token != NULL) {
printf("%s\n", token);
token = strtok(NULL, " ");
}
> (22
> +
> 2)
And now I'm stuck, is there a way to circumvent this issue? I need to be able to extract every operator and number (with all possible digits) furthermore I also need to be able to distinguish brackets from numbers they are attached to. I hoped there was some easy way to do this. Any help is appreaciated.
Upvotes: 1
Views: 3018
Reputation: 1199
Tokenization is the first step towards syntactic analysis. As such, it is probably a good idea to encapsulate it into a layer of abstraction, a function that yields one token at a time. This function also discards white space and combines consecutive digits into numbers. It is hard to delegate these steps to strtok()
. Finally, the function may rewrap single characters into more structured information.
#include <ctype.h>
#include <stdio.h>
#define T_NUMBER 0
#define T_OPERATOR 1
#define T_BRACKET 2
typedef struct {
int type;
int value;
} token;
/* Reads the next token from stdin. Returns 1 on success, 0 on EOL. */
int next_token(token *t) {
char c;
/* discard spaces silently */
do {
c = getchar();
} while (c == ' ');
if (isdigit(c)) {
t->type = T_NUMBER;
t->value = 0;
do {
t->value = t->value * 10 + (c - '0');
c = getchar();
} while (isdigit(c));
ungetc(c, stdin); /* save the non-digit for next time */
} else if (c == '+' || c == '-' || c == '*' || c == '/') {
t->type = T_OPERATOR;
t->value = c;
} else if (c == '(' || c == ')') {
t->type = T_BRACKET;
t->value = c;
} else if (c == '\n') {
ungetc(c, stdin); /* make sure we always return 0 from now on */
return 0;
} else {
/* handle illegal character */
}
return 1;
}
int main() {
token t;
while (next_token(&t)) {
switch (t.type) {
case T_NUMBER: printf("number %d\n", t.value); break;
case T_OPERATOR: printf("operator %c\n", t.value); break;
case T_BRACKET: printf("bracket %c\n", t.value); break;
}
}
return 0;
}
Sample run:
(22+ 37 * ( 1534-9)) + 18
bracket (
number 22
operator +
number 37
operator *
bracket (
number 1534
operator -
number 9
bracket )
bracket )
operator +
number 18
Upvotes: 3