Reputation: 95
I'm tinkering with flex and bison to create a small calculator program. The input will look something like this:
read A
read B
sum := A + B
write sum
"read" and "write" are keywords for reading a value in and writing a value to the output. ":=" is the assignment operator. A and B are identifiers, which can be strings. There will also be line comments (//comment) and block comments (/* asdfsd */).
Would these regular expressions be correct for the little grammar I described?
[:][=] //assignment operator
[ \t] //skipping whitespace
[a-zA-Z0-9]+ //identifiers
[Rr][Ee][Aa][Dd] //read symbols, not case-sensitive
[/][/] //comment
For the assignment operator and the comment regex, can I just do this instead? Would flex and bison accept it?
":=" //assignment operator
"//" //comment
Upvotes: 2
Views: 1164
Reputation: 370162
Yes, ":=" and "//" will work, though the comment rule should really be "//".*
because you want to skip everything after the // (up to the end of the line). If you match only "//", flex will try to tokenize whatever follows it, which you don't want: a comment doesn't have to consist of valid tokens, and even if it did, those tokens should not be seen by the parser.
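As a rough illustration of the comment handling described above, here is a minimal flex sketch. The block-comment pattern, the catch-all rule, and the main function are my own additions for demonstration and are not part of the question.

```lex
%option noyywrap
%{
#include <stdio.h>
%}
%%
"//".*                        { /* line comment: '.' never matches a newline, so this stops at end of line */ }
"/*"([^*]|\*+[^*/])*\*+"/"    { /* block comment, may span several lines: discard */ }
[ \t\n]+                      { /* whitespace: discard */ }
.                             { printf("char: %s\n", yytext); /* illustrative catch-all */ }
%%
int main(void) { yylex(); return 0; }
```

Because both comment rules have empty actions, nothing inside a comment ever reaches the parser. Assuming the file is saved as comments.l (a hypothetical name), it can be built with flex comments.l followed by cc lex.yy.c.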
Furthermore, [Rr][Ee][Aa][Dd]
should be placed before the identifier rule. Otherwise it will never be matched, because when two rules match a lexeme of the same length, flex picks the one that comes first in the file. It can also be written more succinctly as (?i:read)
or you can enable case insensitivity globally with %option caseless
and just write read.
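To make the rule ordering concrete, here is a small sketch of a scanner with the keyword rules placed before the identifier rule. The printed token names, the catch-all rule, and the main function are assumptions for illustration; in a real flex/bison setup the actions would return token codes defined by the bison grammar instead of printing.

```lex
%option noyywrap caseless
%{
#include <stdio.h>
%}
%%
read            { printf("READ\n");   /* keywords come first so they win ties */ }
write           { printf("WRITE\n"); }
":="            { printf("ASSIGN\n"); }
"+"             { printf("PLUS\n"); }
[a-zA-Z0-9]+    { printf("ID(%s)\n", yytext); /* identifier rule from the question */ }
[ \t\n]+        { /* skip whitespace */ }
.               { printf("UNKNOWN(%s)\n", yytext); /* illustrative catch-all */ }
%%
int main(void) { yylex(); return 0; }
```

Note that flex prefers the longest match before it considers rule order, so an input such as reader is still scanned as a single identifier: the identifier rule matches six characters while the keyword rule matches only four.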
Upvotes: 2
Reputation: 29431
You can start with these (with the ignore-case option enabled):
(read|write)\s+[a-z]+
matches a read/write statement;
[a-z]+\s:=[a-z+\/* -]*
matches an assignment with simple arithmetic;
\/\/.*
matches a single-line comment;
\/\*[\s\S]*\*\/
matches a multi-line comment.
Keep in mind that these are basic regexes and may not suit more complex syntax.
You can try them out on Regex101.com, for example.
Upvotes: 0