Reputation: 1845
I am looking for a flex rule that handles escaped newlines (a backslash followed by a newline) inside a string literal and gives me a token that ignores that newline.
Eg:
I have a rule in my lex specification like:
\"(\.|[^\"])*\"
to capture all the string literals. This does capture strings from code like:
printf("This is literal")
but it doesn't give me the correct token if code is like:
printf("This is \
literal.")
What modification can I make to my lex spec to handle this situation?
Upvotes: 0
Views: 381
Reputation: 241721
(F)lex only recognises tokens. Interpreting their contents is up to you.
If you're just recognising a string literal, you can use a regular expression like
["]([^"\\\n]|\\(.|\n))*["]
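As a rough illustration of the recognise-only case, here is a minimal standalone scanner. Its pattern keeps a bare backslash out of the ordinary-character class and lets an escape cover any character, including a newline, so a backslash-newline inside the quotes is accepted. The `main` function and the printed format are assumptions made for the demo; a real scanner would return a token code to its parser instead of printing.

```lex
%option noyywrap
%{
#include <stdio.h>
%}

%%

["]([^"\\\n]|\\(.|\n))*["]  {
            /* yytext holds the raw literal, quotes and escapes included. */
            printf("string literal (%d bytes): %s\n", (int)yyleng, yytext);
        }

.|\n    { /* skip everything else in this demo */ }

%%

int main(void)
{
    yylex();
    return 0;
}
```

Note that `yytext` still contains the backslash and the newline exactly as they appeared in the input; the match only delimits the token.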
But if you want the correct interpretation of the string literal -- according to your language -- you'll need a start condition with appropriate actions.
The usual approach is to initialise a StringBuffer-like object when you see the opening " and switch to a string start condition. Non-special characters are simply appended to the StringBuffer; escape sequences like \n append the appropriate character; and \\\n (a backslash followed by a newline) appends nothing. When the closing quote is seen, the token is sent along with the accumulated text.
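The start-condition approach described above might look roughly like the sketch below. The buffer helpers (`buf_reserve`, `buf_reset`, `buf_add`), the condition name `STRING`, the handful of escapes handled, and the `printf` standing in for returning a token are all assumptions for illustration, not part of flex itself.

```lex
%option noyywrap
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer standing in for a StringBuffer-like object. */
static char  *buf;
static size_t len, cap;

static void buf_reserve(size_t need)
{
    if (need <= cap) return;
    while (cap < need) cap = cap ? cap * 2 : 64;
    buf = realloc(buf, cap);
    if (!buf) { perror("realloc"); exit(EXIT_FAILURE); }
}

static void buf_reset(void)
{
    buf_reserve(1);
    len = 0;
    buf[0] = '\0';
}

static void buf_add(const char *s, size_t n)
{
    buf_reserve(len + n + 1);
    memcpy(buf + len, s, n);
    len += n;
    buf[len] = '\0';
}
%}

%x STRING

%%

["]                 { buf_reset(); BEGIN(STRING); }

<STRING>["]         { BEGIN(INITIAL);
                      /* A real scanner would set yylval and return a token here. */
                      printf("STRING(%s)\n", buf); }
<STRING>\\\n        { /* backslash-newline: append nothing */ }
<STRING>\\n         { buf_add("\n", 1); }
<STRING>\\t         { buf_add("\t", 1); }
<STRING>\\.         { buf_add(yytext + 1, 1);  /* other escapes: keep the character */ }
<STRING>[^"\\\n]+   { buf_add(yytext, (size_t)yyleng); }
<STRING>\n          { fprintf(stderr, "unterminated string literal\n"); BEGIN(INITIAL); }
<STRING><<EOF>>     { fprintf(stderr, "unterminated string literal\n"); return 0; }

.|\n                { /* ignore everything else in this sketch */ }

%%

int main(void)
{
    yylex();
    return 0;
}
```

Built with something like `flex strings.l && cc lex.yy.c` (the file name is arbitrary) and fed the two-line printf example from the question, this should print `STRING(This is literal.)`. Rule order matters: the specific escapes come before the catch-all `\\.` because flex picks the earlier rule when two matches have the same length.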
For more detailed examples, see "Flex / Lex Encoding Strings with Escaped Characters" and "Optimizing flex string literal parsing" (and probably many, many more).
Upvotes: 2