Inventor

Reputation: 111

Delete all Bash comments

How can I match and delete all comments from a line? Using sed, I can delete comments that start at the beginning of a line, or ones that are not inside quotes. But my script fails on examples like the following:

This one "# this is not a comment" # but this "is a comment"

Can sed handle this case? If so, what is the regex?

Example:

Upvotes: 3

Views: 247

Answers (2)

Alberto Coletta

Reputation: 1603

You can apply a lexical analyzer generator such as Flex directly to the script. Its manual has an entry titled "How can I match C-style comments?", and I think you can adapt that approach to your problem.

If you need an in-depth tutorial, you can find one here; under the "Lexical Analysis" section there is a PDF that introduces the tool and an archive with some practical examples, including "c99-comment-eater", which you can draw inspiration from.

Upvotes: 1

QWERTY21KG

Reputation: 46

If we assume that # does not start a comment when it is inside double quotes or escaped with a backslash, then we can define the following regex:

(ES|RT|QT)*C?

where

ES - escape sequence: \ followed by 1 char

\\.

RT - non-special regular text

[^"\\#]*

QT - text in quotes

"[^"]*"

C - comment starting with unescaped, unquoted hash sign # and ending with the end of line

#.*
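As a rough sketch of how these pieces combine, the building blocks can be kept in shell variables and assembled into one extended regular expression. The names ES, RT, QT and C come from the definitions above; `pattern` and `strip_comment` are hypothetical names chosen for illustration, and the use of `sed -E` is an assumption about the available sed. RT is written with `+` instead of `*` here so that the repeated group never matches an empty string; the set of lines matched is the same.

```shell
# Building blocks, written as extended regular expressions (ERE).
ES='\\.'          # escape sequence: a backslash followed by any one character
RT='[^"\\#]+'     # regular text: characters other than ", \ and #
QT='"[^"]*"'      # double-quoted text (escaped quotes inside are not handled)
C='#.*'           # comment: unquoted, unescaped # up to the end of the line

# (ES|RT|QT)*C? anchored at both ends; the non-comment part is group 1.
pattern="^(($ES|$RT|$QT)*)($C)?\$"

# Hypothetical helper: reads lines on stdin, writes them without comments.
strip_comment() {
  sed -E "s/$pattern/\\1/"
}
```

A line that does not fit the pattern at all, for example one with an unbalanced double quote, is left unchanged because the substitution does not match.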

A possible solution using sed (this relies on \| for alternation, which is a GNU sed extension in basic regular expressions):

sed 's/^\(\(\\.\|[^"\\#]*\|"[^"]*"\)*\)#.*$/\1/'
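To try the command on the line from the question and on a couple of edge cases, it can be wrapped in a small function. This is only a usage sketch, assuming GNU sed; `strip_bash_comments` is a hypothetical name, and the second and third input lines are made-up test cases rather than part of the question.

```shell
# The sed command from above, wrapped in a function for reuse.
# Assumes GNU sed, because \| is not part of POSIX basic regular expressions.
strip_bash_comments() {
  sed 's/^\(\(\\.\|[^"\\#]*\|"[^"]*"\)*\)#.*$/\1/'
}

# Feed it the question's line, a line with an escaped hash, and a line
# whose only hash sign sits inside double quotes.
strip_bash_comments <<'EOF'
This one "# this is not a comment" # but this "is a comment"
echo hello \# still code # real comment
echo "no # comment here"
EOF
```

With GNU sed, the first two lines should lose everything from the unquoted, unescaped # onward (the space before it remains), and the third line should pass through untouched. Note that the regex only models double quotes and backslash escapes: single-quoted strings and constructs such as `$#` or `${#var}`, where # is not a comment in Bash, are not covered.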

Upvotes: 1
