MacUsers

Reputation: 2229

Regex to match blank line and comments in a file

How do I ignore comments or blank/empty lines in a file when reading? I thought /^[\s#]*$/ would do the job but it didn't:

irb(main):180:0> open(inFile, 'r').each { |ln| puts ln if ln !~ /^[\s#]*$/ }
....
....
# and ..... ThIs Is A cOmMeNt .....
....
....
=> #<File:/tmp/running-instances.txt>
irb(main):181:0> 

What am I missing here? Any help would be highly appreciated. Cheers!!

PS.

I can do this separately in two steps, though:

open(inFile, 'r').each { |ln| next if ln =~ /^\s*$/; puts ln if ln !~ /#[^#]*$/ }

Upvotes: 2

Views: 6582

Answers (8)

Kris

Reputation: 4823

In Java specifically, this worked for me:

(^(\s)*#+.*$)|(^(\s)*$)

It allows arbitrary whitespace before a '#' and also matches empty lines that contain only whitespace. So

string.matches("(^\\s*#+.*$)|(^\\s*$)")

returns true for all comment lines and empty lines.

Upvotes: 0

Todd A. Jacobs

Reputation: 84413

Match Comments and End-of-Line

/
  ^      # match start of line
  \s*    # match zero or more spaces
  (\#|$) # match comment symbol or end-of-line
/x

Compressed, the regex looks like this:

/^\s*(#|$)/

Prose Explanation

The \s* means that any amount of whitespace immediately after the start of line, including none at all, can match. (\#|$) uses alternation, so either of the patterns within the parentheses can match. NB: The backslash is only needed to escape the comment symbol when using the x option, which ignores whitespace and comments in the regular expression; if you aren't using x then leave the backslash out.

The pattern will therefore match start-of-line followed by optional whitespace, which must then be immediately followed by either a comment symbol or an end-of-line. Because the match is anchored, it will not match strings like "foo # bar" or " Array#string\n" because they won't match the required pattern.
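
As a rough sketch of the behaviour described above, here is the compressed pattern run against a few made-up lines. The constant names and the sample strings are my own assumptions for illustration, not part of the original question.

```ruby
# Hypothetical constant name; the pattern is the compressed regex from this answer.
COMMENT_OR_BLANK = /^\s*(#|$)/

# Made-up sample lines, each ending in a newline as they would when read from a file.
SAMPLES = [
  "# a full-line comment\n",
  "   # an indented comment\n",
  "\n",
  "   \t\n",
  "foo # bar\n",
  "Array#string\n"
].freeze

# Keep only the lines that are neither blank nor full-line comments.
KEPT = SAMPLES.reject { |ln| ln =~ COMMENT_OR_BLANK }
KEPT.each { |ln| puts ln }
```

Only the last two samples should survive, since their first non-space character is neither a comment symbol nor the end of the line.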

Upvotes: 11

Gene

Reputation: 46990

To get back to the character class approach you were seeking...

There is no need to match the whole line. You need only the part up to the first non-space.

So the idea is to match a leading prefix of spaces that ends in something other than a comment character. Because the matcher will backtrack, you must also disallow the final character in the match from being whitespace.

open('foo.txt', 'r').each { |ln| ln.chomp!; puts ln if ln =~ /\A\s*[^#\s]/ }

I am assuming you want to allow leading spaces before comment characters. Don't forget to chomp the newline to get an accurate replay of the file.
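
If you want to try the check without touching a file, a minimal sketch could look like this. The helper name significant_lines and the sample lines are assumptions made up for illustration.

```ruby
# Hypothetical helper: returns the chomped lines whose first non-space
# character exists and is not a comment character.
def significant_lines(lines)
  lines.map(&:chomp).select { |ln| ln =~ /\A\s*[^#\s]/ }
end

# Made-up input standing in for the lines of a file.
sample = ["  # a comment\n", "\n", "  x = 1 # trailing\n", "#\n", "   \n"]
significant_lines(sample).each { |ln| puts ln }
```

Note that a trailing comment after real content is kept, because only the leading prefix of the line is examined.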

Upvotes: 0

Max

Reputation: 22375

Simpler than the other answers, I think.

/^\s*(#.*)?$/

Upvotes: 1

Dillon Hafer

Reputation: 156

open(inFile, 'r').each { |ln| puts ln if ln !~ /^(\s+|#.+)$/ }

This regex matches a line that consists entirely of one or more whitespace characters, or a hash symbol followed by one or more characters up to the end of the line. I believe [\s#]* only matches zero or more whitespace characters or hash symbols, so it stops matching as soon as the comment contains any other character, whereas adding a . after the hash symbol allows any character to follow it.
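
To see why the character class from the question falls short, here is a small sketch that tries it on three made-up lines. The constant names are assumptions for illustration only.

```ruby
# The pattern from the question: the whole line may contain only whitespace and '#'.
ORIGINAL = /^[\s#]*$/

# Made-up sample lines paired with whether ORIGINAL matches them.
CHECKS = ["\n", "  ## \n", "# a comment\n"].map do |ln|
  [ln, !(ln =~ ORIGINAL).nil?]
end

CHECKS.each { |ln, matched| puts "#{ln.inspect}: #{matched}" }
```

The first two lines match because every character is whitespace or a hash symbol. The third does not, because the letters of the comment text are outside the character class, which is why the comment line in the question was still printed.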

Upvotes: 0

Toto

Reputation: 91498

How about this regex:

/^(#.*|\s*)$/

Upvotes: 2

Brigand

Reputation: 86260

I believe this is what you wanted to do.

^( *#.*| *?)$

The space before the # is there because the comment could be indented by one or more spaces. If the line isn't a comment, we consume any spaces and check whether that's all there is.

The space could be written as [ ] for clarity.

^([ ]*#.*|[ ]*?)$

Or to include tabs:

^([ \t]*#.*|[ \t]*?)$

rubular (the blue stuff won't be matched)
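
A quick sketch of why the tab variant matters, assuming a comment that is indented with a tab. The constant names and the sample lines are made up for illustration.

```ruby
# The space-only and tab-aware patterns from this answer.
SPACES_ONLY = /^( *#.*| *?)$/
WITH_TABS   = /^([ \t]*#.*|[ \t]*?)$/

# Made-up sample lines.
TAB_COMMENT   = "\t# indented with a tab\n"
SPACE_COMMENT = "  # indented with spaces\n"

[TAB_COMMENT, SPACE_COMMENT].each do |ln|
  spaces = !(ln =~ SPACES_ONLY).nil?
  tabs   = !(ln =~ WITH_TABS).nil?
  puts "#{ln.inspect}: spaces-only=#{spaces}, with-tabs=#{tabs}"
end
```

The space-only pattern misses the tab-indented comment, while the tab-aware one catches both.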

Upvotes: 1

zs2020

Reputation: 54542

I think you should use /^[\s#].*$/, where . matches any character.

Upvotes: 0
