suhao399

Reputation: 648

Ruby Regexp character class with new line, why not match?

I want to use this regex to match any C-style block comment in a string. Why does the pattern below not match?

rblockcmt = Regexp.new "/\\*[.\s]*?\\*/"  # match block comment
p rblockcmt=~"/* 22/Nov - add fee update */"

==> nil 

Upvotes: 0

Views: 867

Answers (2)

Cary Swoveland

Reputation: 110675

It appears that you intend [.\s]*? to match any character or whitespace, zero or more times, lazily. First, whitespace characters are characters, so you don't need \s; that simplifies your expression to [.]*?. Second, if your intent is to match any character, there is no need for a character class: a bare . is enough. Third, and most importantly, a period within a character class is simply the literal character ".", not a wildcard.

You want .*? (or [^*]*).
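To make the difference concrete, here is a minimal sketch that runs the three patterns against the string from the question. The variable names (comment, broken, fixed, alt) and the %r{} literal syntax are my own choices for illustration, not part of the original code.

```ruby
comment = "/* 22/Nov - add fee update */"

# Inside a character class, "." is a literal period, so this class
# matches only periods and whitespace, never letters or digits.
broken = %r{/\*[.\s]*?\*/}
p broken =~ comment   # => nil

# Outside a character class, "." matches any character except a newline.
fixed = %r{/\*.*?\*/}
p fixed =~ comment    # => 0
p comment[fixed]      # => "/* 22/Nov - add fee update */"

# [^*]* also works, as long as the comment body contains no "*".
alt = %r{/\*[^*]*\*/}
p alt =~ comment      # => 0
```

A side note, as far as I can tell: the question builds the pattern from a double-quoted string, where "\s" is the string escape for a single space rather than the regex class \s. A regex literal such as %r{...} avoids that extra layer of escaping.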

Upvotes: 1

7stud

Reputation: 48599

And in addition to what Sir Swoveland posted, a . matches any character except a newline:

The following metacharacters also behave like character classes:

/./ - Any character except a newline.

https://ruby-doc.org/core-2.3.0/Regexp.html

If you need . to match a newline, you can specify the m flag, e.g. /.*?/m

Options

The end delimiter for a regexp can be followed by one or more single-letter options which control how the pattern can match.

/pat/i - Ignore case
/pat/m - Treat a newline as a character matched by .
...

https://ruby-doc.org/core-2.3.0/Regexp.html

Because quirks like . not matching a newline can be painful, some people specify the m option for every regex they write.
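As a rough illustration of the m flag, here is a small sketch with a block comment that spans two lines. The sample string and the names source, single_line and multi_line are made up for this example.

```ruby
source = "int x; /* first line\n   second line */ int y;"

# Without the m option, "." stops at the newline, so the closing "*/"
# on the second line is never reached.
single_line = %r{/\*.*?\*/}
p single_line =~ source   # => nil

# With the m option, "." also matches "\n", so the comment is found.
multi_line = %r{/\*.*?\*/}m
p multi_line =~ source    # => 7
p source[multi_line]      # => "/* first line\n   second line */"
```

The lazy quantifier still matters here: with m set, a greedy .* would run on to the last "*/" in the input and swallow everything between two separate comments.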

Upvotes: 3
