user133688

Reputation: 7064

What's the difference between these regexes

I'm reading Ionic's source code. I came across this regex, and I'm pretty baffled by it.

([\s\S]+?)

OK, so it's grouping on every char that is either whitespace or non-whitespace???

Why didn't they just do

(.+?)

Am I missing something?

Upvotes: 11

Views: 177

Answers (5)

asdf

Reputation: 3067

A . matches everything but the newline character. This is a well-known, documented limitation in JavaScript. The \s (whitespace) class alongside its negation \S (non-whitespace) gives a dotall-style match that includes the newline. Thus [\s\S] is commonly used in place of . when a match has to span lines.
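
A minimal sketch of the difference, assuming a made-up two-line input string (the variable names are just for illustration):

```javascript
// Hypothetical sample input: two lines separated by a newline.
const text = "first line\nsecond line";

// Without the m flag, ^ and $ anchor to the whole input.
// The dot cannot cross "\n", so this pattern finds no match at all.
const dotMatch = /^(.+?)$/.exec(text);

// [\s\S] accepts every character, newline included, so this one succeeds
// and the group captures the entire input.
const classMatch = /^([\s\S]+?)$/.exec(text);

console.log(dotMatch);               // null
console.log(classMatch[1] === text); // true
```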

Upvotes: 3

deme72

Reputation: 1153

. matches any character except carriage return \r and newline \n

The shortest way to write [\s\S] (whitespace and non-whitespace) is [^] (not nothing)
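
A quick check of both claims in JavaScript, as a sketch (the array and variable names are only for illustration):

```javascript
// The two line-break characters mentioned above.
const lineBreaks = ["\n", "\r"];

// The dot rejects both of them.
const dotResults = lineBreaks.map((ch) => /./.test(ch));

// The empty negated class [^] accepts both. It is valid in JavaScript,
// though other regex flavors may reject it.
const emptyNegatedResults = lineBreaks.map((ch) => /[^]/.test(ch));

console.log(dotResults);          // [ false, false ]
console.log(emptyNegatedResults); // [ true, true ]
```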

Upvotes: 1

theUtherSide

Reputation: 3476

The regex they used matches more characters (essentially everything).

\s matches any whitespace character (space, tab, line break, and so on).

\S matches any character that is not whitespace.

As Casimir notes:

. matches any character except newline (\n)

Upvotes: 1

Walking.In.The.Air

Reputation: 752

In many regex implementations, "." doesn't match newlines. So they use "[\s\S]" as a little hack =)

Upvotes: 3

Wiktor Stribiżew

Reputation: 626758

The . matches any symbol but a newline. In order to make it match a newline, in most languages there is a modifier (dotall, singleline). However, JS had no such modifier until ES2018 added the s (dotAll) flag, so code that has to run on older engines cannot rely on it.

Thus, a work-around is to use a [\s\S] character class that will match any character, including a newline, because \s will match all whitespace and \S will match all non-whitespace characters. Similarly, one could use [\d\D] or [\w\W].
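
To illustrate that the three classes behave the same way here, a small sketch with an assumed sample string:

```javascript
// Hypothetical input with a newline between "a" and "b".
const sample = "a\nb";

// Each class pairs a shorthand with its negation, so each matches any character.
const patterns = [/a[\s\S]b/, /a[\d\D]b/, /a[\w\W]b/];
const classResults = patterns.map((re) => re.test(sample));

// The plain dot fails on the same input because of the newline.
const dotResult = /a.b/.test(sample);

console.log(classResults); // [ true, true, true ]
console.log(dotResult);    // false
```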

Also, there is a [^] pattern to match the same thing in JS, but since it is JavaScript-specific, the regexes containing this pattern are not portable between regex flavors.

The +? lazy quantifier matches 1 or more symbols conforming to the preceding subpattern, but as few as possible. Thus, it will match just 1 symbol if used like this, at the end of the pattern.
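
A short sketch of the lazy behavior, assuming a made-up input string. When nothing follows the group it stops after one character; when something does follow, it grows only as far as needed:

```javascript
// Hypothetical input that spans two lines.
const input = "abc\ndef";

// Lazy quantifier at the end of the pattern: one character is enough.
const lazyAtEnd = /([\s\S]+?)/.exec(input)[1];

// Lazy quantifier followed by a literal "f": expands until "f" can match.
const lazyBeforeF = /([\s\S]+?)f/.exec(input)[1];

// Greedy version for comparison: takes everything.
const greedy = /([\s\S]+)/.exec(input)[1];

console.log(JSON.stringify(lazyAtEnd));   // "a"
console.log(JSON.stringify(lazyBeforeF)); // "abc\nde"
console.log(JSON.stringify(greedy));      // "abc\ndef"
```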

Upvotes: 11
