Squeez

Reputation: 959

* and ** difference

What is the difference between * and **? Why does .** fail to compile when I use Pattern.compile(".**");?

Upvotes: 4

Views: 902

Answers (4)

Pedro Lobito

Reputation: 98861

Inside a regex, * is a quantifier. If you want to use it as a literal character with no special meaning, escape it: \*.
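
To illustrate the escaping point, here is a minimal sketch. The class name EscapedStarDemo and the helper fullMatch are made up for this example; Pattern.matches and Pattern.quote are standard java.util.regex calls. Note that inside a Java string literal the backslash itself has to be doubled.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class EscapedStarDemo {
    // Returns true if the regex matches the entire input.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // "a\\*" in Java source is the regex a\* : the letter a, then a literal star.
        System.out.println(fullMatch("a\\*", "a*"));   // true
        System.out.println(fullMatch("a\\*", "aaa"));  // false
        // Unescaped, the star is a quantifier: zero or more a's.
        System.out.println(fullMatch("a*", "aaa"));    // true
        // Pattern.quote escapes a whole literal string, stars included.
        System.out.println(fullMatch(Pattern.quote("**"), "**")); // true
    }
}
```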

.* breaks down as follows: . matches any single character that is NOT a line break character (line feed, carriage return, next line, line separator, paragraph separator), and * repeats it between zero and unlimited times, as many times as possible, giving back as needed (greedy).
Because * allows zero repetitions, this regular expression may find zero-length matches. Java 8 allows a zero-length match at the position where the previous match ends.
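
The zero-length match can be observed directly with Matcher.find. This is only a sketch: the class ZeroLengthMatchDemo and the helper allMatches are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class ZeroLengthMatchDemo {
    // Collects every successive match that find() reports for the regex.
    static List<String> allMatches(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> matches = allMatches(".*", "abc");
        // First match is the greedy "abc"; the second is the empty string
        // found at the position where the first match ended.
        System.out.println(matches.size());           // 2
        System.out.println(matches.get(0));           // abc
        System.out.println(matches.get(1).isEmpty()); // true
    }
}
```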

Upvotes: 1

Wiktor Stribiżew

Reputation: 626689

See the Java Quantifiers reference:

Greedy  Reluctant   Possessive  Meaning
X?      X??         X?+         X, once or not at all
X*      X*?         X*+         X, zero or more times
X+      X+?         X++         X, one or more times
X{n}    X{n}?       X{n}+       X, exactly n times
X{n,}   X{n,}?      X{n,}+      X, at least n times
X{n,m}  X{n,m}?     X{n,m}+     X, at least n but not more than m times

There is no ** quantifier. When you use + after +, * or ? (or even {n,m}), you can create a possessive quantifier (see the table above), but adding a * quantifier after a * is considered a user error.
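
As a rough illustration of the three columns in the table, the sketch below runs the greedy, reluctant, and possessive forms of .* against the same input. The class QuantifierModesDemo and the helper fullMatch are hypothetical names for this example only.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class QuantifierModesDemo {
    // True if the regex matches the entire input.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "abc";
        // Greedy: .* grabs "abc", then gives back one character so c can match.
        System.out.println(fullMatch(".*c", input));   // true
        // Reluctant: .*? takes as little as possible and expands until c matches.
        System.out.println(fullMatch(".*?c", input));  // true
        // Possessive: .*+ grabs "abc" and never gives anything back, so c fails.
        System.out.println(fullMatch(".*+c", input));  // false
    }
}
```

All three patterns compile, because ? and + after * are defined quantifier modifiers; only a second * has no defined meaning.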

That is why .* matches zero or more characters other than a line break (without the Pattern.DOTALL modifier), while .** throws a PatternSyntaxException.

Note that online regex testers also warn you of this problem (the same warning appears at the OCPSoft regex tester):

    Dangling meta character '*' near index 2
    .**
      ^
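
The same failure can be reproduced in plain Java by catching the exception. This is a minimal sketch: DanglingStarDemo and compileError are hypothetical names, while getDescription and getIndex are standard PatternSyntaxException accessors. The exact wording of the description is implementation-specific, so the code only reports whatever the runtime provides.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class, not part of the original answer.
class DanglingStarDemo {
    // Returns null if the regex compiles, otherwise a short description of the syntax error.
    static String compileError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription() + " near index " + e.getIndex();
        }
    }

    public static void main(String[] args) {
        // .* is a valid pattern, so no error is reported.
        System.out.println(compileError(".*"));
        // .** is rejected at compile time; the error text comes from the runtime.
        System.out.println(compileError(".**"));
        // Escaping the second star makes it a literal and the pattern compiles again.
        System.out.println(compileError(".*\\*"));
    }
}
```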

Upvotes: 2

shmosel

Reputation: 50716

. means "(almost) any character".

* means "match the previous character 0 or more times".

The second * has nothing left to quantify, so it means nothing in this context.

Upvotes: 2

azurefrog

Reputation: 10945

When evaluating regular expressions, * is a metacharacter that means that the preceding character occurs 0 or more times.

When you write .**, that breaks down into .* (which means 0 or more of any character) followed by *, where there is no preceding character, so the pattern can't compile.

Upvotes: 2
