Hans De Schryver

Reputation: 43

RegEx that matches an expression NOT at the beginning of a line

I've been pounding my head on an expression for over an hour, without results. So it's time to ask for help.

In the following (multi-line) text:
Waltzes vol 15
Waltzes vol. 15
Waltzes vol. A
Waltzes, volume 15
volume 15: waltzes

The portions in bold are the matches of the RegEx I came up with thus far:
(?!^),*\s*(?:vol[ume]*\.*)\s*(?=[0-9A-Z]+)

All are correct, except the last one, which should not be included because it is at the beginning of a line.
As far as I can tell from the docs at http://www.regular-expressions.info/refadv.html, the (?!^) look-around part of the expression should exclude matches found by ,*\s*(?:vol[ume]*\.*)\s*(?=[0-9A-Z]+) at the beginning of a line, but that doesn't seem to work.

On the other hand, the expression (?!^)op[us]*\.*\s*(?=[0-9]+) works correctly and does not return a match in the last line of the following text:
Waltzes op. 15
Waltzes opus 15
opus 15: waltzes

What am I doing wrong with the first expression?

Upvotes: 1

Views: 111

Answers (3)

Ravi K Thapliyal

Reputation: 51711

Here's why your regex isn't working as expected

  • The negative lookbehind is missing <. It should be (?<!^)
  • The lookbehind should precede (?:vol[ume]*\.*) immediately
  • You need to enable multi-line (?m) (without which ^ would only match start of input)

So, your regex with these corrections becomes

(?m),*\s*(?<!^)(?:vol[ume]*\.*)\s*(?=[0-9A-Z]+)

The above works but can be improved further. The character class [ume]* would also allow matches like voleee, volmeu, etc. And rather than leaving , and . unbounded with *, each can be made optional with ?.

(?m),?\s*(?<!^)(?:vol\.?|volume)\s*(?=[0-9A-Z]+)
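A quick way to check this pattern is to run it against the sample text from the question. The sketch below assumes Python's re module, which accepts the inline (?m) flag and the zero-width lookbehind (?<!^); other regex flavors may behave differently. The names TEXT and PATTERN are only illustrative.

```python
import re

# Sample text from the question.
TEXT = """Waltzes vol 15
Waltzes vol. 15
Waltzes vol. A
Waltzes, volume 15
volume 15: waltzes"""

# (?m) makes ^ match at the start of every line. The lookbehind (?<!^)
# sits directly before the keyword, so a keyword that begins a line is
# rejected even though \s* could otherwise consume the preceding newline.
PATTERN = re.compile(r"(?m),?\s*(?<!^)(?:vol\.?|volume)\s*(?=[0-9A-Z]+)")

matches = PATTERN.findall(TEXT)
for m in matches:
    print(repr(m))
```

The first four lines each produce one match (including the leading comma or whitespace), and the last line produces none.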

Upvotes: 1

go-oleg

Reputation: 19480

If you are trying to match vol/vol./volume that is not at the beginning of a line, the following should work:

^.+(vol\.?|volume)

^.+ means match 1 or more characters from the beginning of the line

(vol\.?|volume) means match vol followed by an optional . or match volume
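To see what this pattern returns, here is a minimal sketch assuming Python's re module, where re.MULTILINE is needed so that ^ anchors at every line. The names TEXT, PATTERN and results are hypothetical. Note that the full match includes the text before the keyword, so the keyword itself is read from group 1.

```python
import re

# Sample text from the question.
TEXT = """Waltzes vol 15
Waltzes vol. 15
Waltzes vol. A
Waltzes, volume 15
volume 15: waltzes"""

# .+ requires at least one character between the line start and the
# keyword, so a line that begins with the keyword cannot match.
PATTERN = re.compile(r"^.+(vol\.?|volume)", re.MULTILINE)

# Collect (full match, captured keyword) pairs.
results = [(m.group(0), m.group(1)) for m in PATTERN.finditer(TEXT)]
for full, keyword in results:
    print(repr(full), repr(keyword))
```

In this run the fourth line captures vol rather than volume, because the first alternative that succeeds wins. Writing the group as (volume|vol\.?) would capture the whole word instead.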

Upvotes: 1

Richard Sitze

Reputation: 8463

Go with it, instead of fighting it:

^.+\s*(?:vol[ume]*\.*)\s*(?=[0-9A-Z]+)

Force a match at the beginning of the line (^), followed by one or more characters...
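As a rough illustration, the sketch below applies this pattern to each line of the question's sample text separately, assuming Python's re module. The names LINES, PATTERN and results are hypothetical. The match includes everything from the start of the line up to the volume number.

```python
import re

# Sample lines from the question.
LINES = [
    "Waltzes vol 15",
    "Waltzes vol. 15",
    "Waltzes vol. A",
    "Waltzes, volume 15",
    "volume 15: waltzes",
]

# Each line is searched on its own, so ^ is the start of that line and
# no multi-line flag is needed.
PATTERN = re.compile(r"^.+\s*(?:vol[ume]*\.*)\s*(?=[0-9A-Z]+)")

results = []
for line in LINES:
    m = PATTERN.search(line)
    # Store the matched text, or None when the line does not match.
    results.append(m.group(0) if m else None)

print(results)
```

Searching line by line is a deliberate choice in this sketch. On the whole text with re.MULTILINE, \s* can also consume a newline, so a match may run from the end of one line into a keyword at the start of the next.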

Upvotes: 0
