Fred

Reputation: 420

regex positive lookahead with if/else condition

I am trying to write a regular expression that checks whether a pattern exists and, if it does, matches everything following it; if (and only if) it does not, it should match everything after another pattern.

example lines:

http://example.com/contact
www.example.com/contact
http://www.example.com/contact

expected output in all 3 cases: example

Here is the regular expression I expected would do the job:

(?(?<=www\.).+|(?<=http:\/\/).+)(?=\.com)

which I assumed would:

  1. check if "www." is to be found
  2. if yes, would match everything following it
  3. if not, match everything following "http://"
  4. restrict the match to everything before the occurrence of ".com"

For the first two lines, the expression worked well, but in the third line www.example is matched instead of just example. Does this mean that for some reason the else command is executed although the if condition is met?
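The behaviour can be reproduced with a small Python sketch. Python's built-in `re` module does not accept a lookaround as a conditional test, so the pattern below is a stand-in (an assumption, not the original expression): the two lookbehind branches are written as a plain alternation. It shows the same effect: the engine tries each start position from left to right, and on the third line the first position where any branch succeeds is right after "http://", where only the "http://" lookbehind is satisfied.

```python
import re

# Stand-in for the conditional (assumption): try the "www." lookbehind first,
# then the "http://" lookbehind, at every candidate start position.
STAND_IN = re.compile(r"(?:(?<=www\.)|(?<=http://)).+(?=\.com)")

lines = [
    "http://example.com/contact",
    "www.example.com/contact",
    "http://www.example.com/contact",
]

matches = []
for line in lines:
    m = STAND_IN.search(line)
    # Record the start offset and the matched text for inspection.
    matches.append((m.start(), m.group(0)))
    print(line, "->", m.group(0), "at offset", m.start())

# Third line: the match starts at offset 7 (just after "http://"),
# which comes before the position that follows "www.".
```

So the "else" branch is not chosen despite the "if" being met; it is chosen at an earlier position, where the "if" test really is false.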

How can I change the above expression so that it only applies the http:// lookbehind if the www. part was not found?

Upvotes: 2

Views: 2496

Answers (1)

anubhava

Reputation: 784918

Converting my comment to answer.

You may use this regex:

^(?:https?://(?:www\.)?|www\.)\K\S+?(?=\.com(?:/|$))

RegEx Demo

RegEx Description:

  • ^: Start
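  • (?:https?://(?:www\.)?|www\.): Match http:// or https://, optionally followed by www., or match www. alone
  • \K: Reset matched information, so that the prefix consumed so far is not part of the reported match
  • \S+?: Match 1+ non-space characters (lazy)
  • (?=\.com(?:/|$)): Using a lookahead, assert that .com followed by / or end of line is ahead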
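Note that \K is not available in every regex flavor; Python's built-in `re` module, for instance, does not support it. A minimal sketch of the same idea for such engines, assuming a capture group is acceptable in place of \K (the name `extract_name` is hypothetical, chosen only for illustration):

```python
import re

# Same pattern as above, with \K replaced by a capture group around \S+?
PATTERN = re.compile(r"^(?:https?://(?:www\.)?|www\.)(\S+?)(?=\.com(?:/|$))")

def extract_name(url):
    """Return the part between the optional scheme/www. prefix and .com, or None."""
    m = PATTERN.search(url)
    return m.group(1) if m else None

lines = [
    "http://example.com/contact",
    "www.example.com/contact",
    "http://www.example.com/contact",
]

results = [extract_name(line) for line in lines]
print(results)  # ['example', 'example', 'example']
```

Because the prefix is consumed as ordinary text anchored at the start of the line, the optional www. is always skipped when present, and no conditional is needed.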

Upvotes: 2
