Nick

Reputation: 14283

regex url matching

I'm trying to write a regex that matches a URL only if there's a dot somewhere after the '/'.

Here's what I've got so far: http://regexr.com/3cu85

My regex is the following: /facebook.com\/.*[.]/gm and I'm testing with these URLs:

facebook.com
facebook.com/
facebook.com/test.user 

www.facebook.com
www.facebook.com/
www.facebook.com/test.user

https://www.facebook.com
https://www.facebook.com/
https://www.facebook.com/test.user

The problem is that I need to match the full URL, and as you can see, the match starts from the word "facebook".

I tried different options, but none worked for me.

Thanks for any help

Upvotes: 1

Views: 281

Answers (1)

Wiktor Stribiżew

Reputation: 626835

My suggestion is

(https?:\/\/)?(w{3}\.)?facebook\.com\/[^\/]*\..*

See the regex demo. (In the demo, \n is added to the negated character class, giving [^\/\n], so that each match stays on its own line when the URLs are listed on separate lines. If you test individual strings, the \n is not necessary.)
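To see why the \n matters for multiline input, here is a minimal sketch in JavaScript. The two-line sample string and the variable names are only for illustration; they are not taken from the demo.

```javascript
// Illustrative sample: two of the question's test URLs on separate lines.
const text = "facebook.com/\nfacebook.com/test.user";

// Pattern as written above: [^\/] can also consume a newline.
const loose = /(https?:\/\/)?(w{3}\.)?facebook\.com\/[^\/]*\..*/g;
// Same pattern with \n added to the negated class, as in the demo.
const perLine = /(https?:\/\/)?(w{3}\.)?facebook\.com\/[^\/\n]*\..*/g;

const looseMatches = text.match(loose);
const perLineMatches = text.match(perLine);

// A single match that starts on line 1 and runs across the line break.
console.log(looseMatches);
// Only the second line: [ 'facebook.com/test.user' ]
console.log(perLineMatches);
```

Without \n in the class, the match that starts at "facebook.com/" on the first line continues into the second line to find its dot, so the two lines are reported as one match.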

This regex matches:

  • (https?:\/\/)? - optional (one or zero) occurrence of http:// or https://
  • (w{3}\.)? - optional (one or zero) occurrence of www.
  • facebook\.com - literal sequence facebook.com
  • \/ - a literal /
  • [^\/]* - zero or more characters other than / (BETTER: use [^\/.]* to match any char but a . and / to avoid redundant backtracking)
  • \. - a literal .
  • .* - any 0+ characters but a newline (BETTER: since the URL cannot have a space (usually), you can replace it with \S* matching zero or more non-whitespace characters).

So, a better alternative (the added \b is a word boundary, so facebook is not matched in the middle of a longer word):

(https?:\/\/)?(w{3}\.)?\bfacebook\.com\/[^\/.]*\.\S*
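As a quick check, the pattern above can be run against the nine URLs from the question, one string at a time. This is only a sketch: fullMatch is a hypothetical helper name, and comparing the matched text to the whole input is one possible way to confirm that the full URL was matched.

```javascript
// The "better alternative" pattern, written as a JavaScript regex literal.
const pattern = /(https?:\/\/)?(w{3}\.)?\bfacebook\.com\/[^\/.]*\.\S*/;

// The test URLs from the question.
const urls = [
  "facebook.com",
  "facebook.com/",
  "facebook.com/test.user",
  "www.facebook.com",
  "www.facebook.com/",
  "www.facebook.com/test.user",
  "https://www.facebook.com",
  "https://www.facebook.com/",
  "https://www.facebook.com/test.user",
];

// Hypothetical helper: true when the pattern matches and the match
// covers the entire string, scheme and "www." included.
function fullMatch(url) {
  const m = url.match(pattern);
  return m !== null && m[0] === url;
}

const matched = urls.filter(fullMatch);
console.log(matched);
// [ 'facebook.com/test.user',
//   'www.facebook.com/test.user',
//   'https://www.facebook.com/test.user' ]
```

Only the three URLs with a dot after the slash match, and each match starts at the beginning of the URL rather than at the word "facebook".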

Upvotes: 1
