Reputation: 14283
I'm trying to write a regex that matches a URL only if there's a dot after the '/'.
Here's what I've got so far: http://regexr.com/3cu85
My regex is the following: /facebook.com\/.*[.]/gm
and I'm testing with these URLs:
facebook.com
facebook.com/
facebook.com/test.user
www.facebook.com
www.facebook.com/
www.facebook.com/test.user
https://www.facebook.com
https://www.facebook.com/
https://www.facebook.com/test.user
The problem is that I need to match the full URL, and as you can see, the match starts at the word "facebook".
I tried different options, but none worked for me.
Thanks for any help
Upvotes: 1
Views: 281
Reputation: 626835
My suggestion is
(https?:\/\/)?(w{3}\.)?facebook\.com\/[^\/]*\..*
See the regex demo (the \n is added to the negated character class [^\/] so as to match the URLs on separate lines only; if you test individual strings, the \n is not necessary).
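As a quick check, here is a minimal sketch that runs the suggested pattern over the nine test URLs from the question, joined into one multi-line string. The variable names (input, pattern, matches) are only illustrative, and the \n inside the negated class is the demo-only addition mentioned above.

```javascript
// The nine test URLs from the question, one per line.
const input = [
  "facebook.com",
  "facebook.com/",
  "facebook.com/test.user",
  "www.facebook.com",
  "www.facebook.com/",
  "www.facebook.com/test.user",
  "https://www.facebook.com",
  "https://www.facebook.com/",
  "https://www.facebook.com/test.user",
].join("\n");

// Suggested pattern, with \n added to the negated class so that a match
// cannot run from one line into the next.
const pattern = /(https?:\/\/)?(w{3}\.)?facebook\.com\/[^\/\n]*\..*/g;

// With the g flag, match() returns every full match as an array of strings.
const matches = input.match(pattern);
console.log(matches);
// [ 'facebook.com/test.user',
//   'www.facebook.com/test.user',
//   'https://www.facebook.com/test.user' ]
```

Only the three URLs with a dot after the slash are returned, and each match now includes the optional scheme and www. prefix.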
This regex matches:
(https?:\/\/)? - optional (one or zero) occurrence of http:// or https://
(w{3}\.)? - optional (one or zero) occurrence of www
facebook\.com - literal sequence facebook.com
\/ - a literal /
[^\/]* - zero or more characters other than / (BETTER: use [^\/.]* to match any char but a . and / to avoid redundant backtracking)
\. - a literal .
.* - any 0+ characters but a newline (BETTER: since the URL cannot have a space (usually), you can replace it with \S* matching zero or more non-whitespace characters).
So, a better alternative:
(https?:\/\/)?(w{3}\.)?\bfacebook\.com\/[^\/.]*\.\S*
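A minimal sketch of how the improved pattern behaves when each URL is tested as an individual string (so no \n is needed in the character class). The names urls, better, results and fullMatches are illustrative, and the "notfacebook.com" string is a made-up input, not one from the question, used only to show what the \b adds.

```javascript
// The nine test URLs from the question, tested one at a time.
const urls = [
  "facebook.com",
  "facebook.com/",
  "facebook.com/test.user",
  "www.facebook.com",
  "www.facebook.com/",
  "www.facebook.com/test.user",
  "https://www.facebook.com",
  "https://www.facebook.com/",
  "https://www.facebook.com/test.user",
];

// Improved pattern: [^\/.]* stops at the first dot, \S* stops at whitespace,
// and \b keeps "facebook" from matching inside a longer word.
const better = /(https?:\/\/)?(w{3}\.)?\bfacebook\.com\/[^\/.]*\.\S*/;

// Record the matched text (or null) for every URL.
const results = urls.map((url) => {
  const m = url.match(better);
  return { url, match: m ? m[0] : null };
});

// URLs where the match covers the whole string.
const fullMatches = results.filter((r) => r.match === r.url).map((r) => r.url);
console.log(fullMatches);
// [ 'facebook.com/test.user',
//   'www.facebook.com/test.user',
//   'https://www.facebook.com/test.user' ]

// Hypothetical lookalike host: \b rejects it because "t" and "f" are both word characters.
const lookalike = "notfacebook.com/test.user".match(better);
console.log(lookalike); // null
```

No g flag is used here, so repeated calls do not depend on the regex's lastIndex state.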
Upvotes: 1