Reputation: 55
I need a rather greedy regex that aggressively matches strings that do not begin with a protocol such as "http://" or "ftp://" and that also do not begin with "www" (or both combined, of course). I'm fairly new to Java and regex, but I've managed to come up with this one (which doesn't work for me):
([\w'-]+)\.(com|info|net|org).+
However, it doesn't seem to match "example.com". It does seem to match "example.com/index.php?q=somequery#something". I don't really understand how to create a regex that refuses to match when the string begins with a particular series of characters, in my case "www" or "http://".
Any help is appreciated.
(P.S. I've looked for duplicates of this question, but I couldn't find one that matches it exactly. Very sorry if this is a dupe.)
Upvotes: 0
Views: 228
Reputation: 39395
Your regex has .+ at the end, which means any character except \n (1 or more times). But your sample example.com doesn't have anything after the .com. That's why your regex doesn't match the sample.
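To see this concretely, here is a minimal sketch that runs the pattern from the question against both sample strings. It assumes the whole input is tested with Matcher.matches(); the class and method names (PlusQuantifierDemo, matchesWithPlus) are made up for illustration.

```java
import java.util.regex.Pattern;

public class PlusQuantifierDemo {
    // Pattern copied from the question (backslashes doubled for a Java string).
    // The trailing ".+" requires at least one more character after the TLD.
    static final Pattern WITH_PLUS =
            Pattern.compile("([\\w'-]+)\\.(com|info|net|org).+");

    // Hypothetical helper: true only if the entire input matches the pattern.
    static boolean matchesWithPlus(String input) {
        return WITH_PLUS.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Nothing follows ".com", so ".+" has nothing to consume.
        System.out.println(matchesWithPlus("example.com"));
        // Here ".+" can consume "/index.php?q=somequery#something".
        System.out.println(matchesWithPlus("example.com/index.php?q=somequery#something"));
    }
}
```

The first call should report no match and the second a match, which is the behaviour described in the question.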
Replace the .+ with .* and it will work for you. FYI, .* means any character except \n (0 or more times).
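A minimal sketch of the corrected pattern follows. It also adds a negative lookahead, (?!www\.|\w+://), as one possible way to state the "must not start with www or a protocol" rule explicitly. The lookahead is my assumption about what the question is after, not something taken from the original pattern, and the names (StarQuantifierDemo, isBareDomain) are hypothetical. The whole input is again tested with Matcher.matches().

```java
import java.util.regex.Pattern;

public class StarQuantifierDemo {
    // ".*" instead of ".+": the part after the TLD may now be empty.
    // "(?!www\\.|\\w+://)" is an assumed addition: a negative lookahead that
    // rejects inputs starting with "www." or with something like "http://".
    static final Pattern BARE_DOMAIN =
            Pattern.compile("(?!www\\.|\\w+://)([\\w'-]+)\\.(com|info|net|org).*");

    // Hypothetical helper: true only if the entire input matches the pattern.
    static boolean isBareDomain(String input) {
        return BARE_DOMAIN.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "example.com",
            "example.com/index.php?q=somequery#something",
            "www.example.com",
            "http://example.com",
            "ftp://example.com"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isBareDomain(s));
        }
    }
}
```

With this sketch the first two samples should match and the last three should not. If the rule for what counts as a protocol or a "www" prefix is different in your data, adjust the alternatives inside the lookahead.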
Upvotes: 1