Reputation: 480
I want to create a regular expression that matches URLs, except URLs containing "youtube".
I have written this regexp, which matches all URLs:
(www+\.)?[\w-]+s{0,3}[/\.,;:!]{1,3}\s{0,3}(r[o0]|n[e3]t|lt|c[o0]m|[i!]nf[o0]|[o0]rg|b[i!][z2]|ru|[e3]du)(\/)?
But I want to extend the regular expression so that it does not match if the URL contains 'youtube'.
We have a big system which filters the sentences we receive, and we apply several regular expressions to each sentence. I want a regular expression that says: this sentence contains a URL but does not contain "youtube".
Is it possible?
Thanks
Upvotes: 1
Views: 870
Reputation: 43023
I would do this:
(www+\.)?(?!youtube)([\w-]+s{0,3})[/\.,;:!]{1,3}\s{0,3}(r[o0]|n[e3]t|lt|c[o0]m|[i!]nf[o0]|[o0]rg|b[i!][z2]|ru|[e3]du)(\/)?
youtube.com => No Match
test.n3t => Match
wwwwwww.coucous::.3du => Match
utube;;; r0 => Match
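A minimal Java sketch for checking the four cases above, assuming the pattern is applied to the whole candidate string with `matches()`. The class and method names are hypothetical. One caveat: the lookahead only guards the position where it sits, so with `find()` the engine can retry one character later and still find "outube.com" inside "youtube.com".

```java
import java.util.regex.Pattern;

final class NonYoutubeUrlPattern {
    // Pattern from this answer, with backslashes doubled for a Java string literal.
    static final Pattern URL_WITHOUT_YOUTUBE = Pattern.compile(
        "(www+\\.)?(?!youtube)([\\w-]+s{0,3})[/\\.,;:!]{1,3}\\s{0,3}"
        + "(r[o0]|n[e3]t|lt|c[o0]m|[i!]nf[o0]|[o0]rg|b[i!][z2]|ru|[e3]du)(\\/)?");

    // Hypothetical helper: true when the WHOLE candidate matches the pattern.
    static boolean isAllowedUrl(String candidate) {
        return URL_WITHOUT_YOUTUBE.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"youtube.com", "test.n3t", "wwwwwww.coucous::.3du", "utube;;; r0"};
        for (String s : samples) {
            System.out.println(s + " => " + (isAllowedUrl(s) ? "Match" : "No Match"));
        }
    }
}
```

For whole sentences, where `find()` is needed, the lookahead would have to cover the entire input rather than a single position.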
Upvotes: 4
Reputation: 770
Here is a good example of "Regex match all url except youtube ones":
https://stackoverflow.com/a/6681321/2413470
(?!\S+youtube\.com)((?<!\S)(((f|ht){1}tp[s]?:\/\/|(?<!\S)www\.)[-a-zA-Z0-9@:%_\+.~#?&\/\/=]+))
If this regex is not useful for you, let me know.
Upvotes: 0
Reputation: 121720
Don't use a regex for this; use URI:
final URI uri = new URI(inputString);
// test against this URI's `.getHost()`, or `.getPath()`; whatever is relevant
Imprint this into your head using red iron/nitric acid(1): every time you have to do content checking of a URL or any URI in pure Java, use URI. Not regexes. URI will parse the thing for you.
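As a rough sketch of the host test suggested above, assuming the input is a single well-formed URI string rather than a free-form sentence (the class and method names are hypothetical):

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

final class HostCheck {
    // Hypothetical helper: true when the input parses as a URI whose host contains "youtube".
    static boolean isYoutubeHost(String input) {
        try {
            String host = new URI(input).getHost();
            // getHost() is null when there is no authority component,
            // e.g. "www.youtube.com" written without a scheme.
            return host != null && host.toLowerCase(Locale.ROOT).contains("youtube");
        } catch (URISyntaxException e) {
            // Assumption: input that does not parse as a URI is treated as "not YouTube".
            return false;
        }
    }
}
```

This only works once a URL has been isolated from the sentence; obfuscated forms such as "utube;;; r0" do not parse as URIs and would need other handling.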
Oh, and another thing: unlike URL, when compared with .equals(), URI will not attempt to resolve the hostname. This is no joke. Using URLs as keys into a map, or members of a set, is asking for trouble... Fortunately, URL has a .toURI() method.
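A small sketch of that advice, assuming the URLs already exist as `java.net.URL` objects: convert each one with `.toURI()` before it goes into a set, so membership is decided by `URI.equals()` and `URI.hashCode()` instead of by `URL`. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.HashSet;
import java.util.Set;

final class UriKeys {
    // Hypothetical helper: collects URLs as URIs so that the set never
    // relies on URL.equals()/hashCode(), which may resolve hostnames.
    static Set<URI> toUriSet(Iterable<URL> urls) {
        Set<URI> result = new HashSet<>();
        for (URL url : urls) {
            try {
                result.add(url.toURI());
            } catch (URISyntaxException e) {
                // Assumption: a URL that is not a valid URI is reported as a bad argument.
                throw new IllegalArgumentException("Not a valid URI: " + url, e);
            }
        }
        return result;
    }
}
```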
(1) take your pick
Upvotes: 1
Reputation: 47290
Do you need a regex at all? Assuming yourUrl is a String:
!(yourUrl.contains("youtube"))
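For the sentence filter described in the question, this check could be combined with the existing URL regex in two steps. The sketch below assumes a lowercase substring test is acceptable; the class and method names are hypothetical, and the pattern is copied from the question as posted (including `s{0,3}`, which may have been meant as `\s{0,3}`).

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class SentenceFilter {
    // URL pattern from the question, unchanged apart from Java string escaping.
    static final Pattern URL_PATTERN = Pattern.compile(
        "(www+\\.)?[\\w-]+s{0,3}[/\\.,;:!]{1,3}\\s{0,3}"
        + "(r[o0]|n[e3]t|lt|c[o0]m|[i!]nf[o0]|[o0]rg|b[i!][z2]|ru|[e3]du)(\\/)?");

    // True when the sentence contains something URL-like and never mentions "youtube".
    static boolean hasUrlButNoYoutube(String sentence) {
        boolean hasUrl = URL_PATTERN.matcher(sentence).find();
        boolean mentionsYoutube = sentence.toLowerCase(Locale.ROOT).contains("youtube");
        return hasUrl && !mentionsYoutube;
    }
}
```

Keeping the two checks separate leaves the URL regex untouched and makes the exclusion rule easy to read.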
Upvotes: 0
Reputation: 9
A similar exclusion is mentioned here (regex match urls NOT containing given set of strings). Just change your regex to include a negative look-ahead.
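One way to apply a negative look-ahead at the sentence level, sketched in Java under a few assumptions: the sentence is a single line, matching should be case-insensitive, and the look-ahead is anchored at the start so it covers the whole input. The class and method names are hypothetical; the URL part is the question's pattern.

```java
import java.util.regex.Pattern;

final class LookaheadFilter {
    // (?i)               case-insensitive matching for the whole pattern
    // ^(?!.*youtube)     reject the sentence if "youtube" appears anywhere on the line
    // .*                 skip ahead to wherever the URL-like text starts
    static final Pattern URL_BUT_NO_YOUTUBE = Pattern.compile(
        "(?i)^(?!.*youtube).*"
        + "(www+\\.)?[\\w-]+s{0,3}[/\\.,;:!]{1,3}\\s{0,3}"
        + "(r[o0]|n[e3]t|lt|c[o0]m|[i!]nf[o0]|[o0]rg|b[i!][z2]|ru|[e3]du)(\\/)?");

    static boolean accepts(String sentence) {
        return URL_BUT_NO_YOUTUBE.matcher(sentence).find();
    }
}
```

Because `^` only matches at the start of the input here, the engine cannot retry from a later position and skip past the look-ahead.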
Upvotes: 0