Mark

Reputation: 18787

Regex to match all URLs, excluding .css, .js resources

I'm looking for a regular expression that matches all URLs except those with file extensions I want to exclude.

For example, resources ending with .css, .js, .font, .png, .jpg, etc. should be excluded.

Alternatively, I could put all of these resources in the same folder and exclude URLs that point to this folder, like:

.*\/(?!content\/media)\/.*

But that doesn't work! How can I improve this regex to match my criteria?

e.g.

Match:

http://www.myapp.com/xyzOranotherContextRoot/rest/user/get/123?some=par#/other

No match:

http://www.myapp.com/xyzOranotherContextRoot/content/media/css/main.css?7892843

Upvotes: 2

Views: 2144

Answers (2)

Mark

Reputation: 18787

The correct solution is:

^((?!\/content\/media\/).)*$

see: https://regex101.com/r/bD0iD9/4

Inspired by "Regular expression to match a line that doesn't contain a word?"
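
As a quick check, the pattern above can be run against the two sample URLs from the question. This is only a minimal sketch in JavaScript; the variable names are made up for illustration, and other regex flavors may need different escaping.

```javascript
// Pattern from this answer: each character is consumed only if
// "/content/media/" does not start at that position.
const notMedia = /^((?!\/content\/media\/).)*$/;

// The two sample URLs from the question.
const urls = [
  'http://www.myapp.com/xyzOranotherContextRoot/rest/user/get/123?some=par#/other',
  'http://www.myapp.com/xyzOranotherContextRoot/content/media/css/main.css?7892843',
];

// true means "keep this URL", false means "it points into content/media".
const results = urls.map((url) => notMedia.test(url));

console.log(results); // [ true, false ]
```

Because the lookahead is re-checked before every single character, the folder name cannot slip past it anywhere in the string.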

Upvotes: 1

cxw

Reputation: 17041

Two things:

First, the ?! negative lookahead doesn't consume any characters from the input, so right now your pattern is trying to match two consecutive slashes. Add [^\/]+ before the trailing slash. For example:

.*\/(?!content\/media)[^\/]+\/.*

(edit) Second, the .*s at the beginning and end match too much. Try tightening those up, or adding more detail to content\/media. As it stands, content/media can be swallowed by one of the .*s and never be checked against the lookahead.
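
To make both points concrete, here is a small JavaScript sketch run against the "No match" URL from the question. The variable names and the extra capture group around [^\/]+ are only there for illustration.

```javascript
// The excluded URL from the question.
const mediaUrl =
  'http://www.myapp.com/xyzOranotherContextRoot/content/media/css/main.css?7892843';

// Original pattern: the lookahead consumes nothing, so it needs "//".
const original = /.*\/(?!content\/media)\/.*/;

// Pattern with [^\/]+ added; the capture group shows which path
// segment the lookahead ended up being tested against.
const withSegment = /.*\/(?!content\/media)([^\/]+)\/.*/;

// The "//" after "http:" satisfies the original pattern.
const originalMatches = original.test(mediaUrl);

// The leading .* swallows "content/media", so the lookahead is only
// checked in front of a later segment.
const segmentMatch = mediaUrl.match(withSegment);
const checkedSegment = segmentMatch ? segmentMatch[1] : null;

console.log(originalMatches); // true
console.log(checkedSegment);  // "css"
```

Both patterns still accept the URL that should be rejected, which is why the wildcards need to be tightened or the check moved out of the regex.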

Suggestions:

  1. Use your original idea and test against the extensions: ^.*\.(?!css|js|font|png|jpeg)[a-z0-9]+$ (with the case-insensitive flag).
  2. Instead of doing everything in one regular expression, use a regex that pulls out any URL (e.g., https?:\/\/\S+, perhaps?) and then test each one you find with String.indexOf: if (candidateURL.indexOf('content/media') === -1) { /* do something with the OK URL */ }
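
A rough JavaScript sketch of both suggestions follows. The input text and the names okUrls and hasBlockedExtension are hypothetical. The extension check is a variant rather than the exact regex from suggestion 1: it assumes the standard URL class is available (browsers, Node.js) and looks only at the path, so query strings and fragments don't get in the way.

```javascript
// Hypothetical input text containing the two sample URLs from the question.
const text =
  'See http://www.myapp.com/xyzOranotherContextRoot/rest/user/get/123?some=par#/other ' +
  'and http://www.myapp.com/xyzOranotherContextRoot/content/media/css/main.css?7892843';

// Suggestion 2, step 1: pull out anything that looks like a URL (deliberately loose).
const candidates = text.match(/https?:\/\/\S+/g) || [];

// Suggestion 2, step 2: keep only the URLs that do not point into the media folder.
const okUrls = candidates.filter((url) => url.indexOf('content/media') === -1);

// Variant of suggestion 1: test the extension of the path only.
const blockedExt = /\.(css|js|font|png|jpe?g)$/i;
function hasBlockedExtension(url) {
  return blockedExt.test(new URL(url).pathname);
}

console.log(okUrls.length);                      // 1
console.log(hasBlockedExtension(candidates[1])); // true
```

Splitting extraction from filtering keeps each regex simple, at the cost of a second pass over the candidates.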

Upvotes: 0
