James

Reputation: 31748

Regex to find a string within a URL, case-insensitive

I have the url:

http://primarydomain.com/sites/secondarydomain/?foo=bar

What regular expression could I use to match sites/secondarydomain in the URL, case-insensitively? (This is for a rule in a web.config file, but it requires standard regex.)

To put it into context, I am writing a web.config URL rewrite rule to remove sites/secondarydomain from all URLs (because multiple sites are hosted on the same package).

<rule name="Remove full hosting path">
    <match url="***Regex goes here***" ignoreCase="true"/>
    <action type="Redirect" url="http://secondary.com/{R:1}" redirectType="Permanent" />
</rule>

I am looking to match only the directories (not the query string) in order to redirect the user (hence removing the sites/secondarydomain).

Update: It looks like I want to rewrite the URL rather than redirect. Here is the current web.config rule, which doesn't quite work:

    <rule name="TestRule">
      <match url=".*" />
      <conditions>
        <add input="{PATH_INFO}" pattern="^(/hostedsites/clemones_htdocs)(/.*)"/>
      </conditions>
      <action type="Rewrite" url="\{C:2}" appendQueryString="true" />
    </rule>

My secondary domain is http://clemones.com/ and the path I'm trying to get rid of is http://clemones.com/hostedsites/clemones_htdocs/

For testing, http://clemones.com/shizzle works as a destination (hence, sadly, http://clemones.com/hostedsites/clemones_htdocs/shizzle also works).

Thanks in advance

Upvotes: 1

Views: 2797

Answers (4)

swannee

Reputation: 3466

Have you tried the rule below? It applies the regex only to the path, not the root URL:

 <rule name="TestRule">
     <match url=".*" />
     <conditions>
         <add input="{PATH_INFO}" pattern="^(/sites/secondarydomain)(/.*)"/>
     </conditions>
     <action type="Rewrite" url="\{C:2}" appendQueryString="true" />
 </rule>

The condition yields multiple capture groups. {C:2} holds everything that comes after "/sites/secondarydomain" (leading slash included) but not the query string, which is appended because appendQueryString="true" is set.

This lets you break out the parts you want to act on, so yes, it is different from just applying a regular expression to the entire URL.
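
To see what the condition's capture groups contain, here is a minimal Python sketch. It assumes Python's `re` module as a stand-in for the IIS regex engine (the two are not identical), and `split_hosted_path` is a hypothetical helper name, not part of IIS.

```python
import re

# Same pattern as the condition above; re.IGNORECASE makes the match case-insensitive.
CONDITION = re.compile(r"^(/sites/secondarydomain)(/.*)", re.IGNORECASE)


def split_hosted_path(path_info):
    """Return (group 1, group 2) for a path, or None when the condition does not match."""
    match = CONDITION.match(path_info)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Group 2 corresponds to {C:2}: the rest of the path, leading slash included.
print(split_hosted_path("/Sites/SecondaryDomain/index.aspx"))
print(split_hosted_path("/other/index.aspx"))
```

A path with nothing after the hosting folder (no trailing slash) does not match, because the second group requires at least a "/".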

Here is an article that explains how this works: http://weblogs.asp.net/owscott/archive/2010/01/26/iis-url-rewrite-hosting-multiple-domains-under-one-site.aspx

Upvotes: 1

goTo-devNull

Reputation: 9372

A combination of lookbehind and lookahead will match the string you want:

(?<=.\w+/)\w+/\w+(?=/.*)

That being said, the {R:1} in your example really looks like a Regex backreference, so maybe that's why things aren't working as expected. If this is true, you may need something like this instead:

.\w+/(\w+/\w+)

I've never done IIS rewriting, so YMMV. Both regular expressions work (tested) on the examples you've given so far, and on more generic URLs like:

http://primarydomain.com/hostedsites/clemones_htdocs/index.aspx?foo=bar
http://anydomain.net/sites/secondarydomain/index.aspx?foo=bar
...
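
As a rough check of the capturing-group pattern `.\w+/(\w+/\w+)`, the sketch below runs it against full URLs with Python's `re` module. That is an assumption: IIS uses its own regex engine, and its `match url` attribute is documented as receiving the path without host, leading slash, or query string, so the pattern may behave differently there. `hosting_path` is a hypothetical helper name.

```python
import re

# Capturing-group pattern from above, compiled case-insensitively.
PATTERN = re.compile(r".\w+/(\w+/\w+)", re.IGNORECASE)


def hosting_path(url):
    """Return group 1 of the first match, or None when nothing matches."""
    match = PATTERN.search(url)
    return match.group(1) if match else None


for url in (
    "http://primarydomain.com/sites/secondarydomain/?foo=bar",
    "http://primarydomain.com/hostedsites/clemones_htdocs/index.aspx?foo=bar",
):
    print(hosting_path(url))
```

On a bare path such as `sites/secondarydomain/` the same pattern finds no match, which is worth checking before using it in a rule.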

Upvotes: 0

CBRRacer

Reputation: 4659

If the domain is always going to be http://primarydomain.com/sites/, then I would approach it like this:

match url="http://primarydomain.com/sites/([A-Za-z0-9_]+)/.*";
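
A quick way to try this idea outside IIS is sketched below in Python (an assumption; IIS has its own regex engine). The dots are escaped here so they match literally, and `site_folder` is a hypothetical helper name.

```python
import re

# Fixed prefix with escaped dots; the group captures the folder directly under /sites/.
PATTERN = re.compile(
    r"http://primarydomain\.com/sites/([A-Za-z0-9_]+)/.*", re.IGNORECASE
)


def site_folder(url):
    """Return the folder name under /sites/, or None when the URL does not match."""
    match = PATTERN.match(url)
    return match.group(1) if match else None


print(site_folder("http://PrimaryDomain.com/Sites/SecondaryDomain/?foo=bar"))
```

Note that the pattern requires a slash after the folder name, so a URL ending in `/sites/secondarydomain` would not match.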

Upvotes: 0

Squirrelsama

Reputation: 5570

Try a lookbehind (?<=(http://primarydomain.com/))[^\b]*

EDIT:

If you want to exclude the querystring... (?<=(http://primarydomain.com/))[^?]*

If you want to be more strict for whatever reason (like only allowing alphabet characters in the directory), you can try something like this (?<=(http://primarydomain.com/))[a-zA-Z/]*[a-zA-Z]
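
For illustration, the last two patterns can be exercised in Python, whose `re` module supports fixed-width lookbehind; whether the IIS engine accepts lookbehind is not confirmed here. The helper name `path_after_host` is hypothetical, the inner capture group is dropped for brevity, and the dot in the host is escaped so it matches literally.

```python
import re

# Lookbehind: the match must start right after the host prefix.
PREFIX = r"(?<=http://primarydomain\.com/)"

# Everything after the host, up to but not including the query string.
NO_QUERY = re.compile(PREFIX + r"[^?]*", re.IGNORECASE)
# Stricter: letters and slashes only, and the match must end on a letter.
LETTERS_ONLY = re.compile(PREFIX + r"[a-zA-Z/]*[a-zA-Z]", re.IGNORECASE)


def path_after_host(pattern, url):
    """Return the text matched right after the host, or None."""
    match = pattern.search(url)
    return match.group(0) if match else None


url = "http://primarydomain.com/sites/secondarydomain/?foo=bar"
print(path_after_host(NO_QUERY, url))      # sites/secondarydomain/
print(path_after_host(LETTERS_ONLY, url))  # sites/secondarydomain
```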

Upvotes: 0
