Marcus

Reputation: 9439

Regex help - stripping out the domain name

Say my URL is

What would the regex be to get

Upvotes: 0

Views: 1414

Answers (2)

Kash

Reputation: 9019

Code

Using JavaScript:

To return "/somedirectory/somefile.php":

var inputString = "https://foo.bar.com/somedirectory/somefile.php";
var regex = /https?:[\/]{2}\S*?(\/\S*)/;
var outputString = inputString.replace(regex,"$1");
alert(outputString);

To return "somedirectory/somefile.php", change the regex like this:

var regex = /https?:[\/]{2}\S*?\/(\S*)/;

RegEx

The core regex is as follows. This may need to be tweaked a bit based on which language you are using:

https?:[\/]{2}\S*?(\/\S*)   

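To extend this for ftp sites, you could use the following. The scheme alternation is written as a non-capturing group so that the path stays in $1; with a plain (ht|f) group, $1 would hold "ht" or "f" and the path would move to $2:

(?:ht|f)tps?:[\/]{2}\S*?(\/\S*)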
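As a rough sketch of how the extended pattern might be wrapped for reuse: the helper name extractPath is made up for illustration and is not part of the answer above. It assumes the scheme is written as the non-capturing group (?:ht|f), so the path is always capture group 1, and it adds a leading ^ so the match has to start at the beginning of the string. It returns null when nothing matches, which includes URLs with no path after the host.

```javascript
// Hypothetical helper (name chosen for illustration only).
// Returns the path portion of an http(s)/ftp(s) URL, or null if the
// input does not match or has no path after the host.
function extractPath(url) {
  // (?:ht|f) is non-capturing, so the path is capture group 1.
  var regex = /^(?:ht|f)tps?:[\/]{2}\S*?(\/\S*)/;
  var match = regex.exec(url);
  return match ? match[1] : null;
}

console.log(extractPath("https://foo.bar.com/somedirectory/somefile.php"));
console.log(extractPath("ftp://foo.bar.com/pub/file.txt"));
console.log(extractPath("foo.bar.com/somefile.php")); // no scheme, so null
```

Using exec and checking for null, rather than replace, avoids silently returning the original string when the input is not a URL of the expected form.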

Upvotes: 1

user554546

Reputation:

Assuming that your URL always starts with either http:// or https://, this should work (and since you didn't specify a language, here's an implementation in Perl):

use strict;
use warnings;

my @urls=("https://foo.bar.com/somedirectory/somefile.php", "http://abc.bar.co.uk/somedirectory/somefile.php");

foreach my $url(@urls)
{
  if($url=~/^http(?:s)?:\/\/[^\/]+(\/.*)$/)
  {
    print "$1\n";
  }
  else
  {
    print "$url did not match!\n";
  }
}

The output is:

/somedirectory/somefile.php
/somedirectory/somefile.php
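For comparison with the JavaScript answer, the same anchored pattern could be sketched in JavaScript roughly as follows. The function name pathAfterHost is invented for this example. Like the Perl version, it assumes the URL starts with http:// or https:// and reports inputs that do not match.

```javascript
// Hypothetical helper mirroring the Perl pattern: the host is matched
// with [^\/]+ (everything up to the first slash after the scheme).
function pathAfterHost(url) {
  var match = /^https?:\/\/[^\/]+(\/.*)$/.exec(url);
  return match ? match[1] : null;
}

var urls = [
  "https://foo.bar.com/somedirectory/somefile.php",
  "http://abc.bar.co.uk/somedirectory/somefile.php"
];

urls.forEach(function (url) {
  var path = pathAfterHost(url);
  console.log(path !== null ? path : url + " did not match!");
});
```

Matching the host with [^\/]+ instead of a lazy \S*? makes the stopping point explicit: the host ends at the first slash.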

Upvotes: 2
