Reputation: 183
I'm trying to strip https, http, www, /, : and . from a domain name to create a user account folder on my system. So I need to turn a URL like "https://www.My-Domain.com/" into "My-Domaincom". I'm close, but I just can't seem to get it to work.
our $DomainAccount = lc($ENV{HTTP_REFERER});
$DomainAccount =~ s/^http:\/\/|^https:\/\///;
$DomainAccount =~ s/^www\.|(/.)//;
Upvotes: 1
Views: 873
Reputation: 627082
You just need to match the http:// or https:// that is optionally followed by www., match and capture the host part up to the first /, then match the rest and replace everything with the backreference to the first capture group, $1. To also remove the . from host.com, you need a second capturing group, like this:
$DomainAccount =~ s/^https?:\/\/(?:www\.)?([^\/.]+)\.([^\/.]+).*/$1$2/i;
Output for "https://www.My-Domain.com/": My-Domaincom
Note I added a case-insensitive flag /i just to make sure the pattern can handle HTTP:// casing, too.
The regex matches:
^ - start of string
https?:\/\/ - a literal character sequence http:// or https://
(?:www\.)? - one or zero occurrences of a literal character sequence www.
([^\/.]+) - Group 1: one or more characters other than / and .
\. - a literal dot
([^\/.]+) - Group 2: one or more characters other than / and .
.* - rest of the line
To address choroba's comment, here is a 2 step solution that will work with URLs containing more than one dot in the host part:
$DomainAccount =~ s/^https?:\/\/(?:www\.)?([^\/]+).*/$1/i;
$DomainAccount =~ s/\.//g;
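Wrapped in a small helper, the two-step version might look like the sketch below. The sub name domain_to_account and the second sample URL are made up for illustration; the lc call mirrors the question's own code.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the question).
sub domain_to_account {
    my ($url) = @_;
    my $account = lc $url;                              # the question lowercases the referer first
    $account =~ s{^https?://(?:www\.)?([^/]+).*}{$1}i;  # step 1: keep the host, minus scheme and leading www.
    $account =~ s/\.//g;                                # step 2: drop every remaining dot
    return $account;
}

print domain_to_account('https://www.My-Domain.com/'), "\n";          # my-domaincom
print domain_to_account('http://shop.example.co.uk/cart?id=1'), "\n"; # shopexamplecouk
```

Note that a port such as :8080 would survive step 1, because [^/]+ only stops at a slash; if the referer can carry a port, it would need an extra substitution.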
Upvotes: 1
Reputation: 241988
URI can help you, but you still have to remove the www yourself:
#! /usr/bin/perl
use warnings;
use strict;
use URI;
my $url = 'URI'->new('https://www.My-Domain.com/');
my $account = $url->host;
$account =~ s/^[^.]*\.// while $account =~ tr/.// > 1;  # strip leading labels until one dot is left (also terminates for dotless hosts)
$account =~ s/\.//;
print $account, "\n";
This only leaves the top and second level domains in the result (try with e.g. http://some.very.long.domain.name.com).
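To see that behaviour without installing URI, the trimming step can be sketched on a bare host string. This uses a different technique from the loop above (split and slice rather than repeated substitution), and the sub name last_two_labels is hypothetical; the lc call mirrors the question, not this answer.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the last two dot-separated labels of a host.
sub last_two_labels {
    my ($host) = @_;
    my @labels = split /\./, lc $host;
    @labels = @labels[-2, -1] if @labels > 2;   # drop subdomains such as www
    return join '', @labels;                    # join without dots
}

print last_two_labels('www.My-Domain.com'), "\n";               # my-domaincom
print last_two_labels('some.very.long.domain.name.com'), "\n";  # namecom
print last_two_labels('localhost'), "\n";                       # localhost
```

Keeping just two labels means a host like shop.example.co.uk collapses to couk, so this approach only suits domains with a single-label suffix.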
Upvotes: 1