Reputation: 2408
I have a list of domain names, e.g. developer.mozilla.org. I need to extract only the domain name, e.g. mozilla.org. I tried a RegExp but have not got it right so far, and I am not sure what I am missing.
I wrote this JavaScript, which does not capture exactly the part I want:
var arr = ["developer.mozilla.org", "cdn.mdn.mozilla.net", "www.google-analytics.com", "www.youtube.com"];
var arrLength = arr.length;
var reg = new RegExp('((\\.[a-zA-Z0-9]+)(\\.[a-zA-Z0-9]+))$');
for (i=0; i< arrLength; i++)
{
console.log(arr[i].match(reg))
}
Upvotes: 0
Views: 153
Reputation: 7388
You don't need a regex for this simple task.
var arr = ["developer.mozilla.org", "cdn.mdn.mozilla.net", "www.google-analytics.com", "www.youtube.com"];
var arrLength = arr.length;
for (var i = 0; i < arrLength; i++)
{
var parts = arr[i].split('.');
var domain = parts.slice(-2).join('.');
console.log(domain);
}
Or a shorter version that chains the calls:
for (var i = 0; i < arr.length; i++)
{
var domainName = arr[i].split('.').slice(-2).join('.');
console.log(domainName);
}
slice(-2) extracts the last two elements of an array.
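As a minimal sketch, the same split/slice idea can be wrapped in a small helper and applied with map. The name lastTwoLabels is hypothetical, not something from the posts above, and the approach assumes the domain is always the last two labels, which does not hold for suffixes such as co.uk.

```javascript
// Hypothetical helper: keep only the last two dot-separated labels.
function lastTwoLabels(host) {
  return host.split('.').slice(-2).join('.');
}

var hosts = ["developer.mozilla.org", "cdn.mdn.mozilla.net",
             "www.google-analytics.com", "www.youtube.com"];
var domains = hosts.map(lastTwoLabels);
console.log(domains);

// Assumption check: a two-part suffix is cut too short ("co.uk").
console.log(lastTwoLabels("www.bbc.co.uk"));
```

If hosts with two-part suffixes can appear in the list, a public suffix list would be needed rather than a fixed label count.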
Upvotes: 0
Reputation: 975
\w matches letters, digits and the underscore; the hyphen is picked up by the explicit - in the character class. Call substring(1) on the first element of the match so you don't print the leading dot. :)
let arr = ["developer.mozilla.org", "cdn.mdn.mozilla.net",
"www.google-analytics.com", "www.youtube.com"];
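// No "." inside the class and a "$" anchor, so only the last two labels match
// (otherwise "cdn.mdn.mozilla.net" would yield "mdn.mozilla.net").
let regex = /\.[\w-]+\.[a-zA-Z0-9]+$/;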
arr.forEach(e => console.log(e.match(regex)[0].substring(1)));
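A quick check of what \w covers may help here. This is only an illustrative sketch; the variable names are made up for the example.

```javascript
// \w is shorthand for [A-Za-z0-9_]: the underscore matches, the hyphen does not.
var underscoreMatches = /^\w$/.test('_');
var hyphenMatches = /^\w$/.test('-');

// Listing "-" in a character class is what lets hyphenated labels match.
var hyphenInClass = /^[\w-]+$/.test('google-analytics');

console.log(underscoreMatches, hyphenMatches, hyphenInClass);
```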
Upvotes: 0
Reputation: 1141
It works if you write your code like this:
var arr = ["developer.mozilla.org", "cdn.mdn.mozilla.net", "www.google-analytics.com", "www.youtube.com"];
var arrLength = arr.length;
var reg = /[^.]+\.[^.]+$/;
for (var i = 0; i < arrLength; i++)
{
console.log(arr[i].match(reg)[0]);
}
Some explanations:
First of all, there is a flaw in your regex that causes the 'google-analytics' entry to be missed: the character class [a-zA-Z0-9] does not include the hyphen. I would suggest writing your regex like this instead:
var reg = /[^.]+\.[^.]+$/
The regex you wrote has three capturing groups (an outer one and two nested inside it), which explains the arrays you are getting from your console.log:
['.mozilla.org', '.mozilla.org', '.mozilla', '.org'] = [matching string, capturedGroup1, capturedGroup2, capturedGroup3]
You could make your groups non-capturing by writing your regex like so:
var reg = new RegExp('(?:(?:\\.[a-zA-Z0-9]+)(?:\\.[a-zA-Z0-9]+))$');
or by using a regex literal, as @Bergi suggests:
var reg = /(?:(?:\.[a-zA-Z0-9]+)(?:\.[a-zA-Z0-9]+))$/;
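To make the difference visible, here is a small sketch that compares the arrays returned by match. It uses a simplified two-group version of the pattern for illustration; the variable names are assumptions for the example only.

```javascript
var host = "developer.mozilla.org";

// Two capturing groups: the result holds the full match plus one entry per group.
var capturing = host.match(/(\.[a-zA-Z0-9]+)(\.[a-zA-Z0-9]+)$/);

// Non-capturing groups: the result holds the full match only.
var nonCapturing = host.match(/(?:\.[a-zA-Z0-9]+)(?:\.[a-zA-Z0-9]+)$/);

// Array.from drops the extra index/input properties so only the entries print.
console.log(Array.from(capturing));
console.log(Array.from(nonCapturing));
```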
In any case, the match method returns an array, and what you are really interested in is the matched string, which is the first element of that array. You get the expected result by rewriting the body of the loop like this:
console.log((arr[i].match(reg) || [])[0]); // the || [] guards against match returning null
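The null case is not hypothetical: with the alphanumeric-only pattern, the hyphenated host does not match at all. The sketch below shows that, together with a guard wrapped in a helper. The names strictReg and firstMatch are made up for this example.

```javascript
var strictReg = /(?:\.[a-zA-Z0-9]+)(?:\.[a-zA-Z0-9]+)$/;

// The hyphen is not in [a-zA-Z0-9], so match returns null for this host.
var missed = "www.google-analytics.com".match(strictReg);
console.log(missed);

// Hypothetical helper: the matched text, or undefined when nothing matches.
function firstMatch(str, re) {
  return (str.match(re) || [])[0];
}

console.log(firstMatch("www.google-analytics.com", strictReg));
console.log(firstMatch("www.google-analytics.com", /[^.]+\.[^.]+$/));
```

Without the guard, indexing [0] on the null result would throw a TypeError.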
If you really dislike the array, you could use the string replace method instead:
console.log(arr[i].replace(/^.*\.([^.]+\.[^.]+)$/, '$1'))
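One property of the replace approach worth seeing in a sketch: when the pattern does not match, replace returns the input unchanged, so no null check is needed. The variable names here are illustrative only.

```javascript
var tail = /^.*\.([^.]+\.[^.]+)$/;

// Three or more labels: everything before the last two is dropped.
var stripped = "cdn.mdn.mozilla.net".replace(tail, '$1');

// Only two labels: the pattern does not match and the string comes back as is.
var untouched = "mozilla.org".replace(tail, '$1');

console.log(stripped, untouched);
```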
Upvotes: 1