JavaScript regex or string method to match only the subdomain and domain (minus top levels)

Question

In a browser, I want to find the subdomain and domain name of the page I am on, minus top levels like 'com' and 'co.uk'.

Also, if the subdomain is 'www' I don't want a match on that.

Examples:

https://www.voice-1.mozilla.co.uk/folder/index.html
https://www.voice-1.mozilla.org.uk/folder/index.html
http://www.voice-1.mozilla.com/folder/index.html
http://www.voice-1.mozilla.com:8080/folder/index.html

should all produce the matches voice-1 and mozilla.

It would be nice not to have to maintain a list of top-level domains, but maintaining different variations of www is okay.

So far my regex skips com and co.uk, but it does not skip www or org.uk, and it still matches anything that comes before a . in the file path: regex-test

The regex is now:

/[\w\-]{3,}(?=[.])/g

How can I achieve this?

Edit: Having a step after the regex that trims away an unwanted www, the co in co.uk and the org in org.uk is okay. But I still need to remove the top level and anything before a . in the file path. Basically, I want to grab everything between // and the first /, except the top-level domain.

Egan Wolf · Accepted Answer

I managed to get this far. It gets rid of www and index:

\.([\w\-]{3,})(?=[\.])
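As a rough check of what that pattern captures, here is a small sketch that runs it over the example URLs from the question. The helper name matchLabels is made up for illustration; the pattern itself is the one above, with the g flag added so that every match is returned.

```javascript
// The pattern from the answer: a dot, then a label of 3+ word characters or hyphens,
// which must itself be followed by another dot.
const pattern = /\.([\w\-]{3,})(?=[\.])/g;

// Hypothetical helper: collect capture group 1 of every match.
function matchLabels(url) {
  return Array.from(url.matchAll(pattern), (m) => m[1]);
}

console.log(matchLabels('https://www.voice-1.mozilla.co.uk/folder/index.html'));
// [ 'voice-1', 'mozilla' ]
console.log(matchLabels('http://www.voice-1.mozilla.com:8080/folder/index.html'));
// [ 'voice-1', 'mozilla' ]
console.log(matchLabels('https://www.voice-1.mozilla.org.uk/folder/index.html'));
// [ 'voice-1', 'mozilla', 'org' ]
```

The org.uk case still returns org, so it would need the trimming step the question already allows. A path containing several dots (for example a file named like a.min.js) could also produce extra matches, so running the pattern on the host name alone is probably safer.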

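If string methods are allowed, you can try something like this:

const str = 'https://www.voice-1.mozilla.co.uk/folder/index.html'
const arr = str.split('/')        // ['https:', '', 'www.voice-1.mozilla.co.uk', 'folder', 'index.html']
const result = arr[2].split('.')  // ['www', 'voice-1', 'mozilla', 'co', 'uk']

You will get each part of the host name as a separate element of result. You then need to check the first element (is it www or not) and the last two elements (check their length and content). Note that a port, if present, stays attached to the last element. I don't think there is a general pattern you can use here.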
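The checks described above could be sketched roughly as follows. The function name getDomainLabels and the small secondLevel list are assumptions made for illustration; the list covers only the co and org cases from the question and would need extending for other suffixes.

```javascript
// Hypothetical helper; the name and the secondLevel list are illustrative assumptions.
function getDomainLabels(url) {
  // The host is the third piece of "scheme://host/path"; drop any ":port" suffix.
  const host = url.split('/')[2].split(':')[0];
  const labels = host.split('.');

  // Skip a leading "www".
  if (labels[0] === 'www') labels.shift();

  // Drop the last label (com, uk, ...).
  labels.pop();

  // Assumed heuristic: also drop a trailing second-level label such as "co" or "org",
  // but only if something is left over afterwards.
  const secondLevel = ['co', 'org'];
  if (labels.length > 1 && secondLevel.includes(labels[labels.length - 1])) {
    labels.pop();
  }
  return labels;
}

console.log(getDomainLabels('https://www.voice-1.mozilla.org.uk/folder/index.html'));
// [ 'voice-1', 'mozilla' ]
console.log(getDomainLabels('http://www.voice-1.mozilla.com:8080/folder/index.html'));
// [ 'voice-1', 'mozilla' ]
```

In a browser, window.location.hostname already gives the host without the port, so the first two splits would not be needed there.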
