Jamie

Reputation: 3

Regular Expression to parse hostname from URL in C#?

Hey, I'm curious how to parse the host name out of a URL using regular expressions in C#.

I have the following regex:

Regex regexUrl = new Regex("://(?<host>([a-z\\d][-a-z\\d]*[a-z\\d]\\.)*[a-z][-a-z\\d]+[a-z])");

but it throws an error when the URL does not contain "http://", and it also does not parse out the "www." part of the URL.

So how would I write a function that parses "hostname.com" out of a URL, even if it does not contain "http://"? Thanks :)

Upvotes: 0

Views: 8449

Answers (4)

Serguei

Reputation: 2958

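If you insist on using a regex, this should do: ^([a-z]+://)?(?<host>[a-z\d][a-z\d-]*(\.[a-z\d][a-z\d-]*)*)(/|$)

The trick is to end the pattern with a group that matches either a / or the end of the input ($). This has to be the alternation (/|$) rather than the character class [/$], because inside a character class $ is just a literal dollar sign.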
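A minimal sketch of how a pattern along these lines could be used from C#, assuming case-insensitive matching is wanted. The class and method names (RegexHostParser, GetHost) are made up for illustration, and the terminator is written as the alternation (/|$).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class RegexHostParser
{
    // Optional scheme, then the host in the named group "host",
    // then either a slash or the end of the input.
    private static readonly Regex HostPattern = new Regex(
        @"^([a-z]+://)?(?<host>[a-z\d][a-z\d-]*(\.[a-z\d][a-z\d-]*)*)(/|$)",
        RegexOptions.IgnoreCase);

    // Returns the captured host, or null when the input does not match.
    public static string GetHost(string url)
    {
        Match match = HostPattern.Match(url);
        return match.Success ? match.Groups["host"].Value : null;
    }
}
```

With this sketch, RegexHostParser.GetHost("http://www.example.com/path") and RegexHostParser.GetHost("example.com") both return the host, with or without the scheme. An input with a port, such as "example.com:8080/x", does not match at all, because the pattern expects a / or the end of the input right after the host.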

Upvotes: 0

www0z0k

Reputation: 4434

[^\/\.\s]+\.[^\/\.\s]+\/ - the only problem is that it requires a / after the hostname

Upvotes: -1

Ben

Reputation: 21249

I wouldn't use regular expressions.

  1. Convert 'http://' to '' (empty string) in your string - that basically removes http:// if it's there
  2. Split the string on / as an array
  3. The hostname is the element at index 0
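The three steps above can be sketched roughly as follows. The class and method names (SplitHostParser, GetHost) are hypothetical, and one detail is an assumption on my part: the sketch strips "http://" only when it is a leading prefix, rather than replacing every occurrence in the string.

```csharp
using System;

// Hypothetical helper; the names are illustrative only.
public static class SplitHostParser
{
    public static string GetHost(string url)
    {
        // Step 1: drop a leading "http://" if it is there
        // (assumption: only a prefix is removed, not every occurrence).
        const string prefix = "http://";
        if (url.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
        {
            url = url.Substring(prefix.Length);
        }

        // Step 2: split what is left on '/'.
        string[] parts = url.Split('/');

        // Step 3: the hostname is the element at index 0.
        return parts[0];
    }
}
```

SplitHostParser.GetHost("http://www.example.com/a/b") returns "www.example.com", and a bare "example.com" comes back unchanged. Only "http://" is handled, so "https://example.com/" yields "https:"; a port, if present, also stays attached to the host.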

Upvotes: 4

Josh

Reputation: 69252

Why not do something like this instead?

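Uri uri;
// Try the input as-is first.
if (!Uri.TryCreate(s, UriKind.Absolute, out uri)) {
    // Retry on the assumption that the "http://" scheme was left off.
    if (!Uri.TryCreate("http://" + s, UriKind.Absolute, out uri)) {
        throw new ArgumentException("Not a valid URL: " + s);
    }
}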

return uri.Host;

It's more lines but it's probably cleaner than a regex and easier to read.

Upvotes: 3
