Reputation: 3
Hey, I'm curious how to parse the host name out of a URL using regular expressions in C#.
I have the following regex:
Regex regexUrl = new Regex("://(?<host>([a-z\\d][-a-z\\d]*[a-z\\d]\\.)*[a-z][-a-z\\d]+[a-z])");
but it throws an error when the URL does not contain "http://", and it also does not parse out the "www." part of the URL.
So how would I write a function that parses "hostname.com" out of a URL, even if it does not contain "http://"? Thanks :)
Upvotes: 0
Views: 8449
Reputation: 2958
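If you insist on using a regex, this should do: ^([a-z]+://)?(?<host>[a-z\d][a-z\d-]*(\.[a-z\d][a-z\d-]*)*)(/|$)
The trick is to end the pattern with something that matches either a / or the end of the string ($). The $ has to sit in an alternation, (/|$), because inside a character class such as [/$] it is only a literal dollar sign. Pass RegexOptions.IgnoreCase if the URL may contain upper-case letters.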
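For illustration, here is a minimal sketch of how a pattern like this could be wrapped in C#. It assumes the pattern ends in the alternation (/|$) and is matched case-insensitively; the class name HostRegexSketch and the helper TryGetHost are hypothetical names, not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

static class HostRegexSketch
{
    // Optional scheme, then dot-separated labels, then either "/" or end of input.
    private static readonly Regex HostPattern = new Regex(
        @"^([a-z]+://)?(?<host>[a-z\d][a-z\d-]*(\.[a-z\d][a-z\d-]*)*)(/|$)",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the captured host, or null when the input does not match.
    public static string TryGetHost(string url)
    {
        Match m = HostPattern.Match(url);
        return m.Success ? m.Groups["host"].Value : null;
    }
}
```

Note that this sketch keeps any "www." prefix and returns null for input where the host is followed by a port, query, or fragment (for example "example.com:8080"), since only "/" or end of input is accepted after the host.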
Upvotes: 0
Reputation: 4434
[^\/\.\s]+\.[^\/\.\s]+\/
- the only problem is that it requires a / after the hostname
Upvotes: -1
Reputation: 21249
I wouldn't use regular expressions.
Upvotes: 4
Reputation: 69252
Why not do something like this instead?
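    Uri uri;
    // Try the string as given first.
    if (!Uri.TryCreate(s, UriKind.Absolute, out uri)) {
        // No usable scheme: retry with "http://" prepended.
        if (!Uri.TryCreate("http://" + s, UriKind.Absolute, out uri)) {
            throw new ArgumentException("Not a valid URL or host name.", "s");
        }
    }
    return uri.Host;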
It's more lines but it's probably cleaner than a regex and easier to read.
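Since the question also asks for "hostname.com" without the "www." part, here is a rough sketch of how the Uri approach might be extended. The names HostUriSketch and GetBareHost are hypothetical. Two choices here are assumptions rather than part of the answer above: it checks for "://" up front instead of calling TryCreate twice (input such as "example.com:8080" may otherwise be read as a URI whose scheme is "example.com"), and it returns null instead of throwing.

```csharp
using System;

static class HostUriSketch
{
    // Hypothetical helper: returns the host without a leading "www.",
    // or null when the input cannot be parsed as a URL.
    public static string GetBareHost(string input)
    {
        // Assumption: input without "://" is a bare host (plus optional path),
        // so a default scheme is prepended before parsing.
        string candidate = input.Contains("://") ? input : "http://" + input;

        Uri uri;
        if (!Uri.TryCreate(candidate, UriKind.Absolute, out uri))
        {
            return null;
        }

        string host = uri.Host;
        // Strip a leading "www." if present.
        return host.StartsWith("www.", StringComparison.OrdinalIgnoreCase)
            ? host.Substring(4)
            : host;
    }
}
```

Stripping only a literal "www." prefix is a simplification; other subdomains such as "blog.example.com" are returned unchanged.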
Upvotes: 3