HuseyinUslu

Reputation: 4134

C# regex - matching optionals after a named group

I'm sure this has been asked numerous times, but although I've checked all the similar questions, I couldn't come up with a solution.

The problem is that I have input URLs similar to:

  1. http://www.justin.tv/peacefuljay
  2. http://www.justin.tv/peacefuljay#/w/778713616/3
  3. http://de.justin.tv/peacefuljay#/w/778713616/3

I want to match the slug part of each URL (in the examples above, it's peacefuljay).

The regexes I've tried so far are:

 http://.*\.justin\.tv/(?<Slug>.*)(?:#.)?
 http://.*\.justin\.tv/(?<Slug>.*)(?:#.)

But I can't come up with a solution. Each attempt fails either on the first URL or on the others.

Help appreciated.

Upvotes: 1

Views: 241

Answers (3)

Kobi

Reputation: 138007

The easiest way of parsing a Uri is by using the Uri class:

string justin = "http://www.justin.tv/peacefuljay#/w/778713616/3";
Uri uri = new Uri(justin);
string s1 = uri.LocalPath; // "/peacefuljay"
string s2 = uri.Segments[1]; // "peacefuljay"

If you insist on a regex, you can try something a bit more specific:

Match match = Regex.Match(justin, @"http://(\w+\.)*justin\.tv(?:/(?<Slug>[^#]*))?");
  • (\w+\.)* - Ensures you match the domain, and not somewhere else in the string (e.g., in the hash or query string).
  • (?:/(?<Slug>[^#]*))? - Optional group with the string you need. [^#] limits the characters you expect to see in your slug, so it should eliminate the need for the extra group after it.
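To see how that pattern behaves on the three URLs from the question, here is a minimal sketch. The pattern is the one above; the class name SlugSketch and the helper TryGetSlug are made up for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlugSketch
{
    // Pattern copied from the answer; everything else here is illustrative.
    private static readonly Regex SlugPattern =
        new Regex(@"http://(\w+\.)*justin\.tv(?:/(?<Slug>[^#]*))?");

    // Hypothetical helper: true when the URL matches and the Slug group is non-empty.
    public static bool TryGetSlug(string url, out string slug)
    {
        Match match = SlugPattern.Match(url);
        Group group = match.Groups["Slug"];
        slug = group.Success ? group.Value : string.Empty;
        return match.Success && slug.Length > 0;
    }

    public static void Main()
    {
        string[] urls =
        {
            "http://www.justin.tv/peacefuljay",
            "http://www.justin.tv/peacefuljay#/w/778713616/3",
            "http://de.justin.tv/peacefuljay#/w/778713616/3"
        };

        foreach (string url in urls)
        {
            string slug;
            bool found = TryGetSlug(url, out slug);
            Console.WriteLine(url + " -> " + (found ? slug : "(no slug)"));
        }
    }
}
```

Because the whole path group is optional, a URL with no path still matches, just with an unsuccessful Slug group. That is why the helper checks the group as well as the match.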

Upvotes: 3

yellowblood

Reputation: 1631

As I see it, there's no reason to deal with the parts after the "slug" at all.

Therefore you only need to match all characters after the host that aren't "/" or "#".

http://.*\.justin\.tv/(?<Slug>[^/#]+)
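A rough sketch of that pattern in use, in case it helps. The class name FirstSegmentSketch, the helper TryGetSlug, and the extra "/videos" URL are assumptions for illustration and do not come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FirstSegmentSketch
{
    // Pattern copied from the answer: the slug is everything up to the first '/' or '#'.
    private static readonly Regex SlugPattern =
        new Regex(@"http://.*\.justin\.tv/(?<Slug>[^/#]+)");

    // Hypothetical helper: true and the captured slug when the URL matches.
    public static bool TryGetSlug(string url, out string slug)
    {
        Match match = SlugPattern.Match(url);
        slug = match.Success ? match.Groups["Slug"].Value : string.Empty;
        return match.Success;
    }

    public static void Main()
    {
        string[] urls =
        {
            "http://www.justin.tv/peacefuljay",
            "http://de.justin.tv/peacefuljay#/w/778713616/3",
            "http://www.justin.tv/peacefuljay/videos" // hypothetical URL with a second path segment
        };

        foreach (string url in urls)
        {
            string slug;
            bool found = TryGetSlug(url, out slug);
            Console.WriteLine(url + " -> " + (found ? slug : "(no match)"));
        }
    }
}
```

One thing to keep in mind: the `.*\.` before `justin` requires a literal dot in front of the domain, so a URL without a subdomain, such as http://justin.tv/peacefuljay, would not match this pattern.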

Upvotes: 2

sipsorcery

Reputation: 30699

http://.*\.justin\.tv/(?<Slug>[^#]*)#?

or

http://.*\.justin\.tv/(?<Slug>.*?)(#|$)
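Whether the Slug group is greedy or lazy matters a lot in front of (#|$). The sketch below compares the two on a URL from the question; the class name GreedyVsLazySketch and the helper CaptureSlug are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazySketch
{
    // Hypothetical helper: value of the Slug group, or an empty string when nothing matches.
    public static string CaptureSlug(string pattern, string url)
    {
        Match match = Regex.Match(url, pattern);
        return match.Success ? match.Groups["Slug"].Value : string.Empty;
    }

    public static void Main()
    {
        string url = "http://www.justin.tv/peacefuljay#/w/778713616/3";

        // Greedy: .* runs to the end of the input, where $ is satisfied,
        // so the fragment ends up inside the slug.
        string greedy = CaptureSlug(@"http://.*\.justin\.tv/(?<Slug>.*)(#|$)", url);

        // Lazy: .*? grows one character at a time and stops at the first '#'
        // (or at the end of the input when there is no fragment).
        string lazy = CaptureSlug(@"http://.*\.justin\.tv/(?<Slug>.*?)(#|$)", url);

        Console.WriteLine("greedy: " + greedy);
        Console.WriteLine("lazy:   " + lazy);
    }
}
```

An optional trailing group on its own cannot stop a greedy `.*`, since the engine is happy to match the optional part as empty. Either make the quantifier lazy and anchor what follows, or exclude '#' from the slug with a character class.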

Upvotes: 0
