roland

Reputation: 7775

Extract portions of these URLs with Regex and C#

I have to check whether these two URLs match a pattern (or two patterns, to be more accurate). If so, I'd like to extract some portions of the data.

1) /selector/en/any-string-chain-you-want.aspx?FamiId=32

Then I need to extract "en" and "32" into variables. To me, the regular expression should look roughly like /selector/{0}/any-string-chain-you-want.aspx?FamiId={1}

2) /selector/en/F/32/any-string-chain-you-want.html

where "en" and "32" must be assigned to variables. So: /selector/{0}/F/{1}/any-string-chain-you-want.html

{0}: two-letter language code such as en, fr, nl, es, ...
{1}: family ID, an integer of 2 or 3 digits such as 12, 38 or 124, but not 1212

Any idea on how to achieve it?

Thanks in advance,

Roland

Upvotes: 0

Views: 195

Answers (7)

Amir Ismail

Reputation: 3883

You can use something like this:

// Named groups use the (?<name>...) syntax, and the literal ? before FamiId must be escaped.
String urlSchema1 = @"/selector/(?<lang>\w\w)/.+\.aspx\?FamiId=(?<FamId>\d+)";

Match mExprStatic = Regex.Match(inputString, urlSchema1, RegexOptions.IgnoreCase | RegexOptions.Singleline);
if (mExprStatic.Success)
{
    String language = mExprStatic.Groups["lang"].Value;
    String FamId = mExprStatic.Groups["FamId"].Value;
}

// Second URL form; named groups use the (?<name>...) syntax.
String urlSchema2 = @"/selector/(?<lang>\w\w)/F/(?<FamId>\d+)/.+\.html";

Match mExprStatic2 = Regex.Match(inputString, urlSchema2, RegexOptions.IgnoreCase | RegexOptions.Singleline);
if (mExprStatic2.Success)
{
    String language = mExprStatic2.Groups["lang"].Value;
    String FamId = mExprStatic2.Groups["FamId"].Value;
}

Upvotes: 1

IndigoDelta

Reputation: 1481

It's useful to learn a bit about regular expressions for cases like this. RegExr is a free online regex-building tool. However, the most useful one I have found is Expresso.

Upvotes: 1

Ruel

Reputation: 15780

string str = @"/selector/en/any-string-chain-you-want.aspx?FamiId=32";
Match m = Regex.Match(str, @"/selector/(\w{2})/.+\.aspx\?FamiId=(\d{2,3})");
string result = String.Format(@"/selector/{0}/F/{1}/any-string-chain-you-want.html", m.Groups[1].Value, m.Groups[2].Value);

There you go.

Upvotes: 1

Aliostad

Reputation: 81700

Case 1

private const string Regex1 = @"/selector/(\w\w)/.+\.aspx\?FamiId=(\d+)";

Case 2

private const string Regex2 = @"/selector/(\w\w)/F/(\d+)/.+\.html";

Usage

Match m = Regex.Match(myString, Regex2);
if (m.Success) // the groups are empty when the pattern does not match
{
    string lang = m.Groups[1].Value;
    string numericValue = m.Groups[2].Value;
}

Upvotes: 1

hsz

Reputation: 152304

You can try with:

^/.*?/(\w{2})/(?:F/|.*?FamiId=)(\d{2,3}).*$

It works for both URLs.
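
As a rough sketch of how the combined pattern might be used from C#: the class name SelectorUrl and the method TryParse below are hypothetical, and the (?!\d) lookahead is an added assumption (not part of the pattern above) so that a four-digit id such as 1212 is rejected rather than matched as 121.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SelectorUrl
{
    // Combined pattern from this answer. The (?!\d) lookahead is an assumption
    // added here so that an id with more than three digits does not match.
    private static readonly Regex Pattern = new Regex(
        @"^/.*?/(\w{2})/(?:F/|.*?FamiId=)(\d{2,3})(?!\d).*$");

    // Returns true and fills lang/familyId when the URL has one of the two shapes.
    public static bool TryParse(string url, out string lang, out int familyId)
    {
        lang = string.Empty;
        familyId = 0;

        Match m = Pattern.Match(url);
        if (!m.Success)
        {
            return false;
        }

        lang = m.Groups[1].Value;
        // Group 2 holds two or three digits; TryParse guards against odd input.
        return int.TryParse(m.Groups[2].Value, out familyId);
    }

    public static void Main()
    {
        string[] urls =
        {
            "/selector/en/any-string-chain-you-want.aspx?FamiId=32",
            "/selector/en/F/32/any-string-chain-you-want.html",
            "/selector/en/F/1212/any-string-chain-you-want.html",
        };

        foreach (string url in urls)
        {
            if (TryParse(url, out string lang, out int id))
            {
                Console.WriteLine("lang: {0}, id: {1}", lang, id);
            }
            else
            {
                Console.WriteLine("no match: {0}", url);
            }
        }
    }
}
```

Note that the leading /.*?/ accepts any first path segment, not only "selector"; replacing it with a literal /selector/ would be stricter.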

Upvotes: 1

Episodex

Reputation: 4559

This is the regex:

/selector/([a-z]{2})/.+?\.aspx\?FamiId=([0-9]+)

Code:

var regex = new Regex(@"/selector/([a-z]{2})/.+?\.aspx\?FamiId=([0-9]+)");
var test = "/selector/en/any-string-chain-you-want.aspx?FamiId=32";

foreach (Match match in regex.Matches(test))
{
    var lang = match.Groups[1].Value;
    var id = Convert.ToInt32(match.Groups[2].Value);

    Console.WriteLine("lang: {0}, id: {1}", lang, id);
}

Regex for the second case: /selector/([a-z]{2})/F/([0-9]+)/.+?\.html (the code doesn't change)

Upvotes: 1

mdm

Reputation: 12630

You should have a look at this tutorial on Regular Expressions.

You could use the following expressions:

\/selector\/([a-z]{2})\/.*\.aspx\?FamiId=([0-9]{2,3})

and

\/selector\/([a-z]{2})\/F\/([0-9]{2,3})\/.*\.html
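
A minimal sketch of how these two expressions could be tried in turn from C#. The names FamilyUrlMatcher and TryExtract are hypothetical, and the (?![0-9]) lookahead on the first expression is an added assumption so that FamiId=1212 is not accepted as 121; the second expression already rejects it because the id must be followed by a slash.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FamilyUrlMatcher
{
    // The two expressions from this answer. The (?![0-9]) lookahead on the
    // first one is an assumption added for the "not 1212" requirement.
    private static readonly Regex[] Patterns =
    {
        new Regex(@"\/selector\/([a-z]{2})\/.*\.aspx\?FamiId=([0-9]{2,3})(?![0-9])"),
        new Regex(@"\/selector\/([a-z]{2})\/F\/([0-9]{2,3})\/.*\.html"),
    };

    // Tries each pattern in order and reports the first one that matches.
    public static bool TryExtract(string url, out string lang, out string familyId)
    {
        foreach (Regex pattern in Patterns)
        {
            Match m = pattern.Match(url);
            if (m.Success)
            {
                lang = m.Groups[1].Value;
                familyId = m.Groups[2].Value;
                return true;
            }
        }

        lang = string.Empty;
        familyId = string.Empty;
        return false;
    }

    public static void Main()
    {
        string[] urls =
        {
            "/selector/nl/any-string-chain-you-want.aspx?FamiId=124",
            "/selector/es/F/38/any-string-chain-you-want.html",
        };

        foreach (string url in urls)
        {
            if (TryExtract(url, out string lang, out string id))
            {
                Console.WriteLine("lang: {0}, id: {1}", lang, id);
            }
        }
    }
}
```

The escaped slashes (\/) are harmless in .NET but not required, since / has no special meaning in a .NET regular expression.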

Upvotes: 1
