Alholi Hamed

Reputation: 37

Error scraping link from site using a regex

I am trying to get matches from some text using regex, but the code fails to yield any results.

The text contains

action="https://www.localhost.com/en/account?dwcont=C338711466"

My code is

HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create("https://www.localhost.com/en/account");
httpWebRequest.Method = "GET";
httpWebRequest.CookieContainer = this.cookieJar;
string text2;
using (StreamReader streamReader = new StreamReader(httpWebRequest.GetResponse().GetResponseStream()))
{
   string text = streamReader.ReadToEnd().Trim().ToString();
   string[] array = (from Match match in Regex.Matches(text, "\"https://www.localhost.com/en/account?dwcont=(.+?)\"")
                     select match.Groups[1].Value).ToArray<string>();
   text2 = array[0];
}

MessageBox.Show(text2);

I get this error at array[0]:

System.IndexOutOfRangeException: 'Index was outside the bounds of the array.'

Is there a solution for it?

Upvotes: 1

Views: 151

Answers (1)

Wiktor Stribiżew

Reputation: 627302

You can get the captured values using

var array = Regex.Matches(text, "\"https://www\\.localhost\\.com/en/account\\?dwcont=([^\"]+)")
    .Cast<Match>()
    .Select(x => x.Groups[1].Value);

Then, get the first match using

text2 = array.FirstOrDefault();

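Note that you need to escape the literal . and ? characters in the regex pattern. Unescaped, . matches any character, and ? makes the preceding t optional, so the pattern never matches the literal ? in the URL. Since you are using a regular string literal, use double backslashes to create the regex escapes.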
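To see the effect of the unescaped ?, here is a minimal sketch that runs both variants against the sample text from the question. The class name EscapeDemo and the shortened patterns are only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class EscapeDemo
{
    static void Main()
    {
        // Sample text from the question.
        string text = "action=\"https://www.localhost.com/en/account?dwcont=C338711466\"";

        // Unescaped: "t?" means an optional 't', so the literal '?' in the text is never matched.
        Console.WriteLine(Regex.IsMatch(text, "account?dwcont="));   // False

        // Escaped: "\\?" in a regular string literal becomes the regex escape \?.
        Console.WriteLine(Regex.IsMatch(text, "account\\?dwcont=")); // True
    }
}
```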

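You got the "Index was outside the bounds of the array" error because your regex failed to extract any match, so the array was empty and array[0] had no element to return. FirstOrDefault() returns null in that case instead of throwing, so check text2 for null before using it.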
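As a self-contained sketch of the same approach, the pattern can also be written as a verbatim string literal (@"..."), where backslashes are single and a literal quote is written as "". The hard-coded sample text stands in for the downloaded page, and the names Program and token are assumptions made for this example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Sample text from the question, standing in for the HTTP response body.
        string text = "action=\"https://www.localhost.com/en/account?dwcont=C338711466\"";

        // Verbatim string: \. and \? are regex escapes, "" is a literal double quote.
        string pattern = @"""https://www\.localhost\.com/en/account\?dwcont=([^""]+)";

        // FirstOrDefault() yields null when nothing matched, instead of throwing.
        string token = Regex.Matches(text, pattern)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .FirstOrDefault();

        Console.WriteLine(token ?? "no match"); // prints C338711466
    }
}
```

If only the first occurrence is ever needed, Regex.Match with a check on its Success property may be a simpler alternative to enumerating all matches.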

Upvotes: 1
