Hank Mooody

Reputation: 527

Method that extracts a list of URLs that match the pattern

I'm trying to write a method that uses Regex to extract a list of URLs matching the pattern @"http://(www\.)?([^\.]+)\.com". So far I have this:

public static List<string> Test(string url)
{
    const string pattern = @"http://(www\.)?([^\.]+)\.com";
    List<string> res = new List<string>();

    MatchCollection myMatches = Regex.Matches(url, pattern);

    foreach (Match currentMatch in myMatches)
    {
        // which call goes here?
    }

    return res;
}

The main issue is: which code should I use in the foreach loop?

    res.Add(currentMatch.Groups.ToString());

or

    res.Add(currentMatch.Value);

Thanks!

Upvotes: 1

Views: 909

Answers (2)

Jeroen van Langen

Reputation: 22038

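res.Add(currentMatch.Groups.ToString()); will add the literal string System.Text.RegularExpressions.GroupCollection, because GroupCollection does not override ToString(). A quick test would have shown that this is not what you want.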
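To make the difference concrete, here is a minimal sketch that compares the candidates on one match. The class name GroupsVersusValue, the Inspect helper and the sample input are made up for illustration; only the pattern comes from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupsVersusValue
{
    // Hypothetical helper: returns what each candidate expression produces
    // for the first match of the question's pattern in the input.
    public static string[] Inspect(string input)
    {
        const string pattern = @"http://(www\.)?([^\.]+)\.com";
        Match m = Regex.Match(input, pattern);

        return new[]
        {
            m.Groups.ToString(), // the type name, not the matched text
            m.Value,             // the whole matched URL (same as m.Groups[0].Value)
            m.Groups[2].Value    // only the second capture group, the name before ".com"
        };
    }

    public static void Main()
    {
        // Sample input is an assumption for this demo.
        foreach (string s in Inspect("see http://www.example.com for details"))
        {
            Console.WriteLine(s);
        }
        // Prints:
        // System.Text.RegularExpressions.GroupCollection
        // http://www.example.com
        // example
    }
}
```

So Value is the one to add if you want the full URL, and Groups[2].Value is available if you only want the part captured by ([^\.]+).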


How many matches do you expect from the parameter url?

I would use this:

static readonly Regex _domainMatcher = new Regex(@"http://(www\.)?([^\.]+)\.com", RegexOptions.Compiled);

public static bool IsValidDomain(string url)
{
    // IsMatch gives the same result as Match(url).Success.
    return _domainMatcher.IsMatch(url);
}

or

public static string ExtractDomain(string url)
{ 
    var match = _domainMatcher.Match(url);
    if(match.Success)
        return match.Value;
    else
        return string.Empty;
}

Because the parameter is called url, I assume it holds a single URL.


If the input can contain several URLs and you want to extract every one that matches the pattern:

public static IEnumerable<string> ExtractDomains(string data)
{
    var result = new List<string>();

    var match = _domainMatcher.Match(data);

    while (match.Success)
    {
        result.Add(match.Value);

        match = match.NextMatch();
    }

    return result;
}

Note the IEnumerable<string> return type instead of List<string>: the caller has no need to modify the result.


Or this lazy variant, which only searches for the next match when the caller asks for the next item:

public static IEnumerable<string> ExtractDomains(string data)
{
    var match = _domainMatcher.Match(data);

    while (match.Success)
    {
        yield return match.Value;
        match = match.NextMatch();
    }
}
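A short usage sketch of the lazy variant, to show what the deferred evaluation buys you. The class name LazyExtraction, the field name DomainMatcher and the sample text are assumptions for this demo; ExtractDomains has the same body as above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LazyExtraction
{
    private static readonly Regex DomainMatcher =
        new Regex(@"http://(www\.)?([^\.]+)\.com", RegexOptions.Compiled);

    // Same logic as the lazy variant: one match is produced per iteration step.
    public static IEnumerable<string> ExtractDomains(string data)
    {
        var match = DomainMatcher.Match(data);

        while (match.Success)
        {
            yield return match.Value;
            match = match.NextMatch();
        }
    }

    public static void Main()
    {
        // Sample text is made up for illustration.
        const string text = "http://www.foo.com and http://bar.com and http://www.baz.com";

        // First() stops enumerating after the first item,
        // so NextMatch() is never called here.
        Console.WriteLine(ExtractDomains(text).First()); // http://www.foo.com

        // A foreach walks through all three matches in order.
        foreach (string url in ExtractDomains(text))
        {
            Console.WriteLine(url);
        }
    }
}
```

If the caller needs to go over the results more than once, calling ToList() on the result avoids running the regex again for each pass.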

Upvotes: 1

Wiktor Stribiżew

Reputation: 626691

You just need the Value of each Match. In your code, you should use

res.Add(currentMatch.Value);

Or just use LINQ (this needs using System.Linq;):

res = Regex.Matches(url, pattern).Cast<Match>()
           .Select(p => p.Value)
           .ToList();
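Put together, the whole method from the question could look roughly like this. It is a sketch rather than the only way to write it: the class name UrlExtractor and the parameter name input are my own choices, and the sample string in Main is invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class UrlExtractor
{
    // Returns every substring of the input that matches the question's pattern.
    public static List<string> Test(string input)
    {
        const string pattern = @"http://(www\.)?([^\.]+)\.com";

        return Regex.Matches(input, pattern)
                    .Cast<Match>()        // MatchCollection is non-generic on older frameworks
                    .Select(m => m.Value) // the full text of each match
                    .ToList();
    }

    public static void Main()
    {
        // Sample input is an assumption for this demo.
        List<string> urls = Test("http://www.foo.com, http://bar.com, ftp://skip.com");

        Console.WriteLine(string.Join(" | ", urls)); // http://www.foo.com | http://bar.com
    }
}
```

The Cast<Match>() call is what lets Select work on the non-generic MatchCollection; an empty input simply gives an empty list.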

Upvotes: 1
