Kachou

Reputation: 83

Combine checking for a valid URL and extracting file extension

I am checking whether a string is a valid URL, and then using a second regex to get the file extension from the URL.

This is the code I am using:

public string GetUrlFileName(string url) {
    string fileExtension = string.Empty;

    Regex regex = new Regex(@"(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*\.(?:jpg|gif|png|pdf))(?:\?([^#]*))?(?:#(.*))?");
    Match match = regex.Match(url.ToLower());

    if (match.Success) {
        Regex regexExtension = new Regex(@"([\w]{2,4})(\?|$)");
        Match GetExtension = regexExtension.Match(url);

        if (GetExtension.Success) {
            fileExtension = GetExtension.Value;
        }
    }
    return fileExtension;
}

However, I'd like to combine these two regular expressions into a single one.

Upvotes: 1

Views: 115

Answers (1)

Wiktor Stribiżew

Reputation: 627087

Use a capturing group instead of a non-capturing one:

Besides grouping part of a regular expression together, parentheses also create a numbered capturing group. It stores the part of the string matched by the part of the regular expression inside the parentheses.
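To illustrate the difference, here is a minimal sketch. The class and method names (GroupDemo, FirstGroup) are made up for this example; it only shows that a non-capturing group (?:...) leaves group 1 empty, while plain parentheses store the matched text.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Hypothetical helper: returns the text captured by group 1, or an empty
    // string when the pattern does not match or defines no group 1.
    public static string FirstGroup(string pattern, string input)
    {
        Match match = Regex.Match(input, pattern);
        return match.Success ? match.Groups[1].Value : string.Empty;
    }
}
```

With the input "photo.png", the pattern \.(?:jpg|png)$ yields an empty string from FirstGroup, whereas \.(jpg|png)$ yields "png".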

Thus, you can just remove the ?: from the group that matches the extension:

(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*\.(jpg|gif|png|pdf))(?:\?([^#]*))?(?:#(.*))?
                                          ^

Group 4 will contain your extension.

C#:

Regex regex = new Regex(@"(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*\.(jpg|gif|png|pdf))(?:\?([^#]*))?(?:#(.*))?");
Match match = regex.Match(url.ToLower());
if(match.Success) {
    string ext = match.Groups[4].Value;
 ...
}

See the RegexStorm demo (check the Table tab).
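Put together, the original method could look roughly like the sketch below. The names UrlHelper and GetUrlFileExtension are hypothetical, and the use of ToLowerInvariant (instead of ToLower) and the null/empty check are assumptions added here, not part of the question's code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlHelper
{
    // The single pattern from above: group 4 captures the file extension.
    private static readonly Regex UrlWithExtension = new Regex(
        @"(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*\.(jpg|gif|png|pdf))(?:\?([^#]*))?(?:#(.*))?");

    // Hypothetical helper: returns the lower-cased extension without the dot,
    // or an empty string when the URL does not match the pattern.
    public static string GetUrlFileExtension(string url)
    {
        if (string.IsNullOrEmpty(url))
        {
            return string.Empty;
        }

        // Assumption: lower-case with the invariant culture so the result
        // does not depend on the current locale.
        Match match = UrlWithExtension.Match(url.ToLowerInvariant());
        return match.Success ? match.Groups[4].Value : string.Empty;
    }
}
```

For example, UrlHelper.GetUrlFileExtension("http://example.com/a.PNG?x=1") returns "png", and a URL such as "http://example.com/index.html" returns an empty string because none of the listed extensions is present.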

Upvotes: 1
