user1112324

Reputation: 643

C# Regex returning multiple lines of text

I have the following function:

public static string ReturnEmailAddresses(string input)
    {

        string regex1 = @"\[url=";
        string regex2 = @"mailto:([^\?]*)";
        string regex3 = @".*?";
        string regex4 = @"\[\/url\]";

        Regex r = new Regex(regex1 + regex2 + regex3 + regex4, RegexOptions.IgnoreCase | RegexOptions.Multiline);
        MatchCollection m = r.Matches(input);
        if (m.Count > 0)
        {
            StringBuilder sb = new StringBuilder();
            int i = 0;
            foreach (var match in m)
            {
                if (i > 0)
                    sb.Append(Environment.NewLine);
                string shtml = match.ToString();
                var innerString = shtml.Substring(shtml.IndexOf("]") + 1, shtml.IndexOf("[/url]") - shtml.IndexOf("]") - 1);
                sb.Append(innerString); //just titles                    
                i++;
            }

            return sb.ToString();
        }

        return string.Empty;
    }

As you can see, I define a URL in a BBCode-style "markdown" format:

[url = http://sample.com]sample.com[/url]

In the same way, emails are written in that format too:

[url=mailto:[email protected]][email protected][/url]

However, when I pass in a multiline string containing multiple email addresses, it returns only the first one. I would like it to return every match, but I cannot get that working.

For example

[url=mailto:[email protected]][email protected][/url] /r/n a whole bunch of text here /r/n more stuff here [url=mailto:[email protected]][email protected][/url]

This returns only the first email address above.

Upvotes: 0

Views: 68

Answers (2)

Abion47

Reputation: 24616

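The mailto:([^\?]*) part of your pattern is greedy and excludes only ?, so it can run past the closing ] of the first tag. The whole pattern then matches once, from the first [url= to the last [/url], and your Substring call pulls only the first address out of that single match. Add the closing bracket ] to the excluded characters so that this group stops at the end of the "mailto" section instead of spilling into the text between the "url" tags: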

\[url=mailto:([^\?\]]*).*?\[\/url\]

See this link for an example: https://regex101.com/r/zcgeW8/1
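As a minimal sketch of how the corrected pattern could be plugged into the original function: the class name EmailExtractor and the example.com addresses are hypothetical, and the method reads the address from capture group 1 rather than from the link text with Substring, which is an assumption about what the function is meant to return. RegexOptions.Singleline is also swapped in for Multiline here (Multiline only changes how ^ and $ behave, and the pattern uses neither); treat that as a design choice, not a requirement.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EmailExtractor
{
    // Excluding ']' keeps the capture group inside the opening [url=mailto:...] tag.
    private static readonly Regex MailtoTag = new Regex(
        @"\[url=mailto:([^\?\]]*).*?\[\/url\]",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string ReturnEmailAddresses(string input)
    {
        // One Match per [url=mailto:...]...[/url] tag; group 1 holds the address.
        var addresses = MailtoTag.Matches(input)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value);

        // Yields an empty string when nothing matched.
        return string.Join(Environment.NewLine, addresses);
    }

    public static void Main()
    {
        // Hypothetical sample input with two tags separated by other text.
        string input =
            "[url=mailto:alice@example.com]alice@example.com[/url]\r\n" +
            "a whole bunch of text here\r\n" +
            "[url=mailto:bob@example.com]bob@example.com[/url]";

        Console.WriteLine(ReturnEmailAddresses(input));
    }
}
```

With that sample input, the method should return both addresses, one per line, and an empty string when the input contains no mailto tags.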

Upvotes: 2

Saleem

Reputation: 8978

You can extract the desired result with a positive lookbehind and a positive lookahead. See http://www.rexegg.com/regex-lookarounds.html

Try regex: (?<=\[url=mailto:).*?(?=\])

The regex above captures both email addresses from the sample string:

[url=mailto:[email protected]][email protected][/url] /r/n a whole bunch of text here /r/n more stuff here [url=mailto:[email protected]][email protected][/url]

Result:

[email protected]
[email protected]
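A rough C# sketch of the lookaround approach follows. The class name LookaroundDemo, the method name ExtractAddresses, and the example.com addresses are made up for illustration. Because the lookbehind and lookahead consume no characters, each match value is the address itself, so no Substring or capture group is needed. One caveat: unlike the character-class approach, this pattern would also include any ?subject=... query that follows the address inside the tag.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    public static string[] ExtractAddresses(string input)
    {
        // (?<=...) requires "[url=mailto:" immediately before the match,
        // (?=\]) requires "]" immediately after it; neither is part of the match.
        return Regex.Matches(input, @"(?<=\[url=mailto:).*?(?=\])", RegexOptions.IgnoreCase)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        // Hypothetical sample input with two tags separated by other text.
        string input =
            "[url=mailto:alice@example.com]alice@example.com[/url]\r\n" +
            "more stuff here [url=mailto:bob@example.com]bob@example.com[/url]";

        foreach (string address in ExtractAddresses(input))
            Console.WriteLine(address);
    }
}
```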

Upvotes: 0
