Ryan

Reputation: 1628

Why is this regex not selecting only the email addresses?

RegExr example: http://regexr.com?333h5

Regex:

_email=(.+)[^\s"]

Sample Content:

http://example.com/index.php?package=icard&action=show_birthday_card&cid=3376&name=Gina&[email protected]
[ Unsubscribe from iCard: http://example.com/index.php?package=icard&action=unsubscribe_card&[email protected] ]


--1q2w3e4r5t6y7u8i9o0p1q
Content-Type: text/html;

<table align=""center"">
    <tr>
        <td>
            <div style=""background:#fff; border:10px solid #2d50d6; padding:10px; width:600px;"">
                <h4 style=""background:#2d50d6; color:#fff; border:18px solid #2d50d6; margin-bottom:10px; font-family:Arial, Helvetica, sans-serif; font-size:14px; text-align:center;"">Happy Birthday Gina!</h4>
                <a href=""http://example.com/index.php?package=icard&action=show_birthday_card&cid=3376&name=Gina&[email protected]""><img src=""http://example.com/images/brands/chiro/cards/img_46a6924710bcc_birthdayparty.gif"" alt="""" title=""Happy+Birthday+Gina%21"" width=""600"" style=""border:0; margin-bottom:10px; width:600px; height:150px;"" /></a>
                <p style=""color:#000; font-family:Arial, Helvetica, sans-serif; font-size:12px;"">We hope all your birthday dreams and wishes come true!</p>

Upvotes: 0

Views: 56

Answers (2)

EEP

Reputation: 725

This selects more than just the email addresses because + is a greedy quantifier: .+ matches as many characters as possible, so it runs to the end of the line and only then backtracks far enough for the rest of the pattern to match.

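Matching email addresses is a common regex problem, so many well-tested patterns already exist, including one in a well-known tutorial on regular expressions. I would recommend using one of those patterns, such as the one below. Note that its character classes list only uppercase letters, so it needs the case-insensitive flag to match lowercase addresses: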

_email=(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b)
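As a rough sketch of how that pattern behaves, here it is in Python's re module. The parameter name to_email and the address gina@example.com are made up for illustration, since the real values are obscured in the question, and Python is just a convenient engine to try it in:

```python
import re

# Hypothetical input: "to_email" and the address are invented for this sketch.
text = 'index.php?name=Gina&to_email=gina@example.com"><img src="x.gif" />'

pattern = r'_email=(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b)'

# The character classes only list A-Z, so the flag is what lets
# lowercase addresses match.
with_flag = re.findall(pattern, text, re.IGNORECASE)
without_flag = re.findall(pattern, text)

print(with_flag)     # ['gina@example.com']
print(without_flag)  # []
```

The {2,4} at the end also limits the top-level domain to four letters, so longer ones would not match in full.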

Upvotes: 1

femtoRgon

Reputation: 33341

_email=(.+)[^\s"]

This means "_email=" followed by one or more characters of any kind (other than a newline), followed by a single character that is neither whitespace nor a quote.

I think what you are looking for is:

_email=[^\s"]+

That is "_email=" followed by one or more characters that are neither whitespace nor a quote.
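To see the difference between the two patterns, here is a minimal sketch in Python. The sample lines are simplified stand-ins for the content in the question: the parameter name to_email and the address are assumptions, and a capture group has been added around [^\s"]+ so that only the address is returned:

```python
import re

# Hypothetical sample: one bare URL line and one URL inside an HTML attribute.
sample = (
    'http://example.com/index.php?name=Gina&to_email=gina@example.com\n'
    '<a href="http://example.com/index.php?name=Gina&to_email=gina@example.com">'
    '<img src="x.gif" /></a>\n'
)

# Original pattern: .+ grabs the rest of the line, then gives back one
# character so that [^\s"] has something to match.
greedy = re.findall(r'_email=(.+)[^\s"]', sample)

# Corrected pattern: stop at the first whitespace or quote.
fixed = re.findall(r'_email=([^\s"]+)', sample)

print(greedy)
# ['gina@example.co', 'gina@example.com"><img src="x.gif" /></a']
print(fixed)
# ['gina@example.com', 'gina@example.com']
```

On the bare URL the original pattern loses the last character of the address to [^\s"], and inside the HTML it runs on past the closing quote to the end of the line.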

Upvotes: 3
