Julian

Reputation: 187

Regular Expression to prevent Email Name Spoofing

I want to match every From header whose display name contains .com or my\s?example and whose email address is not .*@myexample.com.

That's easy when the display name is enclosed in quotation marks, but my pattern fails when the quotation marks are absent.

"(.*?(my\s?example|\.com).*?)"(?!\s?\<.*?\@myexample\.com\>)

Please see here: https://regexr.com/5im6l

Everything works as desired except for the last line in the input field, where the double quotes are missing. I would like it to also match for this.
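
For illustration, the behaviour described above can be reproduced with a small Python sketch. The three sample headers are hypothetical (they are not copied from the linked demo), and the case-insensitive flag is an assumption:

```python
import re

# Pattern from the question, unchanged; re.IGNORECASE is an assumption.
QUOTED_ONLY = re.compile(
    r'"(.*?(my\s?example|\.com).*?)"(?!\s?\<.*?\@myexample\.com\>)',
    re.IGNORECASE,
)

# Hypothetical sample headers, not taken from the linked demo.
samples = [
    'From: "My Example Support" <attacker@evil.org>',      # spoofed, quoted
    'From: "My Example Support" <support@myexample.com>',  # legitimate
    'From: My Example Support <attacker@evil.org>',        # spoofed, unquoted
]

for header in samples:
    # Prints whether the pattern flags the header, then the header itself.
    print(bool(QUOTED_ONLY.search(header)), header)
```

The third header is not flagged, because the pattern requires a literal " on both sides of the display name.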

Upvotes: 0

Views: 173

Answers (1)

The fourth bird

Reputation: 163372

If your regex flavor supports conditionals, and you want to capture the text between the double quotes when both are present, or the whole display name when there are no surrounding double quotes, you might use:

\bFrom:\s(")?(.*?\b(my\s?example|\.com)\b.*?)(?(1)")\s+<(?!\s?[^\r\n<>]*@myexample\.com>)

The pattern matches:

  • \bFrom:\s(")? A word boundary, match From: and optionally capture " in group 1
  • (.*?\b(my\s?example|\.com)\b.*?) Capture group 2, match a part that contains either myexample or .com where the alternatives are in group 3
  • (?(1)") If clause, if group 1 exists, match " so it is not part of the capture group
  • \s+< Match 1+ whitespace chars and <
  • (?! Negative lookahead, assert that what is at the right is not
    • \s?[^\r\n<>]*@myexample\.com> Match @myexample\.com between the brackets
  • ) Close lookahead

Group 2 contains the display name without the surrounding quotes, and group 3 contains the part that matched either myexample or .com. Use a case-insensitive match so that Myexample is matched as well.
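
As a rough sketch of how the pattern above might be used, here it is in Python, whose built-in re module supports the (?(1)...) conditional. The helper name spoofed_display_name and the sample headers are hypothetical:

```python
import re

# Pattern from the answer above, compiled case-insensitively.
SPOOF = re.compile(
    r'\bFrom:\s(")?(.*?\b(my\s?example|\.com)\b.*?)(?(1)")\s+<'
    r'(?!\s?[^\r\n<>]*@myexample\.com>)',
    re.IGNORECASE,
)

def spoofed_display_name(header):
    """Return the display name (group 2) if the header is flagged, else None.

    Hypothetical helper, for illustration only.
    """
    m = SPOOF.search(header)
    return m.group(2) if m else None

# Hypothetical sample headers.
print(spoofed_display_name('From: "My Example Support" <attacker@evil.org>'))
print(spoofed_display_name('From: My Example Support <attacker@evil.org>'))
print(spoofed_display_name('From: "My Example Support" <support@myexample.com>'))
```

The quoted and unquoted spoofed headers both yield My Example Support, while the header sent from @myexample.com yields None.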



If \K is supported (it discards what has been matched so far from the reported match), and you want the display name as the match itself rather than in a capture group, another option is:

\bFrom:\s"?\K.*?\b(?:my\s?example|\.com)\b.*?(?="?\s<(?![^<>]*@myexample\.com>))
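
Python's built-in re module does not support \K, so the following sketch swaps it for a capture group while keeping the same lookahead. It is an adaptation for illustration, not the pattern above verbatim; the helper name display_name and the sample headers are hypothetical:

```python
import re

# Adaptation of the \K pattern: the part after \K becomes group 1.
NAME_ONLY = re.compile(
    r'\bFrom:\s"?(.*?\b(?:my\s?example|\.com)\b.*?)'
    r'(?="?\s<(?![^<>]*@myexample\.com>))',
    re.IGNORECASE,
)

def display_name(header):
    """Return the flagged display name, or None (hypothetical helper)."""
    m = NAME_ONLY.search(header)
    return m.group(1) if m else None

# Hypothetical sample headers.
print(display_name('From: "My Example Support" <attacker@evil.org>'))
print(display_name('From: My Example Support <attacker@evil.org>'))
print(display_name('From: "My Example Support" <support@myexample.com>'))
```

In a flavor that does support \K, such as PCRE, the original pattern returns the same text as the full match, with no group needed.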


Note that you don't have to escape <, > or @, so \<, \> and \@ can be written without the backslash.

Upvotes: 1
