Ólafur Waage

Reputation: 69981

How to extract the embedded attachment name from this email?

My regex skill is... bad. I have this email body.

Hello World

[cid:[email protected]]

Hello World

[cid:[email protected]] [cid:[email protected]]

Hello World

And what I need is an array of the filenames.

preg_match_all("/\[cid:(.*)\@/", $string, $matches);

echo "<pre>";
print_r($matches);
echo "</pre>";

And I get the first filename fine. But not the 2nd and 3rd.

[0] => image002.png
[1] => [email protected]] [cid:image002.png

How can I change this regex so it works for any embedded file in an email body?

Upvotes: 1

Views: 2076

Answers (3)

jmtd

Reputation: 1244

Please note that if you have the whole email (headers and all) you will get more consistent results by extracting the filenames from the email headers rather than the body (the format of which is heavily dependent on the chain of mail software that the email passed through before it reached you).

Read up on MIME, multipart messages and the MIME headers to see how to do this: http://en.wikipedia.org/wiki/Multipart_message#MIME_headers
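As a rough illustration of the header-based approach, here is a minimal sketch that pulls filenames out of Content-Disposition headers in a raw multipart message. The sample message fragment, the boundary name and the Content-ID values are all made up for the example, and the regex assumes the filename parameter is quoted and sits on the same line as the header name (folded headers and encoded filenames are not handled). A real MIME parser would be more robust.

```php
<?php
// Hypothetical fragment of a raw multipart message (all values invented).
$raw = "--boundary42\r\n"
     . "Content-Type: image/png; name=\"image001.png\"\r\n"
     . "Content-ID: <image001.png@01CA0000.AAAA0001>\r\n"
     . "Content-Disposition: inline; filename=\"image001.png\"\r\n"
     . "\r\n"
     . "(base64 data omitted)\r\n"
     . "--boundary42\r\n"
     . "Content-Type: image/png; name=\"image002.png\"\r\n"
     . "Content-ID: <image002.png@01CA0000.AAAA0002>\r\n"
     . "Content-Disposition: inline; filename=\"image002.png\"\r\n"
     . "\r\n"
     . "(base64 data omitted)\r\n"
     . "--boundary42--\r\n";

// m: ^ matches at the start of every line; i: header names are case-insensitive.
// .*? stays on one line because "." does not match "\n" by default.
preg_match_all('/^Content-Disposition:.*?filename="([^"]+)"/mi', $raw, $m);

// $m[1] holds one captured filename per matching header.
$headerFilenames = $m[1];
print_r($headerFilenames);
```

Because this reads the part headers instead of the rendered body, it does not depend on how the mail client wrote the [cid:...] placeholders.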

Upvotes: 0

Stephen Doyle

Reputation: 3744

Try:

preg_match_all("/\[cid:([^@]*)/", $string, $matches);

EDIT: Original post had a bug - apologies! I was trying to short circuit the greediness in an incorrect manner.
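A minimal sketch of this pattern run against a body shaped like the one in the question. The text after each @ is invented for illustration, since the real values are not shown.

```php
<?php
// Hypothetical sample body; the part after each "@" is made up.
$string = "Hello World\n\n[cid:image001.png@01CA0000.AAAA0001]\n\nHello World\n\n"
        . "[cid:image002.png@01CA0000.AAAA0002] [cid:image003.png@01CA0000.AAAA0003]\n\n"
        . "Hello World";

// [^@]* matches any run of characters other than "@", so each capture
// ends just before the first "@" that follows "[cid:".
preg_match_all('/\[cid:([^@]*)/', $string, $matches);

// $matches[0] holds the full matches, $matches[1] the captured filenames.
$filenames = $matches[1];
print_r($filenames);
```

With this input, $matches[1] contains image001.png, image002.png and image003.png. Note that a [cid:...] entry with no @ at all would make the capture run past the closing bracket, so this assumes every entry has one.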

Upvotes: 0

braveterry

Reputation: 3744

I think you can just change the expression to this:

"/\[cid:(.*?)\@/"

This makes the match non-greedy, so .*? stops at the first @ after [cid: instead of running on to the last @ on the line.
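To see the difference, here is a small sketch comparing the greedy and non-greedy patterns on a single line with two entries. The sample line is hypothetical; the text after each @ is invented.

```php
<?php
// Hypothetical line with two embedded references (values after "@" are made up).
$line = "[cid:image002.png@AAAA0002] [cid:image003.png@AAAA0003]";

// Greedy: .* grabs as much as it can, so the capture runs to the LAST "@".
preg_match_all('/\[cid:(.*)@/', $line, $greedy);

// Non-greedy: .*? grabs as little as it can, so each capture ends at the FIRST "@".
preg_match_all('/\[cid:(.*?)@/', $line, $lazy);

print_r($greedy[1]); // one entry spanning both references
print_r($lazy[1]);   // one entry per reference
```

The greedy version returns a single capture, image002.png@AAAA0002] [cid:image003.png, which mirrors the result in the question. The non-greedy version returns image002.png and image003.png separately.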

Here are a couple of tools you can use to test your expressions:

Upvotes: 5
