Reputation: 4501
I am working on an email client, and I wonder what the correct algorithm is for deciding whether an attachment is a regular attachment (a downloadable file such as a PDF, video, audio, etc.) or an inline attachment (which is just an embedded part of an HTML message).
Until recently, I checked whether the body type (assuming the message part is not multipart; otherwise I would recursively parse it further) is not TEXT, that is, whether it is APPLICATION, IMAGE, AUDIO or VIDEO.
If that's the case, I looked at whether the ninth element is equal to ATTACHMENT or INLINE. I thought that if it's INLINE, then it is an embedded part of the HTML rather than a regular attachment.
However, I recently came across an email that contained an HTML message body and regular attachments. The problem is that its body structure looked like this:
1. multipart/mixed
1.1. multipart/alternative
1.1.1. text/plain
1.1.2. multipart/related
1.1.2.1. text/html
1.1.2.2. Inline jpeg
1.1.2.3. Inline jpeg
1.2. pdf inline (why 'inline'? Should be 'attachment')
1.3. pdf inline (why 'inline'? Should be 'attachment')
The question is: why are the downloadable PDF files marked INLINE? And what is an appropriate algorithm for determining whether a file is an embedded part of the HTML or a downloadable file? Should I look at the parent subtype to see whether it's related or not, and disregard the inline vs. attachment parameters?
Upvotes: 1
Views: 1023
Reputation: 10985
There really is no defined one-size-fits-all algorithm. inline or attachment is something the sender sets, and is a hint on whether they want it to be displayed inline (automatically rendered), as an attachment (displayed in a list), or neither (no preference).
There are also what are sometimes called "embedded" attachments, which are attachments that have a Content-ID (this is in the body structure response) and are referenced by a cid: reference in an <img> tag or the like.
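To make the "embedded" case concrete, here is a rough Python sketch of collecting the cid: references from an HTML body so they can be compared with the Content-ID values from the body structure. The function names and the regular expression are my own illustration, not part of any IMAP library; it only looks at quoted attribute values, so it would miss things like CSS url(cid:...) references.

```python
import re
from urllib.parse import unquote

# Matches cid: URLs at the start of a quoted attribute value,
# e.g. src="cid:logo@example.com". Illustrative only, not a full HTML parser.
CID_REF = re.compile(r"""["']\s*cid:([^"'\s>]+)""", re.IGNORECASE)


def referenced_content_ids(html: str) -> set[str]:
    """Return the Content-IDs referenced by cid: URLs in the HTML body."""
    # cid: URLs may be percent-encoded, so decode them before comparing.
    return {unquote(match) for match in CID_REF.findall(html)}


def normalize_content_id(raw: str) -> str:
    """Strip whitespace and the surrounding <...> from a Content-ID value."""
    return raw.strip().strip("<>")
```

A part would then count as embedded when normalize_content_id(its Content-ID) is in the set returned by referenced_content_ids for the text/html part.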
So, this pretty much has to be done heuristically.
It really depends on your needs and your client's capabilities, but here is a list of heuristics you may consider using in some combination (some of these are mutually exclusive):
image/*, maybe text/* if you like), then it is inline.
Also, the original version of inline only meant the sender wanted it automatically rendered; this is often conflated with referenced by the HTML section (which I've called embedded). These are not quite the same.
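As one possible combination of these ideas (a sketch, not a standard algorithm), something like the following Python would treat the PDFs from the question as regular attachments even though they are marked inline. The Part class is a hypothetical flattened view of a non-multipart leaf from the body structure, and html_cids is assumed to be the set of Content-IDs that the HTML body actually references, however you obtain it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Part:
    # Hypothetical view of one non-multipart leaf of the body structure.
    mime_type: str               # lowercase, e.g. "image/jpeg"
    disposition: Optional[str]   # "inline", "attachment", or None
    content_id: Optional[str]    # without angle brackets, or None


def classify(part: Part, html_cids: set[str]) -> str:
    """Return "embedded", "inline", or "attachment" for a leaf part."""
    # Embedded: the part has a Content-ID that the HTML really references.
    if part.content_id and part.content_id in html_cids:
        return "embedded"
    # The sender explicitly asked for it to be listed as an attachment.
    if part.disposition == "attachment":
        return "attachment"
    # The sender asked for inline display and the client can render the type.
    # Only images are treated as renderable here; that is an assumption.
    if part.disposition == "inline" and part.mime_type.startswith("image/"):
        return "inline"
    # Everything else (including inline PDFs) goes to the attachment list.
    return "attachment"
```

Which types count as renderable, and whether a missing disposition should default to attachment, are choices you would adjust for your own client.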
Upvotes: 5