Reputation: 811
I wrote an application where I fetch a message and check its content:
public void getInhoud(Message msg) throws IOException, Exception {
    Object contt = msg.getContent();
    ...
    if (contt instanceof String) {
        handlePart((Part) msg);
    }
    ...
}

public void handlePart(Part part)
        throws MessagingException, IOException, Exception {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    String contentType = part.getContentType();
    ...
    if ((contentType.length() >= 9)
            && (contentType.toLowerCase().substring(
                    0, 9).equals("text/html"))) {
        part.writeTo(out);
        String stringS = out.toString();
    }
    ...
}
I removed the unnecessary code. This method works for e-mail sent from Gmail, Hotmail and the Outlook desktop client, but somehow fails for e-mail sent from the Office 365 web client. For every other client the content type is text/plain; only for Office 365 mail is it text/html. The code writes the data of the Part to a ByteArrayOutputStream, which is then converted to a String. This works, or at least the String contains the content of the part, but the HTML it contains is somewhat faulty.
Here is an example: http://pastebin.com/5mEYCHxD (posted to Pastebin, it is pretty big).
Notice the = symbols printed at the end of almost every line. Is this something I can fix in the code, or should it be fixed somewhere in the mail client?
I thought about looping through every line of HTML and removing the = after checking that it is not part of an HTML tag.
Any help is very much appreciated; this has been bothering me for a few weeks now.
Thanks!
Upvotes: 1
Views: 2720
Reputation: 34024
That sounds just like quoted-printable encoding:
Lines of quoted-printable encoded data must not be longer than 76 characters. To satisfy this requirement without altering the encoded text, soft line breaks may be added as desired. A soft line break consists of an "=" at the end of an encoded line, and does not appear as a line break in the decoded text.
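To make the soft line break concrete, here is a minimal sketch of a quoted-printable decoder written against the plain JDK. The class name QuotedPrintableSketch and its decode method are hypothetical and only illustrate the rule quoted above; the sketch assumes the decoded bytes are UTF-8, which a real message may not be. It is an illustration of the encoding, not a replacement for the mail library's own decoding.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class QuotedPrintableSketch {

    // Removes soft line breaks ("=" followed by a line ending) and
    // turns "=XX" hex escapes back into the bytes they stand for.
    static String decode(String encoded) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < encoded.length()) {
            char c = encoded.charAt(i);
            if (c != '=') {
                out.write(c); // encoded input is expected to be ASCII
                i++;
            } else if (encoded.startsWith("\r\n", i + 1)) {
                i += 3; // soft line break with CRLF: skip "=\r\n"
            } else if (encoded.startsWith("\n", i + 1)) {
                i += 2; // soft line break with bare LF: skip "=\n"
            } else if (i + 2 < encoded.length()
                    && Character.digit(encoded.charAt(i + 1), 16) >= 0
                    && Character.digit(encoded.charAt(i + 2), 16) >= 0) {
                int hi = Character.digit(encoded.charAt(i + 1), 16);
                int lo = Character.digit(encoded.charAt(i + 2), 16);
                out.write((hi << 4) | lo); // "=3D" becomes the byte 0x3D
                i += 3;
            } else {
                out.write('='); // malformed escape: keep the "=" as-is
                i++;
            }
        }
        // Assumption: the decoded bytes are UTF-8 text.
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Note that a literal = in the HTML (for example in an attribute) is itself encoded as =3D, which is why simply stripping every trailing = by hand would not be enough to restore the original markup.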
The writeTo method seems to write the content in its encoded form, so it seems you have to copy the stream yourself. The getInputStream method is described as returning the decoded InputStream.
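Copying the stream yourself could look roughly like the sketch below. The class DecodedPartReader and the method readFully are hypothetical names; the sketch takes a plain InputStream so that it stays self-contained, on the assumption that in handlePart it would be called as readFully(part.getInputStream(), charset) instead of part.writeTo(out). The charset is also an assumption here; ideally it would come from the charset parameter of the part's content type.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only.
class DecodedPartReader {

    // Reads the whole stream into memory and converts it to a String.
    // The caller chooses the charset; it is not detected here.
    static String readFully(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // copy only the bytes actually read
        }
        return new String(out.toByteArray(), charset);
    }
}
```

Passing the charset explicitly avoids relying on the platform default, which is what out.toString() without arguments does in the question's code.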
Upvotes: 1