Reputation: 321
I am using JavaMail and am running into the following error: java.io.UnsupportedEncodingException: us-ascii big5 at sun.nio.cs.StreamDecoder.forInputStreamReader
The following MIME header causes the issue:
Content-Type: text/plain; charset="us-ascii, big5"
(I see non-English characters in the content.)
Is this valid? What could be a solution?
A related issue: I also see malformed charset values, with special characters around or inside the value, that cause the same exception, e.g.
charset="'UTF-8'"
charset=`UTF-8`
charset=UTF=8
charset=utf
charset=\"UTF-8\", etc.
Note that this is not limited to UTF-8; it happens with other charsets too. However, email clients such as Outlook open and decode these emails without trouble.
Any ideas?
Upvotes: 1
Views: 1947
Reputation: 11045
Can you try message.setHeader("Content-Type", "text/plain; charset=UTF-8")?
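No. The messages are incoming and I have no control over them; I only run the JavaMail library to parse them and extract the content. The incoming messages are not created by me.
Use the mail.mime.contenttypehandler system property to transform the content type without actually modifying the emails. The property names a class whose public static cleanContentType(MimePart, String) method JavaMail calls to clean up the Content-Type header value before processing it.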
package cool.part.team;

import java.util.Arrays;
import javax.mail.Session;
import javax.mail.internet.ContentType;
import javax.mail.internet.MimeMessage;
import javax.mail.internet.MimePart;

public class EverythingIsAscii {

    /**
     * -Dmail.mime.contenttypehandler=cool.part.team.EverythingIsAscii
     */
    public static void main(String[] args) throws Exception {
        MimeMessage msg = new MimeMessage((Session) null);
        msg.setText("test", "us-ascii, big5");
        msg.saveChanges();
        System.out.println("Transformed = " + msg.getContentType());
        System.out.println("Original = " + Arrays.toString(msg.getHeader("Content-Type")));
    }

    public static String cleanContentType(MimePart p, String mimeType) {
        if (mimeType != null) {
            String newContentType = mimeType;
            try {
                ContentType ct = new ContentType(mimeType);
                String cs = ct.getParameter("charset");
                if (cs == null || cs.contains("'")
                        || cs.contains(",")) { //<--Insert logic here
                    ct.setParameter("charset", "us-ascii");
                    newContentType = ct.toString();
                }
            } catch (Exception ignore) {
                //Insert logic to manually repair.
                //newContentType = ....
            }
            return newContentType;
        }
        return mimeType;
    }
}
Which will output:
Transformed = text/plain; charset=us-ascii
Original = [text/plain; charset="us-ascii, big5"]
You must correct this example code to perform a proper transformation of the charset, since not everything is ASCII.
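As a rough starting point for that "proper transformation", here is a minimal sketch of one way to repair the charset value itself. It uses only java.nio.charset, so it can be tried without JavaMail on the classpath. The class CharsetRepair and its normalize method are hypothetical names, not part of JavaMail. The repair rules are assumptions drawn from the variations listed in the question: strip stray quotes, backticks and backslashes, read '=' as '-', treat a bare "utf" as UTF-8, and prefer a non-ASCII name when several are listed. Adjust them to the mail you actually receive.

```java
import java.nio.charset.Charset;

public class CharsetRepair {

    /**
     * Hypothetical helper: maps a malformed charset parameter to the canonical
     * name of a charset this JVM supports, or returns fallback if none fits.
     */
    public static String normalize(String raw, String fallback) {
        if (raw == null) {
            return fallback;
        }
        String asciiHit = null;
        // Values such as "us-ascii, big5" list several names; try each one.
        for (String part : raw.split("[,;\\s]+")) {
            // Drop stray quotes, backticks and backslashes; read '=' as '-'.
            String name = part.replaceAll("['\"`\\\\]", "").replace('=', '-');
            if (name.equalsIgnoreCase("utf")) {
                name = "UTF-8"; // assumption: a bare "utf" was meant as UTF-8
            }
            String canonical = canonicalName(name);
            if (canonical == null) {
                continue;
            }
            // Assumption: when several names are listed, a non-ASCII one is
            // the better guess, because the body contains non-ASCII text.
            if (!canonical.equals("US-ASCII")) {
                return canonical;
            }
            asciiHit = canonical;
        }
        return asciiHit != null ? asciiHit : fallback;
    }

    /** Returns the canonical charset name, or null if the JVM does not support it. */
    private static String canonicalName(String name) {
        try {
            if (name.isEmpty() || !Charset.isSupported(name)) {
                return null;
            }
            return Charset.forName(name).name();
        } catch (IllegalArgumentException e) {
            // Thrown for syntactically illegal charset names.
            return null;
        }
    }

    public static void main(String[] args) {
        String[] samples = {"'UTF-8'", "`UTF-8`", "UTF=8", "utf", "\\\"UTF-8\\\"", "us-ascii, big5"};
        for (String s : samples) {
            System.out.println(s + " -> " + normalize(s, "US-ASCII"));
        }
    }
}
```

Inside cleanContentType you could then call ct.setParameter("charset", CharsetRepair.normalize(cs, "us-ascii")) instead of hard-coding us-ascii. Header values that the ContentType constructor cannot parse at all would still end up in the catch block and need a manual repair there.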
Upvotes: 2
Reputation: 29961
All of those are invalid charsets. Whenever possible, report such problems to the owners of the programs that created these messages. If the messages are spam (they often are), just throw them away; these errors are a pretty good heuristic for detecting spam.
The JavaMail FAQ has strategies for dealing with these errors.
Upvotes: 1