Reputation: 1144
I am creating a tool that is required to parse incoming MIME streams and return the email body and email attachments as separate file streams.
I am using mime4j for this purpose.
Following are the problems that I am stuck on:
I have a large corpus of emails available in raw mime form that I want to run my tests on and need some automated way to determine which ones might be breaking the mime parsing by mime4j and tweak the code for that.
Upvotes: 0
Views: 371
Reputation: 1144
I initially parsed out a sample corpus *.eml files using mime4j. I had to manually check them for parsing errors as I had no other good choice.
Now I am using the earlier parsed out emails as testbed over which I check my parsed out results iteratively.
Upvotes: 0
Reputation: 38538
You could decode the attachments and then re-encode them. If the re-encoded stream matches (byte-for-byte) the original, then that's a good sign that mime4j is properly handling them.
Upvotes: 1