sethvargo

Reputation: 26997

Ruby Mail gem extract headers and clean up body

I'm using the Ruby Mail gem to read and parse emails. I'm having a problem similar to the one in this post, but the solutions provided there solved only half of it...

Some email messages don't respond to html_part or text_part, yet they are still considered "multipart". That's fine, I'll resort to manually looking at the MIME types. However, there are multipart messages that have no parts!

I have a message for which message.multipart? #=> true but message.parts.length #=> 0. As such, I can't actually extract a single part :(.

If I look at message.body (or message.body.decoded), there IS text there, and the type is text/html. However, it also has all the header information at the top.

It sounds crazy, but how can I get just the body (or just the headers) of this "multipart" email that has no parts?

Edit

Here's one of the messages in question:

#<Mail::Message:70280791538440, Multipart: true, Headers: ...>

With message.body:

--XX-2ba4f992ec6d5e224ebeaf78eac50df5\nContent-type: text/html; charset=\"UTF-8\" \nContent-Transfer-Encoding: 7bit \n\nThank you...

Upvotes: 2

Views: 2457

Answers (1)

joelparkerhenderson

Reputation: 35443

Yes, it's because real-world email has all kinds of surprises that don't fit the protocol.

To get the header part and body part, split the body text at the first blank line:

header_part, body_part = message.body.decoded.split(/\n\s*\n/m, 2)
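As a rough illustration of what that split does, here is a minimal sketch that runs on a plain string copied from the message.body output in the question, so it does not need the Mail gem. In the question's setting the input would presumably be the string returned by message.body.decoded. The names raw and headers, and the small header-parsing loop, are illustrative assumptions rather than part of the gem; the loop skips the boundary line and does not handle folded (multi-line) header values.

```ruby
# Sample body text taken from the question (a boundary line, two headers,
# a blank line, then the content).
raw = "--XX-2ba4f992ec6d5e224ebeaf78eac50df5\n" \
      "Content-type: text/html; charset=\"UTF-8\" \n" \
      "Content-Transfer-Encoding: 7bit \n" \
      "\n" \
      "Thank you..."

# Split once, at the first blank line: everything before it is treated as
# headers, everything after it as the body.
header_part, body_part = raw.split(/\n\s*\n/, 2)

# Hypothetical helper logic: turn "Name: value" lines into a hash,
# skipping the MIME boundary line and anything without a colon.
headers = header_part.lines.each_with_object({}) do |line, acc|
  next if line.start_with?("--")   # MIME boundary marker, not a header
  name, value = line.split(":", 2)
  next if value.nil?               # not a "Name: value" line
  acc[name.strip.downcase] = value.strip
end

puts headers["content-type"]
puts body_part
```

Lower-casing the header names is a design choice for this sketch, since header names are case-insensitive and the sample mixes "Content-type" with "Content-Transfer-Encoding".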

You may find some useful patterns for your parsing in this file:

lib/mail/patterns.rb

Upvotes: 4
