Reputation: 1325
I am collecting emails from an IMAP server with the code beneath, but the content of the email body is often very ugly, and sometimes impossible to understand. Many of the emails contains Danish and Swedish special characters like e.g. æ, ä, ö, ø and å, but I don't think that is the problem. How best to encode and clean up?
imap = Net::IMAP.new(address, port, enable_ssl?)
imap.login(user_name, password)
imap.examine(flag)
search_query = "#{last_uid}:*"
imap.uid_search(search_query).each do |uid|
if uid.to_i > last_uid.to_i
header = imap.uid_fetch(uid, "BODY[HEADER.FIELDS (FROM TO DATE SUBJECT)]")[0].attr["BODY[HEADER.FIELDS (FROM TO DATE SUBJECT)]"]
from = Mail.read_from_string(header).from.first
to = Mail.read_from_string(header).to.first rescue nil
subject = Mail.read_from_string(header).subject
date = Mail.read_from_string(header).date
body = imap.uid_fetch(uid, "BODY[TEXT]")[0].attr["BODY[TEXT]"].gsub(/\r\n?/, "\n").force_encoding('UTF-8')
end
end
imap.logout()
imap.disconnect()
Sample body content:
1:
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS08YnI+DQpPcmRy
ZWRhdG86IDI4LTAzLTIwMTMgMTQ6NDc6MTg8YnI+DQpPcmRyZW51bW1lcjogMTA5MDM1PGJy
Pg0KVHJhbnNha3Rpb25zSUQ6IDE2NzgyMQ0KPGJyPjxicj4NCkZha3R1cmVyaW5nc2FkcmVz
c2U6PGJyPg0KLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLTxicj48YnI+DQpOaWtsYXMgSnV1bCBOaWVs
c2VuPGJyIC8+QS5QLiBNw7hsbGVyIEtvbGxlZ2lldCAxMDU8YnIgLz41NzAwIFN2ZW5kYm9y
ZzxiciAvPkRlbm1hcms8YnIgLz5UTEY6OiAyMDYzMDczNzxiciAvPjxhIGhyZWY9Im1haWx0
bzpuaWtzQGxpdmUuZGsiPm5pa3NAbGl2ZS5kazwvYT48YnIgLz4NCjxicj48YnI+DQpMZXZl
cmluZ3NhZHJlc3NlOjxicj4NCi0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS08YnI+PGJyPg0KTmlrbGFz
IEp1dWwgTmllbHNlbjxiciAvPkEuUC4gTcO4bGxlciBLb2xsZWdpZXQgMTA1PGJyIC8+NTcw
MCBTdmVuZGJvcmc8YnIgLz5EZW5tYXJrPGJyIC8+VExGOjogMjA2MzA3Mzc8YnIgLz48YSBo
cmVmPSJtYWlsdG86bmlrc0BsaXZlLmRrIj5uaWtzQGxpdmUuZGs8L2E+PGJyIC8+DQo8YnI+
PGJyPg0KT3JkcmVkYXRhOjxicj4NCi0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS08YnI+DQoNCiAgMSww
MCBzdGsuIFN0YXIgV2FycyBCYXR0bGVmcm9udCBJSSBYYm94ICg0MTAzMikgw6EgREtLIDI2
Myw5OSAtIElhbHQ6IERLSyAzMjksOTkNCjxicj4NCjxicj4NCkJldGFsaW5nOiAyOiBEYW5z
a2Uga3JlZGl0a29ydCBbdHJhbnNha3Rpb25zZ2VieXIgMSwyNSVdIChES0sgNCwxMykNCjxi
cj4NCkZvcnNlbmRlbHNlOiAgKERLSyAwLDAwKQ0KPGJyPjxicj4NClNhbWxldCBwcmlzIDog
REtLIDMzNCwxMg0KPGJyPg0KSGVyYWYgbW9tczogREtLIDY2LDgzDQo=
2 (shortened):
------=_NextPart_000_0482_01CE2B9E.A689A9F0
Content-Type: multipart/related;
boundary="----=_NextPart_001_0483_01CE2B9E.A689A9F0"
------=_NextPart_001_0483_01CE2B9E.A689A9F0
Content-Type: multipart/alternative;
boundary="----=_NextPart_002_0484_01CE2B9E.A689A9F0"
------=_NextPart_002_0484_01CE2B9E.A689A9F0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
=20
=09
=09
=09
=09
=20
=09
=09
=09
=09
=09
=09
=09
=09
=09
Daily Restock Information.
=09
=09
Item
Format
1+=20
5+ =20
Box Price=20
Qty
Barcode
=09
3 (shortened):
--Boundary-=_SHccxHuUYYhTGDGLfcIEBDUToEun
Content-Type: text/plain; charset="ISO-8859-1"
--Boundary-=_SHccxHuUYYhTGDGLfcIEBDUToEun
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="SYSTEMSTOCK.XLSX"
Content-Transfer-Encoding: base64
UEsDBBQABgAIAAAAIQC5OlcVkgEAAIwGAAATAN0BW0NvbnRlbnRfVHlwZXNdLnhtbCCi2QEooAAC
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAMRVyWrDMBC9F/oPRtcSK0mhlBInhy7HNpD0AxRrEovYktBMtr/v2FloimtI
HejF+7xl9EYejLZFHq0hoHE2Eb24KyKwqdPGLhLxOX3rPIoISVmtcmchETtAMRre3gymOw8YcbXF
RGRE/klKTDMoFMbOg+U3cxcKRXwbFtKrdKkWIPvd7oNMnSWw1KESQwwHLzBXq5yi1y0/3iuZGSui
5/13JVUilPe5SRWxULm2+gdJx83nJgXt0lXB0DH6AEpjBkBFHvtgmDFMgIiNoZDDwQebDkZDNFaB
etc..
Upvotes: 4
Views: 1212
Reputation: 2195
Dug around for many hours trying to solve this problem so adding my answer to a few of the threads I found...
https://stackoverflow.com/a/26604049/2386548
Hope that helps somebody...
Upvotes: 2