zeke

Reputation: 3663

Extracting attachment from an Email - cannot get filename of attachment

I have a PHP script that checks an email account for new messages and attempts to download .zip and .pdf attachments from each email. I'm using the following code to do this:

/* try to connect */
$inbox = imap_open($hostname, $username, $password) or die ('Cannot connect to domain:' . imap_last_error());

/* grab emails; imap_search() returns false when nothing matches */
$emails = imap_search($inbox, 'ALL');
if ($emails === false) {
    $emails = [];
}

/* put the newest emails on top */
rsort($emails);

foreach ($emails as $email_number) {
    $overview = imap_fetch_overview($inbox, $email_number, 0);
    if ($overview[0]->seen) {
        continue;
    }

    $structure = imap_fetchstructure($inbox, $email_number);
    if (!property_exists($structure, 'parts')) {
        continue;
    }
    //print_r($structure->parts);
    //get attachments
}

For most emails, $structure->parts looks something like this:

    [1] => stdClass Object
        (
            [type] => 3
            [encoding] => 3
            [ifsubtype] => 1
            [subtype] => PDF
            [ifdescription] => 0
            [ifid] => 0
            [bytes] => 132780
            [ifdisposition] => 1
            [disposition] => attachment
            [ifdparameters] => 1
            [dparameters] => Array
                (
                    [0] => stdClass Object
                        (
                            [attribute] => filename
                            [value] => some_filename.pdf
                        )

                )

            [ifparameters] => 1
            [parameters] => Array
                (
                    [0] => stdClass Object
                        (
                            [attribute] => name
                            [value] => some_filename.pdf
                        )

                )

        )

    [2] => stdClass Object
        (
            [type] => 3
            [encoding] => 3
            [ifsubtype] => 1
            [subtype] => ZIP
            [ifdescription] => 0
            [ifid] => 0
            [bytes] => 43170
            [ifdisposition] => 1
            [disposition] => attachment
            [ifdparameters] => 1
            [dparameters] => Array
                (
                    [0] => stdClass Object
                        (
                            [attribute] => filename
                            [value] => another_filename.zip
                        )

                )

            [ifparameters] => 1
            [parameters] => Array
                (
                    [0] => stdClass Object
                        (
                            [attribute] => name
                            [value] => another_filename.zip
                        )

                )

        )

As you can see, it's easy to get the filename and extension of each attachment. Recently, however, I've been getting some emails where $structure->parts looks like this instead:

    [1] => stdClass Object
        (
            [type] => 3
            [encoding] => 3
            [ifsubtype] => 1
            [subtype] => OCTET-STREAM
            [ifdescription] => 1
            [description] => =?utf-8?B?Q1RHIFF1ZXLDqXRhcm8gIC0gIC0gMTJfOF8xNi5wZGY=?=
            [ifid] => 1
            [id] => <[email protected]>
            [bytes] => 44592
            [ifdisposition] => 1
            [disposition] => attachment
            [ifdparameters] => 1
            [dparameters] => Array
                (
                    [0] => stdClass Object
                        (
                            [attribute] => filename
                            [value] => =?utf-8?B?Q1RHIFF1ZXLDqXRhcm8gIC0gIC0gMTJfOF8xNi5wZGY=?=
                        )

                    [1] => stdClass Object
                        (
                            [attribute] => size
                            [value] => 32586
                        )

                    [2] => stdClass Object
                        (
                            [attribute] => creation-date
                            [value] => Thu, 08 Dec 2016 22:16:31 GMT
                        )

                    [3] => stdClass Object
                        (
                            [attribute] => modification-date
                            [value] => Thu, 08 Dec 2016 22:16:31 GMT
                        )

                )

            [ifparameters] => 1
            [parameters] => Array
                (
                    [0] => stdClass Object
                        (
                            [attribute] => name
                            [value] => =?utf-8?B?Q1RHIFF1ZXLDqXRhcm8gIC0gIC0gMTJfOF8xNi5wZGY=?=
                        )

                )

        )

    [2] => stdClass Object
        (
            [type] => 3
            [encoding] => 3
            [ifsubtype] => 1
            [subtype] => OCTET-STREAM
            [ifdescription] => 1
            [description] => =?utf-8?B?Q1RHIFF1ZXLDqXRhcm8gIC0gIC0gMTJfOF8xNi56aXA=?=
            [ifid] => 1
            [id] => <[email protected]>
            [bytes] => 10966
            [ifdisposition] => 1
            [disposition] => attachment
            [ifdparameters] => 1
            [dparameters] => Array
                (
                    [0] => stdClass Object
                        (
                            [attribute] => filename
                            [value] => =?utf-8?B?Q1RHIFF1ZXLDqXRhcm8gIC0gIC0gMTJfOF8xNi56aXA=?=
                        )

                    [1] => stdClass Object
                        (
                            [attribute] => size
                            [value] => 8011
                        )

                    [2] => stdClass Object
                        (
                            [attribute] => creation-date
                            [value] => Thu, 08 Dec 2016 22:16:31 GMT
                        )

                    [3] => stdClass Object
                        (
                            [attribute] => modification-date
                            [value] => Thu, 08 Dec 2016 22:16:31 GMT
                        )

                )

            [ifparameters] => 1
            [parameters] => Array
                (
                    [0] => stdClass Object
                        (
                            [attribute] => name
                            [value] => =?utf-8?B?Q1RHIFF1ZXLDqXRhcm8gIC0gIC0gMTJfOF8xNi56aXA=?=
                        )

                )

        )

These attachments are again a PDF and a ZIP, and in an email client they look the same as attachments in any other email. But as you can see above, instead of filenames like blahblah.zip and blahblah.pdf, each one shows something like "=?utf-8?B?Q1RHIFF1ZXLDqXRhcm8gIC0gIC0gMTJfOF8xNi56aXA=?=". Also, the subtype for both is 'OCTET-STREAM' instead of 'ZIP' or 'PDF', so I can't tell what type each attachment is and can't do anything with the email.

Any help would be greatly appreciated. To sum up, I'm just trying to figure out how to properly extract the attachment info from this particular set of emails that behave differently.

Upvotes: 1

Views: 1515

Answers (1)

HawkHogan

Reputation: 96

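Those are MIME-encoded filenames.

=?utf-8?B?

This prefix means the value is a UTF-8 string that has been Base64-encoded (the "B" stands for Base64).

Check out iconv_mime_decode() to decode them back into readable filenames.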
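As a rough sketch of how that could fit into the loop from the question: the helper below (decodeAttachmentName is a made-up name, not part of the IMAP extension) looks for the filename in dparameters, falls back to name in parameters, and runs the value through iconv_mime_decode(). The sample $part is a hand-built stand-in for one entry of $structure->parts, copied from the print_r output in the question, so the snippet runs without a mailbox. It assumes the iconv extension is enabled.

```php
<?php
// Hypothetical helper: returns the decoded attachment filename for one
// entry of $structure->parts, or null if the part carries no filename.
function decodeAttachmentName(object $part): ?string
{
    $raw = null;

    // Prefer the Content-Disposition "filename" parameter.
    if (!empty($part->ifdparameters)) {
        foreach ($part->dparameters as $param) {
            if (strtolower($param->attribute) === 'filename') {
                $raw = $param->value;
                break;
            }
        }
    }

    // Fall back to the Content-Type "name" parameter.
    if ($raw === null && !empty($part->ifparameters)) {
        foreach ($part->parameters as $param) {
            if (strtolower($param->attribute) === 'name') {
                $raw = $param->value;
                break;
            }
        }
    }

    if ($raw === null) {
        return null;
    }

    // Decode =?charset?B?...?= style values; plain names pass through unchanged.
    $decoded = iconv_mime_decode($raw, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, 'UTF-8');

    // If decoding fails outright, keep the raw value rather than losing it.
    return $decoded === false ? $raw : $decoded;
}

// Stand-in for one part from the question's print_r output.
$part = (object) [
    'ifdparameters' => 1,
    'dparameters'   => [
        (object) [
            'attribute' => 'filename',
            'value'     => '=?utf-8?B?Q1RHIFF1ZXLDqXRhcm8gIC0gIC0gMTJfOF8xNi5wZGY=?=',
        ],
    ],
    'ifparameters'  => 0,
];

$name = decodeAttachmentName($part);
$extension = strtolower(pathinfo($name, PATHINFO_EXTENSION));

echo $name, PHP_EOL;
echo $extension, PHP_EOL; // pdf
```

Since the subtype is OCTET-STREAM for these messages, the extension of the decoded filename is probably the most practical way to tell the PDF and ZIP attachments apart.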

Upvotes: 3
