Ahmet Cetin

Reputation: 3823

PIL - saving file in memory?

I want to open the image files in a folder and convert them to JPEG if they are not already JPEG. The catch is that I need to save the result in memory, not to a file. The reason is that I am actually reading the images from a TFRecord file (the TensorFlow data file format): I extract each image, check its format, convert it to JPEG if it is not already JPEG, and then write it back to the TFRecord file after decoding it properly, because the TensorFlow Object Detection API unfortunately doesn't accept any image format other than JPEG. Anyway, that is just the explanation of why I need it.

To be able to do that, I need to keep the file in memory. So here is my code:

for counter, filename_with_path in enumerate(filenames):
    e = next(iter(tf.data.TFRecordDataset([filename_with_path])))
    example = tf.train.Example()
    example.ParseFromString(e.numpy())
    parsed = example.features.feature
    image_raw = parsed['image/encoded'].bytes_list.value[0]

    # After this point is important
    stream = BytesIO(image_raw)
    image = Image.open(stream) # Image is pillow image
    stream.close()

    if image.format != 'JPEG':
        tempFile = BytesIO()
        image.convert('RGB')
        image.save(tempFile, format="JPEG")

        newStream = BytesIO(tempFile)
        img = Image.open(newStream)
        newStream.close()

        print(filename, image.format)
        print(filename, img.format)

When I run this, I get ValueError: I/O operation on closed file. on the line

image.save(tempFile, format="JPEG")

Any idea why this gives an error? I saw this suggested as the way to write an in-memory file: How to write PNG image to string with the PIL?

Upvotes: 1

Views: 4886

Answers (1)

Ronald

Reputation: 1039

The error comes from stream, not from tempFile. Image.open is a lazy API, which lets it handle large images more efficiently: the pixel data is not read from stream until the image is actually processed. So you should not call stream.close() until you are done with image.

for counter, filename_with_path in enumerate(filenames):
    ...

    stream = BytesIO(image_raw)
    image = Image.open(stream) # Image is pillow image
    # remove this line:
    # stream.close()

    if image.format != 'JPEG':
        tempFile = BytesIO()
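        # convert() returns a new image instead of modifying image, so keep the result
        rgb_image = image.convert('RGB')
        rgb_image.save(tempFile, format="JPEG")

        # this wants bytes, not another BytesIO object; getvalue() returns the
        # whole buffer (a plain read() here would start at the end and be empty)
        newStream = BytesIO(tempFile.getvalue())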
        img = Image.open(newStream)
        # same thing, don't close until you are done with img
        # newStream.close()

        print(filename_with_path, image.format)
        print(filename_with_path, img.format)
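
One caveat when reading a BytesIO back after writing to it: the position is left at the end of the data, so a plain read() returns nothing unless you rewind first, while getvalue() ignores the position. A minimal sketch using only the standard library; the payload bytes are made up and merely stand in for what image.save() would write:

```python
from io import BytesIO

buf = BytesIO()
buf.write(b"fake image bytes")  # stands in for image.save(buf, format="JPEG")

at_end = buf.read()      # position is at the end of the buffer, so this is empty
whole = buf.getvalue()   # entire contents, independent of the current position

buf.seek(0)              # alternatively, rewind first and then read
rewound = buf.read()

print(at_end, whole, rewound)
```

Either getvalue() or seek(0) followed by read() gives you the bytes to wrap in a new BytesIO.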

From Pillow's Image.open docs:

This is a lazy operation; this function identifies the file, but the file remains open and the actual image data is not read from the file until you try to process the data (or call the load() method).
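
Given that, one way to still close the stream early is to force the pixel data to be read while the stream is open. The following is only a sketch, assuming Pillow is installed; to_jpeg_bytes is a hypothetical helper name, not something from the question or from Pillow:

```python
from io import BytesIO

from PIL import Image


def to_jpeg_bytes(image_raw: bytes) -> bytes:
    """Return image_raw re-encoded as JPEG, or unchanged if it already is JPEG."""
    with BytesIO(image_raw) as stream:
        image = Image.open(stream)  # lazy: only identifies the file
        if image.format == "JPEG":
            return image_raw
        # convert() loads the pixel data (the stream is still open here) and
        # returns a new in-memory image; RGB because JPEG has no alpha channel.
        rgb_image = image.convert("RGB")

    # The input stream is closed now, but rgb_image no longer depends on it.
    out = BytesIO()
    rgb_image.save(out, format="JPEG")
    return out.getvalue()
```

In the loop from the question, this would be called as to_jpeg_bytes(image_raw), and the returned bytes are what would go back into the image/encoded feature.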

Upvotes: 4
