Luca Brasi

Reputation: 711

File Stream - ValueError: embedded null byte

I'm trying to download a .png image over HTTP and upload it over HTTP to another location. I want to avoid saving the file to disk, so everything should be processed in memory.

I have the code below:

  1. Download the file and convert it into a byte array:
resp = requests.get(
    'http://www.personal.psu.edu/crd5112/photos/PNG%20Example.png',
    stream=True)

img = BytesIO(resp.content)
  2. Upload the file to a remote HTTP repository:
data=open(img.getvalue()).read()

r = requests.post(url=url, data=data, headers=headers, auth=HTTPBasicAuth('user', 'user'))

I'm getting a ValueError exception "embedded null byte" when reading the byte array.

If I save the file onto the disk and load it as below, then there is no error:

with open('file.png', 'wb') as pic:
  pic.write(img.getvalue())

Any advice on how I could achieve this without saving the file to disk?

Upvotes: 3

Views: 18933

Answers (3)

Shrout1

Reputation: 2607

I believe the embedded null byte error comes from a library function that expects a filename: when raw binary data is passed where a path is expected, the data is interpreted as a path, and a path cannot contain null bytes. Wrapping the data in a BytesIO object presents it to the library as if it were an open file.
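To illustrate the point, here is a minimal sketch using only the standard library. The `payload` bytes are a made-up stand-in for `resp.content` (the PNG signature followed by a few bytes containing nulls), and `try_open_as_path` is a hypothetical helper name, not something from the question.

```python
from io import BytesIO

# Stand-in for resp.content: the 8-byte PNG signature followed by
# a few more bytes that contain null bytes (an assumption for this demo).
payload = b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"


def try_open_as_path(raw):
    """Pass raw bytes to open() as if they were a path and report the error."""
    try:
        open(raw)
    except ValueError as exc:
        # On CPython this is typically "embedded null byte"
        return str(exc)
    return None


# open() treats its first argument as a file path, so this fails
error_message = try_open_as_path(payload)

# BytesIO wraps the same bytes in a file-like object that can be read
stream = BytesIO(payload)
first_bytes = stream.read(8)
```

So `open()` is only needed when the data lives in a file on disk. Once the bytes are already in memory, they can either be used directly or wrapped in `BytesIO` for libraries that want a file object.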

Here is sample code that I used when trying to address this same issue with a tar file. The same pattern should work with most other libraries that accept a file object as input.

The key was wrapping remote_file.content in a BytesIO object and passing it to tarfile.open as the fileobj argument. Other techniques I attempted did not work.

from io import BytesIO
import requests
import tarfile

remote_file = requests.get('https://download.site.com/files/file.tar.gz')

# Open the tarball directly from memory (no temporary file on disk)
tar = tarfile.open(fileobj=BytesIO(remote_file.content))
# Optionally print all folders / files within the tarball
print(tar.getnames())
# Extract the contents to a target directory
tar.extractall('/home/users/Documents/target_directory/')

This eliminated the "ValueError: embedded null byte" and "expected str, bytes or os.PathLike object, not _io.BytesIO" errors that I was experiencing with other methods.
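If the extracted files should stay in memory as well, a sketch along these lines might help. It builds a small tarball in memory so the example runs without a network connection; `make_tarball`, `list_members` and `read_member` are hypothetical helper names introduced only for this illustration.

```python
import io
import tarfile


def make_tarball(files):
    """Build a gzip-compressed tarball in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_members(tar_bytes):
    """Return the member names of a tarball held in memory."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        return tar.getnames()


def read_member(tar_bytes, name):
    """Read one member of an in-memory tarball without writing to disk."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        return tar.extractfile(name).read()


# Stand-in for remote_file.content
tar_bytes = make_tarball({"hello.txt": b"hello from memory"})
names = list_members(tar_bytes)
content = read_member(tar_bytes, "hello.txt")
```

Here `extractfile` returns a file-like object for a single member, so nothing is written to the filesystem.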

Upvotes: 6

AmilaMGunawardana

Reputation: 1830

Yes, you can do this without saving to disk. First, the error occurs on this line:

data=open(img.getvalue()).read()

The built-in open() expects a file path, not the file's contents, so passing the raw image bytes to it raises this error. Use the Pillow library to work with image data instead:

from io import BytesIO
from PIL import Image

img = BytesIO(resp.content)
# data = open(img).read()  # replaced: open() expects a path
data = Image.open(img)

This gives you an object of the following type:

<class 'PIL.PngImagePlugin.PngImageFile'>

You can then use this data object for the upload, although it has to be serialized back to bytes before it is passed as data in the POST request.

Upvotes: 3

Luca Brasi

Reputation: 711

@AmilaMGunawardana Thanks for the pointer.

I just had to save the image into a separate byte stream to get it uploaded properly:

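from io import BytesIO

import requests
from PIL import Image
from requests.auth import HTTPBasicAuth

img = BytesIO(resp.content)
data = Image.open(img, 'r')

# Re-encode the image as PNG into a fresh in-memory buffer
buf = BytesIO()
data.save(buf, 'PNG')

r = requests.post(url=url, data=buf.getvalue(), headers=headers, auth=HTTPBasicAuth('user', 'user'))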
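If the image does not need to be modified, re-encoding it through Pillow may be unnecessary, since the downloaded bytes can be used as the request body as they are. Below is a minimal sketch under some assumptions: the `payload` bytes stand in for `resp.content`, the URL and the `Content-Type` header are placeholders, and `build_upload` is a hypothetical helper. It only prepares the request and does not send it, so it runs without a network connection.

```python
import requests
from requests.auth import HTTPBasicAuth


def build_upload(url, payload, user, password):
    """Prepare (but do not send) a POST whose body is the raw image bytes."""
    request = requests.Request(
        "POST",
        url,
        data=payload,
        headers={"Content-Type": "image/png"},  # assumed header
        auth=HTTPBasicAuth(user, password),
    )
    return request.prepare()


# Stand-in for resp.content (assumption for this demo)
payload = b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"

# Placeholder URL, not the real upload endpoint
prepared = build_upload("http://example.com/upload", payload, "user", "user")

# To actually send it: requests.Session().send(prepared)
```

In terms of the code from the question, this would correspond to passing `data=resp.content` (or `data=img.getvalue()`) to `requests.post` instead of calling `open()` on the bytes.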

Upvotes: 1
