How to grab the output from python subprocess

Question

I am running a Python script from the command line like this:

python myscript.py

This is my script

import subprocess

if item['image_urls']:
    for image_url in item['image_urls']:
        subprocess.call(['wget', '-nH', image_url, '-P', 'images/'])

When I run it, I see output like this on the screen:

HTTP request sent, awaiting response... 200 OK
Length: 4159 (4.1K) [image/png]

What I want is for nothing to be printed to the terminal.

Instead, I want to capture the output and extract the image extension from it, i.e. take png from [image/png], and rename the file to something.png.

Is this possible?

Chunliang Lyu · Accepted Answer

If all you want is to download a file, why not use urllib.urlretrieve from the Python standard library instead of wget? (This is the Python 2 name; on Python 3 it is urllib.request.urlretrieve.)

import os
import urllib
image_url = "https://www.google.com/images/srpr/logo3w.png"
image_filename = os.path.basename(image_url)
urllib.urlretrieve(image_url, image_filename)
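To get the something.png naming the question asks for without parsing wget's log, one option is to read the Content-Type header of the response yourself. Below is a minimal Python 3 sketch using only the standard library. The helper names build_filename and download_image, the "download" fallback stem, and the images directory default are my own assumptions for illustration. Note that a response without a Content-Type header would be treated as text/plain, so the subtype would come out as "plain".

```python
import os
import urllib.request
from urllib.parse import urlparse


def build_filename(url, subtype, default_stem="download"):
    """Combine the URL's basename (minus any extension) with the MIME subtype."""
    basename = os.path.basename(urlparse(url).path)
    stem = os.path.splitext(basename)[0] or default_stem
    return "{}.{}".format(stem, subtype)


def download_image(url, directory="images"):
    """Download url into directory, naming the file after its Content-Type."""
    os.makedirs(directory, exist_ok=True)
    with urllib.request.urlopen(url) as response:
        # For "Content-Type: image/png" this yields "png".
        subtype = response.headers.get_content_subtype()
        data = response.read()
    target = os.path.join(directory, build_filename(url, subtype))
    with open(target, "wb") as f:
        f.write(data)
    return target


# Pure naming step, no network access needed:
print(build_filename("https://www.google.com/images/srpr/logo3w.png", "png"))
# prints: logo3w.png
```

Calling download_image(image_url) inside the loop from the question would replace the wget call and print nothing to the terminal.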

EDIT: If the image URLs are redirected by a script, you can use the requests package, which follows redirects for you.

import os
import requests
r = requests.get(image_url)
# r.url is the final URL after any redirects
image_filename = os.path.basename(r.url)
with open(image_filename, 'wb') as f:
    f.write(r.content)

I haven't tested this code, since I could not find a suitable test case. One big advantage of requests is that it can also handle authentication.

EDIT2: If the image is served dynamically by a script, like a Gravatar image, you can usually find the filename in the Content-Disposition field of the response headers.

import urllib2  # Python 2 only
url = "http://www.gravatar.com/avatar/92fb4563ddc5ceeaa8b19b60a7a172f4"
req = urllib2.Request(url)
r = urllib2.urlopen(req)
# inspect the returned headers to see where the filename is located
print r.headers.dict
s = r.headers.getheader('content-disposition')
# extract the filename from between the double quotes
filename = s[s.index('"')+1:s.rindex('"')]
with open(filename, 'wb') as f:
    f.write(r.read())
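On Python 3 there is no urllib2, and slicing between quote characters fails when the header is missing or unquoted. As a rough sketch, the standard library's email.message.Message can parse the header value instead. The function name filename_from_content_disposition and the sample header value are hypothetical, made up for illustration, and the function also strips directory components from whatever name the server sends.

```python
import os
from email.message import Message


def filename_from_content_disposition(header_value):
    """Return a bare filename from a Content-Disposition value, or None."""
    if not header_value:
        return None
    msg = Message()
    msg["Content-Disposition"] = header_value
    name = msg.get_filename()
    # Drop directory components so the name cannot point outside the target dir.
    return os.path.basename(name) if name else None


# Hypothetical header value, not taken from a real response:
print(filename_from_content_disposition('inline; filename="avatar.png"'))
# prints: avatar.png
```

The headers object on a urllib.request.urlopen response is itself a Message subclass, so response.headers.get_filename() should give the same unsanitized name directly.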

EDIT3: As @Alex suggested in the comments, you may need to sanitize the filename taken from the returned header. I think taking just the basename is enough.

import os
# this will remove the dir path in the filename
# so that `../../../etc/passwd` will become `passwd`
filename = os.path.basename(filename)
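For completeness, if you do want to stay with wget and answer the question as literally asked, the output can be captured instead of printed. This is a minimal Python 3.7+ sketch, not something I have run against your URLs. It assumes wget is installed, that it writes its log to stderr (its default behaviour), and that the log is in English with the MIME type in square brackets as in your sample. The names extension_from_wget_log and fetch_quietly are hypothetical.

```python
import re
import subprocess

# Matches the MIME type that wget prints in brackets, e.g. "[image/png]".
MIME_PATTERN = re.compile(r"\[(\w+)/([\w.+-]+)\]")


def extension_from_wget_log(log_text):
    """Return the MIME subtype (e.g. 'png') from wget's log, or None."""
    matches = MIME_PATTERN.findall(log_text)
    # With redirects there can be several "Length:" lines; the last one
    # describes the file that was actually saved.
    return matches[-1][1] if matches else None


def fetch_quietly(image_url, directory="images/"):
    """Run wget without printing anything and return the detected subtype."""
    result = subprocess.run(
        ["wget", "-nH", image_url, "-P", directory],
        capture_output=True,  # keep stdout and stderr off the terminal
        text=True,            # decode the captured bytes to str
    )
    # wget writes its progress log to stderr, not stdout.
    return extension_from_wget_log(result.stderr)


sample_log = (
    "HTTP request sent, awaiting response... 200 OK\n"
    "Length: 4159 (4.1K) [image/png]\n"
)
print(extension_from_wget_log(sample_log))  # prints: png
```

Once you have the subtype, os.rename on the downloaded file gives you something.png. Parsing log text is fragile, though, which is why I would still read the headers directly as shown earlier.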
