Limit return size of GAE url_fetch get method?

Question

I'm trying to read the ID3 info from MP3 files stored online without downloading the whole file. After a lot of googling, the best method seems to be to grab the first couple of KB of the file and parse the tag from that. Is there a way in Google App Engine (Python) to fetch just the start of a file from its URL?

Something like

rpc = urlfetch.create_rpc(deadline=10.0)
rpc.size_limit = 4096  # hypothetical: the kind of limit I'm looking for
urlfetch.make_fetch_call(rpc, url, method=method, headers=headers,
                         payload=payload, allow_truncated=True)
return rpc

Thanks for any help in advance.

tomatosource · Accepted Answer

Found it! If the server honors the Range header, you can just request a byte range in the headers, as follows:

headers["Range"] = "bytes=0-4095"  # no spaces; the range is inclusive, so this is the first 4096 bytes
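
For anyone who wants this as a reusable helper, here is a minimal sketch of the Range approach. The function names (range_header, clip_body, fetch_prefix) are my own and not part of urlfetch; the sketch assumes the usual urlfetch.fetch call and the status_code and content fields on its result. A server that ignores Range answers 200 with the full body instead of 206, so the body is clipped locally in that case.

```python
def range_header(num_bytes):
    """Build a Range header asking for the first num_bytes bytes.

    Byte ranges are inclusive, so the first 4096 bytes are 0-4095.
    """
    if num_bytes <= 0:
        raise ValueError("num_bytes must be positive")
    return {"Range": "bytes=0-%d" % (num_bytes - 1)}


def clip_body(status_code, body, num_bytes):
    """Return at most num_bytes of the body.

    206 means the server honored the range; 200 means it ignored the
    header and sent the whole file, so the body is cut locally.
    """
    if status_code not in (200, 206):
        raise ValueError("unexpected status %d" % status_code)
    return body[:num_bytes]


def fetch_prefix(url, num_bytes=4096, deadline=10.0):
    """Fetch the start of url with urlfetch (App Engine runtime only)."""
    # Imported lazily because the module only exists on App Engine.
    from google.appengine.api import urlfetch
    result = urlfetch.fetch(url, headers=range_header(num_bytes),
                            deadline=deadline)
    return clip_body(result.status_code, result.content, num_bytes)
```

Note that when the server replies 200 the whole file is still downloaded before it is clipped, so this only saves bandwidth against servers that support ranges.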

Or you can use something like the following if the server doesn't honor the Range header (so far, the few I've tried have all accepted it):

import urllib2

url = 'http://www.wikipedia.org/somepath/tosome/file.mp3'
req = urllib2.Request(url, headers={'User-Agent': 'Magic Browser'})
# Read only the first 4 KB of the response body
data = urllib2.urlopen(req).read(4 * 1024)
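
A slightly more defensive version of the same idea, as a rough sketch: read(n) on some file-like objects can return fewer bytes than requested, so this loops until the limit or end of stream is reached. The helper names (read_prefix, open_and_read_prefix) are made up for illustration, and the import fallback is only there so the sketch also runs on Python 3.

```python
import io

try:
    from urllib2 import Request, urlopen          # Python 2
except ImportError:
    from urllib.request import Request, urlopen   # Python 3


def read_prefix(stream, limit):
    """Read up to limit bytes from a file-like object, stopping at EOF."""
    chunks = []
    remaining = limit
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # EOF before the limit was reached
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def open_and_read_prefix(url, limit=4 * 1024):
    """Open url and return only its first limit bytes."""
    req = Request(url, headers={"User-Agent": "Magic Browser"})
    response = urlopen(req)
    try:
        return read_prefix(response, limit)
    finally:
        response.close()


# Offline check with an in-memory stream standing in for the response.
sample = io.BytesIO(b"ID3" + b"\x00" * 8000)
prefix = read_prefix(sample, 4096)
```

Here prefix holds the first 4096 bytes of the sample stream; against a real URL you would call open_and_read_prefix(url) instead.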

Hopefully this saves someone some time in the future!
