clifgray

Reputation: 4419

Strange urllib2.urlopen() error with variable vs string

I am seeing some strange behavior when using urllib2 to open a URL and download a video.

I am trying to open a video resource and here is an example link:

https://zencoder-temp-storage-us-east-1.s3.amazonaws.com/o/20130723/b3ed92cc582885e27cb5c8d8b51b9956/b740dc57c2a44ea2dc2d940d93d772e2.mp4?AWSAccessKeyId=AKIAI456JQ76GBU7FECA&Signature=S3lvi9n9kHbarCw%2FUKOknfpkkkY%3D&Expires=1374639361

I have the following code:

        mp4_url = ''
        #response_body is a json response that I get the mp4_url from
        if response_body['outputs'][0]['label'] == 'mp4':
            mp4_url = response_body['outputs'][0]['url']

        if mp4_url:
            logging.info('this is the mp4_url')
            logging.info(mp4_url)

            #if I add the line directly below this then it works just fine
            mp4_url = 'https://zencoder-temp-storage-us-east-1.s3.amazonaws.com/o/20130723/b3ed92cc582885e27cb5c8d8b51b9956/b740dc57c2a44ea2dc2d940d93d772e2.mp4?AWSAccessKeyId=AKIAI456JQ76GBU7FECA&Signature=S3lvi9n9kHbarCw%2FUKOknfpkkkY%3D&Expires=1374639361'

            mp4_video = urllib2.urlopen(mp4_url)
            logging.info('successfully opened the url')

The code works when I add the designated line, but without it I get an HTTP Error 403: Forbidden, which makes me think something is mangling mp4_url. The confusing part is that the logged value of mp4_url is exactly what I hardcoded. What could the difference be? Are there characters in there that may be disrupting it? I have tried converting it to a string:

mp4_video = urllib2.urlopen(str(mp4_url))

But that didn't do anything. Any ideas?

UPDATE:

Following the suggestion to use print repr(mp4_url), I get:

u'https://zencoder-temp-storage-us-east-1.s3.amazonaws.com/o/20130723/b3ed92cc582885e27cb5c8d8b51b9956/b740dc57c2a44ea2dc2d940d93d772e2.mp4?AWSAccessKeyId=AKIAI456JQ76GBU7FECA&Signature=S3lvi9n9kHbarCw%2FUKOknfpkkkY%3D&Expires=1374639361'

I suppose the u prefix (a unicode string rather than a plain str) is the difference that is causing the error, but what would be the best way to convert it?
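
One way to compare the value from the JSON response with the hardcoded literal is to look at its type and at any whitespace, control, or non-ASCII characters it contains. This is a minimal sketch: `describe_url` is a hypothetical helper (not part of urllib2), and the URL in the usage lines is a placeholder rather than the real signed link.

```python
def describe_url(url):
    """Return the type name of url and a list of (index, repr(char)) pairs
    for every whitespace, control, or non-ASCII character it contains."""
    suspicious = [
        (index, repr(char))
        for index, char in enumerate(url)
        if ord(char) < 33 or ord(char) > 126
    ]
    return type(url).__name__, suspicious


# Placeholder URL with a stray trailing newline, for illustration only.
type_name, suspicious = describe_url(u'https://example.com/video.mp4?Signature=a%2Fb%3D\n')
print(type_name)
print(suspicious)
```

If the list comes back empty, the characters are not the problem. On Python 2, `mp4_url.encode('ascii')` turns the unicode value into a plain `str` and raises `UnicodeEncodeError` if a non-ASCII character slipped in.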

UPDATE II:

It turned out that I did need to cast it to a string, but also that the source I was getting the link from (an encoded video) needed nearly 60 seconds before it could serve that URL. That is why it kept working when I hardcoded it: by then the delay had already passed. Thanks for the help!
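
A rough sketch of how that delay could be handled with retries instead of a fixed wait. The helper name `open_when_ready`, the attempt count, and the delay are assumptions chosen for illustration. The `opener` and `sleep` parameters exist only so the function can be exercised without a network. The import fallback lets the same code run on Python 3, where urllib2 was split into `urllib.request` and `urllib.error`.

```python
import logging
import time

try:  # Python 2, as in the question
    from urllib2 import urlopen, HTTPError
except ImportError:  # Python 3 equivalent
    from urllib.request import urlopen
    from urllib.error import HTTPError


def open_when_ready(url, attempts=6, delay=15, opener=urlopen, sleep=time.sleep):
    """Open url, retrying after each HTTP 403 until attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            # str() converts the unicode value from the JSON response.
            return opener(str(url))
        except HTTPError as err:
            # Re-raise anything other than 403, and a 403 on the last attempt.
            if err.code != 403 or attempt == attempts:
                raise
            logging.info('attempt %d got 403, retrying in %s s', attempt, delay)
            sleep(delay)
```

Calling `mp4_video = open_when_ready(mp4_url)` would then wait up to roughly 75 seconds in total with these defaults (five sleeps of 15 seconds) before giving up.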

Upvotes: 0

Views: 392

Answers (1)

Prahalad Deshpande

Reputation: 4767

It would be better to simply dump the response you receive. That way you can check what response_body['outputs'][0]['label'] evaluates to. In your case, you are initializing mp4_url to ''. An empty string is falsy, just like None, so the if mp4_url: block is skipped entirely whenever the label check fails.

You may want to verify that the initial if statement, where you check response_body['outputs'][0]['label'], behaves as expected.
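
A minimal sketch of that check, assuming response_body is the already-parsed JSON dictionary from the question. `find_mp4_url` is a hypothetical helper. Searching every entry in 'outputs' rather than only the first one is an assumption that goes beyond the original code.

```python
import json
import logging


def find_mp4_url(response_body):
    """Return the URL of the first output labelled 'mp4', or None."""
    # Dump the whole response so the actual labels and URLs show up in the log.
    logging.info('response_body: %s', json.dumps(response_body, indent=2))
    for output in response_body.get('outputs', []):
        if output.get('label') == 'mp4':
            return output.get('url')
    return None
```

Returning None when nothing matches keeps the later `if mp4_url:` test meaningful, since None is falsy.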

Upvotes: 1
