Mounarajan

Reputation: 1437

Wget ends at ampersand (&) and skips everything after that

Wget skips everything after the ampersand (&) in the URL. I tried escaping the &, but that does not work.

Code:

import threading
import urllib.request
import os
import re
import time
import json
import sys

def take():
    a = ["https://itunes.apple.com/us/genre/ios-games-action/id7001?mt=8&letter=A","https://itunes.apple.com/us/genre/ios-games-action/id7001?mt=8&letter=B"]
    for url_file in a:
        url_file = re.sub(r'\&','\&',url_file)
        data = os.popen('wget -qO- %s'% url_file).read()
        if re.search(r'(?mis)paginate\-more\">next',data):
            print ("hi")

take()

This should print "hi"

But because Wget skips everything after the &, the output is blank.

How could I make this work?

Upvotes: 0

Views: 1288

Answers (2)

soumen

Reputation: 66

Your code is working for me as it is. I am using Python 2.6.x on Linux.

The output is

hi
hi

I see that you have escaped '&' in your source.

Upvotes: 0

umläute

Reputation: 31334

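The problem you are facing is that & has a special meaning in the shell (and os.popen runs your command through a shell): it puts the command on its left-hand side into the background. With an unquoted, unescaped URL, the shell therefore runs wget with the URL cut off at the &, and treats the remainder (letter=A) as a separate command, in this case a variable assignment.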

To circumvent this, you have to escape the special characters, or use quotes around the URL:

 data = os.popen('wget -qO- "%s"' % url_file).read()
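
If you would rather not depend on shell quoting at all, here is a minimal sketch of two alternatives, assuming Python 3.7+ and that wget is on your PATH. The helper names (wget_argv, wget_shell_command, fetch) are made up for illustration. The first passes the arguments as a list to subprocess.run, so no shell is involved and & is just an ordinary character. The second keeps a shell command string but lets shlex.quote do the quoting.

```python
import shlex
import subprocess

URL = "https://itunes.apple.com/us/genre/ios-games-action/id7001?mt=8&letter=A"


def wget_argv(url):
    """Argument list for wget. No shell parses it, so '&' needs no escaping."""
    return ["wget", "-qO-", url]


def wget_shell_command(url):
    """Command string meant for a shell, with the URL quoted by shlex.quote."""
    return "wget -qO- " + shlex.quote(url)


def fetch(url):
    """Run wget without a shell and return the page body as text.

    Requires wget to be installed and network access.
    """
    result = subprocess.run(
        wget_argv(url), capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Show the quoted command a shell would receive; no download happens here.
    print(wget_shell_command(URL))
```

In your loop you would then call fetch(url_file) in place of the os.popen(...) line and drop the re.sub escaping step, since the URL is passed through unchanged.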

Upvotes: 1
