Jack Brand

Reputation: 11

Python: Simple Web Crawler using BeautifulSoup4

I have been following TheNewBoston's Python 3.4 tutorials that use PyCharm, and I am currently on the tutorial on how to create a web crawler. I simply want to download all of XKCD's comics. Using the archive, that seemed very easy. Here is my code, followed by TheNewBoston's. Whenever I run the code, nothing happens; it runs through and says "Process finished with exit code 0". Where did I screw up?
TheNewBoston's tutorial is a little dated, and the website used for the crawl has changed domains. I will note the part of the video that seems to matter.

My code:

import requests
from urllib import request
from bs4 import BeautifulSoup

def download_img(image_url, page):
    name = str(page) + ".jpg"
    request.urlretrieve(image_url, name)


def xkcd_spirder(max_pages):
    page = 1
    while page <= max_pages:
        url = r'http://xkcd.com/' + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text, "html.parser")
        for link in soup.findAll('div', {'img': 'src'}):
            href = link.get('href')
            print(href)
            download_img(href, page)
        page += 1

xkcd_spirder(5)

Upvotes: 1

Views: 889

Answers (1)

Padraic Cunningham

Reputation: 180401

Your soup.findAll('div', {'img': 'src'}) looks for div tags that have an attribute named img with the value "src"; no such tags exist, so the loop body never runs and the script exits without doing anything. The comic is in the div with the id comic. You just need to pull the src from the img inside that div, join it to the base url, and finally request the content and write it out; I use the basename as the name to save the file under.
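
As a quick illustration of that selector, this sketch shows what select_one("#comic img") matches; the markup here is a simplified stand-in for an xkcd comic page, not the live HTML:

from bs4 import BeautifulSoup

# simplified stand-in for the markup of a comic page (assumed structure)
html = '<div id="comic"><img src="//imgs.xkcd.com/comics/barrel_cropped_(1).jpg"/></div>'
soup = BeautifulSoup(html, "html.parser")
img = soup.select_one("#comic img")  # the img inside the div with id comic
print(img["src"])  # //imgs.xkcd.com/comics/barrel_cropped_(1).jpg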

I also replaced your while loop with a range loop and did all the HTTP requests using requests alone:

import requests
from bs4 import BeautifulSoup
from os import path
from urllib.parse import urljoin # python2 -> from urlparse import urljoin 


def download_img(image_url, base):
    # path.basename(image_url) turns e.g.
    # http://imgs.xkcd.com/comics/tree_cropped_(1).jpg into tree_cropped_(1).jpg
    with open(path.basename(image_url), "wb") as f:
        # image_url is a relative path, so we have to join it to the base url
        f.write(requests.get(urljoin(base, image_url)).content)


def xkcd_spirder(max_pages):
    base = "http://xkcd.com/"
    for page in range(1, max_pages + 1):
        url = base + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text, "html.parser")
        # we only want one image
        img = soup.select_one("#comic img")  # or soup.find('div', id='comic').img
        download_img(img["src"], base)

xkcd_spirder(5)

Once you run the code you will see we get the first five comics.
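
Note that a few comics are interactive and may not have a plain img inside the comic div, in which case select_one returns None and img["src"] would raise a TypeError. A minimal guard, if you want one, might look like:

        img = soup.select_one("#comic img")
        if img is not None:  # skip pages without a plain <img>, e.g. interactive comics
            download_img(img["src"], base)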

Upvotes: 1
