Johnathan Sandifer

Reputation: 9

Web Scraping Prices with Python

I was following an online tutorial at https://www.youtube.com/watch?v=nCuPv3tf2Hg&list=PLRzwgpycm-Fio7EyivRKOBN4D3tfQ_rpu&index=1. I have no idea what I am doing wrong. I have tried running the code in both Visual Studio and Jupyter Notebook, to no avail.

Code:

import requests
from bs4 import BeautifulSoup as bs

bURL = 'https://www.thewhiskyexchange.com/c/540/taiwanese-whisky'

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}
r = requests.get('https://www.thewhiskyexchange.com/c/540/taiwanese-whisky')

soup = bs(r.content, 'lxml')

productlist = soup.find_all('div', class_='item')
productlinks = []

for item in productlist:
    for link in item.find_all('a', href=True):
        print(link['href'])

Upvotes: 1

Views: 267

Answers (1)

childnick

Reputation: 1411

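The structure of that website has changed since the video was posted, so the tutorial's selector (div elements with class item) no longer matches the product listings. Your code also defines headers but never passes them to requests.get().

I've fixed both in the code below: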

import requests
from bs4 import BeautifulSoup as bs

bURL = 'https://www.thewhiskyexchange.com/c/540/taiwanese-whisky'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}

# Pass the headers so the request carries a browser-like User-Agent
r = requests.get(bURL, headers=headers)

soup = bs(r.text, 'html.parser')

# Each product now sits in an <li class="product-grid__item"> element
for x in soup.find_all('li', {'class':'product-grid__item'}):
    link = x.find('a')
    # The hrefs are site-relative, so prepend the domain
    print(x.text, 'https://www.thewhiskyexchange.com'+link['href'])
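
If you want to collect the links into a list rather than just print them, here is a minimal sketch along the same lines. The function name extract_product_links and the sample HTML are made up for illustration and are not the site's real markup, and the sketch assumes the product-grid__item class is still what the site uses. It swaps string concatenation for urljoin, which leaves an href unchanged if it is already absolute.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = 'https://www.thewhiskyexchange.com'


def extract_product_links(html, base_url=BASE_URL):
    """Return absolute URLs for the first link in each product grid item."""
    soup = BeautifulSoup(html, 'html.parser')
    links = []
    for item in soup.find_all('li', class_='product-grid__item'):
        anchor = item.find('a', href=True)
        if anchor is None:
            continue  # skip grid items that contain no link
        links.append(urljoin(base_url, anchor['href']))
    return links


# Hypothetical markup for illustration only; the real page will differ.
SAMPLE_HTML = """
<ul>
  <li class="product-grid__item"><a href="/p/1/example-whisky">Example Whisky</a></li>
  <li class="product-grid__item"><span>No link here</span></li>
  <li class="other"><a href="/ignored">Ignored</a></li>
</ul>
"""

if __name__ == '__main__':
    for url in extract_product_links(SAMPLE_HTML):
        print(url)
```

To run it against the live page, pass in the response text from the request above: extract_product_links(r.text).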

Upvotes: 2
