Reputation: 51
I'm trying to loop through this BeerAdvocate page (https://www.beeradvocate.com/beer/styles/35/) in order to scrape the beer name, abv, rating and so on. However, I'm not sure how to build a loop to go over the whole page.
For example, this is how I'm trying to get the beer names:
import requests
from bs4 import BeautifulSoup
url = "https://www.beeradvocate.com/beer/styles/35/"
results = requests.get(url)
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
beer_name = []
beer_div = soup.find_all('div',id='ba-content')
for container in beer_div:
    # beer name
    name = container.find_all('a')[12].text
    beer_name.append(name)
print(beer_name)
Does anyone know what I'm doing wrong here?
Thank you!
Upvotes: 4
Views: 227
Reputation: 195438
To parse the HTML correctly, use the lxml or html5lib parser:
import requests
from bs4 import BeautifulSoup
url = 'https://www.beeradvocate.com/beer/styles/35/'
soup = BeautifulSoup(requests.get(url).content, 'lxml')
# Four columns are printed: name, brewery, ABV and number of ratings.
print('{:<60} {:<70} {:<10} {:<10}'.format('Name', 'Brewery', 'ABV', 'Ratings'))
print('-' * 150)
# Data rows are the <tr> elements that contain an element with class "hr_bottom_light".
for row in soup.select('tr:has(.hr_bottom_light)'):
    tds = [td.text for td in row.select('td')]
    print('{:<60} {:<70} {:<10} {:<10}'.format(*tds))
Prints:
Name Brewery ABV Ratings
------------------------------------------------------------------------------------------------------------------------------------------------------
Ayinger Celebrator Ayinger Privatbrauerei 6.70 6,843
Troegenator Tröegs Brewing Company 8.20 3,437
Spaten Optimator Spaten-Franziskaner-Bräu 7.60 3,218
Salvator Paulaner Brauerei 7.90 2,948
Korbinian Bayerische Staatsbrauerei Weihenstephan 7.40 2,883
Samichlaus Classic Bier Brauerei Schloss Eggenberg 14.00 2,182
Samuel Adams Double Bock (Imperial Series) Boston Beer Company (Samuel Adams) 9.50 1,947
Consecrator Bell's Brewery - Eccentric Café & General Store 8.00 1,534
Andechser Doppelbock Dunkel Klosterbrauerei Andechs 7.10 1,259
Birra Moretti La Rossa Birra Moretti (Heineken) 7.20 1,077
Perkulator Coffee Dopplebock Dark Horse Brewing Company 7.00 1,028
EKU 28 Kulmbacher Brauerei AG 11.00 742
Liberator Doppelbock Thomas Hooker Brewing Company 8.50 701
Augustiner Bräu Maximator Augustiner-Bräu 7.50 698
Smuttynose S'muttonator (Heritage Series) Smuttynose Brewing Company 8.20 653
Butthead Doppelbock Tommyknocker Brewery 8.20 606
Autumnal Fire Capital Brewery 7.80 604
Weltenburger Kloster Asam-Bock Klosterbrauerei Weltenburg 6.90 591
Wasatch The Devastator Double Bock Utah Brewers Cooperative 8.00 588
St. Victorious Victory Brewing Company - Downingtown 7.60 578
Urbock 23° Brauerei Schloss Eggenberg 9.60 460
Voodoovator Atwater Brewery 9.50 420
Saxonator Dunkles Doppelbock Jack's Abby Brewing 8.50 379
Doppel-Hirsch Der HirschBrau/Privatbrauerei Höss 7.20 379
Josephs Brau Winter Brew Trader Joe's Brewing Company 7.50 345
Duck-Rabbator The Duck-Rabbit Craft Brewery 8.50 344
Troegenator - Bourbon Barrel-Aged Tröegs Brewing Company 11.50 330
Ettaler Curator Dunkler Doppelbock (US Import Version) Klosterbrauerei Ettal / Ettaler Klosterbetriebe GmbH 9.00 314
Blonde Doppelbock Capital Brewery 7.80 295
Snow Blind Doppelbock Starr Hill Brewery 7.70 278
Doppelbock Dunkel Brauerei Schloss Eggenberg 8.50 264
Tucher Bajuvator Doppelbock Brauerei Tucher Brau 7.20 264
Dark Heathen Triple Bock Kuhnhenn Brewing Company 12.50 262
Winter Bock Gordon Biersch Brewery Restaurant 7.50 255
Deep Water Dopplebock Thomas Creek Brewery 7.00 232
Doppelbock Grande Cuvée Printemps Les Trois Mousquetaires 8.60 213
Lobotomy Bock Indian Wells Brewing Company 10.50 208
Sled Dog Dopplebock Wagner Valley Brewing Co. 8.50 201
Primátor Double Bock Beer Pivovar Náchod a.s. 10.50 189
Icelandic Doppelbock Einstök Ölgerð (Einstök Beer Company) 6.70 187
Dopple Bock Sprecher Brewing Company 7.85 186
St. Nikolaus Bock Bier - Brewer's Reserve Pennsylvania Brewing Company 8.50 185
Double Skull Epic Brewing Company 9.00 180
Emancipator Doppelbock Christian Moerlein Brewing Company 7.00 179
Winter-Bock Einbecker Brauhaus AG 7.50 169
Granitbock Brauerei Hofstetten Krammer GmbH & Co. KG 7.30 158
Henry's Farm Double Bock Two Roads Brewing Company 7.80 156
Double Vision Doppelbock Grand Teton Brewing Co. 8.00 156
Massacre Wolverine State Brewing Company 14.50 147
Fireman's Brew Brunette Beer Fireman's Brew, Inc. 8.00 143
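If you want to keep the values rather than print them, the same selector can fill a list of dictionaries. This is only a minimal sketch: SAMPLE_HTML is a hypothetical, simplified stand-in for the page's markup, and COLUMNS and extract_rows are names made up for the illustration. Because the sample is small and well-formed, the built-in 'html.parser' is enough here; for the live page use lxml as described above.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the page's table markup (an assumption).
SAMPLE_HTML = """
<table>
  <tr><td colspan="4">Header row without the marker class</td></tr>
  <tr>
    <td class="hr_bottom_light">Ayinger Celebrator</td>
    <td class="hr_bottom_light">Ayinger Privatbrauerei</td>
    <td class="hr_bottom_light">6.70</td>
    <td class="hr_bottom_light">6,843</td>
  </tr>
  <tr>
    <td class="hr_bottom_light">Troegenator</td>
    <td class="hr_bottom_light">Tröegs Brewing Company</td>
    <td class="hr_bottom_light">8.20</td>
    <td class="hr_bottom_light">3,437</td>
  </tr>
</table>
"""

# Assumed column order, taken from the printed output above.
COLUMNS = ("name", "brewery", "abv", "ratings")


def extract_rows(html):
    """Return one dict per table row that contains a .hr_bottom_light element."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("tr:has(.hr_bottom_light)"):
        cells = [td.get_text(strip=True) for td in tr.select("td")]
        # zip stops at the shorter sequence, so extra cells are ignored.
        rows.append(dict(zip(COLUMNS, cells)))
    return rows


beers = extract_rows(SAMPLE_HTML)
for beer in beers:
    print(beer)
```

A list of dictionaries like this is easy to pass on to csv.DictWriter or a DataFrame later.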
Upvotes: 1
Reputation: 5531
First identify the table, then find all the tr tags within it, then loop through the tr tags to collect the text of the first cell in each row.
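beer_table = soup.find('table')
# Skip the first three rows, which are not beer entries.
tr_tags = beer_table.find_all('tr')[3:]
for tr in tr_tags:
    # tr.td is the first <td> in the row, which holds the beer name.
    beer_name.append(tr.td.text)
# Drop the last entry, which comes from a trailing non-beer row.
beer_name = beer_name[:-1]
print(beer_name)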
Output:
['Ayinger Celebrator', 'Troegenator', 'Spaten Optimator', 'Salvator', 'Korbinian', 'Samichlaus Classic Bier', 'Samuel Adams Double Bock (Imperial Series)', 'Consecrator', 'Andechser Doppelbock Dunkel', 'Birra Moretti La Rossa', 'Perkulator Coffee Dopplebock', 'EKU 28', 'Liberator Doppelbock', 'Augustiner Bräu Maximator', "Smuttynose S'muttonator (Heritage Series)", 'Butthead Doppelbock', 'Autumnal Fire', 'Weltenburger Kloster Asam-Bock', 'Wasatch The Devastator Double Bock', 'St. Victorious', 'Urbock 23°', 'Voodoovator', 'Saxonator Dunkles Doppelbock', 'Doppel-Hirsch', 'Josephs Brau Winter Brew', 'Duck-Rabbator', 'Troegenator - Bourbon Barrel-Aged', 'Ettaler Curator Dunkler Doppelbock (US Import Version)', 'Blonde Doppelbock', 'Snow Blind Doppelbock', 'Doppelbock Dunkel', 'Tucher Bajuvator Doppelbock', 'Dark Heathen Triple Bock', 'Winter Bock', 'Deep Water Dopplebock', 'Doppelbock Grande Cuvée Printemps', 'Lobotomy Bock', 'Sled Dog Dopplebock', 'Primátor Double Bock Beer', 'Icelandic Doppelbock', 'Dopple Bock', "St. Nikolaus Bock Bier - Brewer's Reserve", 'Double Skull', 'Emancipator Doppelbock', 'Winter-Bock', 'Granitbock', "Henry's Farm Double Bock", 'Double Vision Doppelbock', 'Massacre', "Fireman's Brew Brunette Beer"]
Here is the full code:
import requests
from bs4 import BeautifulSoup
url = "https://www.beeradvocate.com/beer/styles/35/"
results = requests.get(url)
soup = BeautifulSoup(results.content, 'html.parser')
beer_name = []
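beer_table = soup.find('table')
# Skip the first three rows, which are not beer entries.
tr_tags = beer_table.find_all('tr')[3:]
for tr in tr_tags:
    # tr.td is the first <td> in the row, which holds the beer name.
    beer_name.append(tr.td.text)
# Drop the last entry, which comes from a trailing non-beer row.
beer_name = beer_name[:-1]
print(beer_name)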
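Since the question also asks for the ABV and ratings, the same loop can read every cell of a row (for example with [td.text for td in tr.find_all('td')]) instead of only tr.td. The sketch below shows one way to turn that cell text into typed values. It assumes the column order is name, brewery, ABV, ratings; parse_row is a hypothetical helper, not part of BeautifulSoup.

```python
def parse_row(cells):
    """Turn the text of one table row into a typed record.

    Assumes the first four cells are name, brewery, ABV and ratings,
    in that order; any further cells are ignored.
    """
    name, brewery, abv, ratings = cells[:4]
    return {
        "name": name.strip(),
        "brewery": brewery.strip(),
        "abv": float(abv),
        # The ratings count uses a thousands separator, e.g. "6,843".
        "ratings": int(ratings.replace(",", "")),
    }


# Cell text as it would come out of one row of the table.
sample = ["Ayinger Celebrator", "Ayinger Privatbrauerei", "6.70", "6,843"]
record = parse_row(sample)
print(record)
```

If a row can have an empty ABV or ratings cell, float() and int() will raise ValueError, so you may want to wrap the conversion in a try/except.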
Hope that this helps!
Upvotes: 4