Max.Hen

Reputation: 1

How to pull the same nested data from a list of URLs using BeautifulSoup

Good Afternoon,

I'm relatively new to scraping and I'm currently stuck on this one project. The data to be pulled is the company name, address, phone number, and company URL (all pulled from the nested web page).

Main Page: http://www.therentalshow.com/find-exhibitors/sb-search/equipment/sb-inst/8678/sb-logid/242109-dcja1tszmylg308y/sb-page/1
Nested Page: http://www.therentalshow.com/exhibitor-detail/cid/45794/exhib/2019

I was able to compile this list of URLs, but I'm having the hardest time scraping each company's information and outputting it to a CSV in table format.

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
import requests
import pandas as pd
import csv, os

my_url = 'http://www.therentalshow.com/find-exhibitors/sb-search/equipment/sb-inst/8678/sb-logid/242109-dcja1tszmylg308y/sb-page/1'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, 'lxml')

#create list of urls from main page
urls = []
tags = page_soup.find_all('a',{'class':'avtsb_title'})
for tag in tags:
    urls.append('http://www.therentalshow.com' + tag.get('href'))

#iterate through each page to return company data
for url in urls:
    site = uReq(url)
    soups = soup(site, 'lxml')

    name = page_soup.select('h2')
    address = page_soup.find('span',{'id':'dnn_ctr8700_TRSExhibitorDetail_lblAddress'})
    city = page_soup.find('span',{'id':'dnn_ctr8700_TRSExhibitorDetail_lblCityStateZip'})
    phone = page_soup.find('span',{'id':'dnn_ctr8700_TRSExhibitorDetail_lblPhone'})
    website = page_soup.find('a',{'id':'dnn_ctr8700_TRSExhibitorDetail_hlURL'})

    os.getcwd()
    outputFile = open('output2.csv', 'a', newline='')
    outputWriter = csv.writer(outputFile)
    outputWriter.writerow([name, address, city, phone, website])

My returned output is

[],,,,
[],,,,

99 lines in total. My total list of links is 100.

I would like the names of the aforementioned variables as headers in my CSV file, but my current output is not what I'm looking for. I'm quite lost, so ANY help at all would be greatly appreciated. Thank you!

Upvotes: 0

Views: 215

Answers (1)

QHarr

Reputation: 84475

I can't fully test at present as requests is hanging, but you need to extract the .text of the returned elements. Also, your first selection (select('h2')) returns a list, so change it to select_one (or index into the list appropriately). I prefer CSS selectors over find.

I extracted the html from one page into an html variable:

page_soup = soup(html, 'lxml')  # 'soup' is the BeautifulSoup alias from the question's imports
name = page_soup.select_one('h2').text
address = page_soup.select_one('#dnn_ctr8700_TRSExhibitorDetail_lblAddress').text
city = page_soup.select_one('#dnn_ctr8700_TRSExhibitorDetail_lblCityStateZip').text
phone = page_soup.select_one('#dnn_ctr8700_TRSExhibitorDetail_lblPhone').text
website = page_soup.select_one('#dnn_ctr8700_TRSExhibitorDetail_hlURL').text
print([name, address, city, phone, website])

Copying the html from the first two links with the above changes yields:

['A-1 Scaffold Manufacturing', '590 Commerce Pkwy', 'Hays, KS', '785-621-5121', 'www.a1scaffoldmfg.com']
['Accella Tire Fill Systems', '2003 Curtain Pole Rd', 'Chattanooga, TN', '423-697-0400', 'www.accellatirefill.com']

Upvotes: 1
