Reputation: 25
I have an issue with a category URL such as the T-shirts page (https://www.alphabroder.com/category/t-shirts).
I am trying to scrape the product links off the pages. I have been trying for some time now, nothing is working yet. This is my current attempt after some Googling and reading the Playwright docs:
Website html:
<select id="prodPerPageSelTop">
<option value="24">
<option value="48">
<option value="72">
<option value="96">
<option value="All">
</select>
import time

from bs4 import BeautifulSoup
from playwright.sync_api import sync_playwright


def playwright_get_soup(url, wait_after_page_load=None):
    with sync_playwright() as this_playwright:
        browser = this_playwright.chromium.launch()
        page = browser.new_page()
        start = time.perf_counter()
        page.goto(url)
        try:
            page.wait_for_load_state("load")
            if wait_after_page_load:
                time.sleep(wait_after_page_load)
            products_on_page = page.querySelector('#prodPerPageSelTop').innerText()
            page.waitForFunction("document.querySelector('#prodPerPageSelTop').innerText() !== '" + products_on_page + "'")
            # attempt 1
            page.click('#prodperpageselect').select_option('96')
            # attempt 2
            # products_on_page = page.querySelector('#prodperpageselect ').innerText()
            # page.waitForFunction("document.querySelector('#prodPerPageSelTop').innerText() !== '" + products_on_page + "'")
            # attempt 3
            # new_selector = 'id=prodPerPageSelTop'
            # page.waitForSelector(new_selector)
            # handle = page.querySelector(new_selector)
            # handle.selectOption({"value": "96"})
            # attempt 4
            # page.select_option('select#prodperpageselect', value='96')
            time.sleep(15)
            # try to wait
            page.wait_for_selector('select#prodperpageselect option[value="96"]')
        except:
            # any exception raised in the block above is silently ignored here
            pass
        soup = BeautifulSoup(page.content(), "html.parser")
        browser.close()
        return soup


soup = playwright_get_soup("https://www.alphabroder.com/category/t-shirts")
def get_links(page_soup):
    # collect the href of the first <a> inside every product thumbnail
    these_links = []
    all_product_thumbnails = page_soup.find_all("div", class_="thumbnail")
    for thumbnail in all_product_thumbnails:
        a_tag = thumbnail.find("a")
        link = a_tag["href"]
        these_links.append(link)
    return these_links


page_links = get_links(soup)
assert len(page_links) == 96
As the page loads, it starts on 24 items, continues loading for 4-5 seconds, then flickers, and the selected option changes from, say, 24 items to 96 items.
I was expecting wait_for_selector to work. I also wait 15 seconds after the page loads, yet it still returns 24 items, not 96.
So far, I've also tried clicking the select option tag 4 different ways myself, and nothing has worked yet.
I did review similar questions that use Playwright. I'm trying to be more respectful on this site than I was when I was younger.
Any help appreciated, thank you
Upvotes: 2
Views: 69
Reputation: 25196
Even if your focus is on getting the information with Playwright, I would just like to point out that scraping it can also be implemented quite simply using requests and the endpoint through which the information is loaded:
import requests

page_num = 1
data = []

while True:
    json_data = requests.get(f'https://www.alphabroder.com/cgi-bin/livewamus/wam_tmpl/catalog_browse.p?action=getProduct&content=json&page=catalog_browse&startpath=1017&getNumProd=true&sort=pl&sortdir=asc&pageNum={page_num}&prodPerPage=96&site=ABLive&layout=Responsive&nocache=62059').json()
    data.extend(json_data.get('browseProd'))
    # keep requesting pages until the reported page total is reached
    if page_num < json_data.get('paging')[0].get('pgTotal'):
        page_num += 1
    else:
        break

data
[{'productID': 'G500', 'colorCode': '93', 'description': 'Gildan Adult Heavy Cotton\x99 T-Shirt', 'division': 'AB', 'prodCat': '130', 'mill': '07', 'prodImg': '<noscript><img src=\'https://www.alphabroder.com//prodimg/small/g500_93_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_93_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' onerror=\'$.wam.imgError(this,"small")\'>', 'prodURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html', 'regPriceDisp': '$0.00', 'onSale': True, 'salePrice': 2.32512, 'salePriceDisp': '$2.33', 'colorCount': 75, 'colorCountLabel': 'Colors', 'sizeCount': 8, 'primePlus': 'primeplus', 'primePlusLogo': True, 'sizeLabel': ' S - 5XL', 'sizeLabelDesc': 'Sizes:', 'msrpPriceDesp': 'Starting At: Pricing upon request', 'salesRank': 1, 'primePlusHTML': "<img src='/img/primeplus_logo.png' alt='Prime Plus Logo' title='Prime Plus' border='0' height='24' class='primelogo'>", 'showPriceHTML': "<span class='browseSalePrice'> $2.33</span>", 'gaMktgMill': 'Gildan', 'gaMktgCategory': 'T-Shirts', 'gaCurrency': 'USD', 'gaList': 'Results from Search List', 'sustainLogo': True, 'sustainLogoHTML': "<img src='/img/leaf_logo.png' alt='Sustain Logo' title='Sustain' border='0' height='20' class='sustainlogo'>", 'colorURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=93', 'colorSwatch': [{'productID': 'G500', 'colorCode': '00', 'colorXref': 'White ', 'description': 'WHITE', 'hexColor': 'FFFFFF', 'image': '<noscript><img src=\'https://www.alphabroder.com//prodimg/small/g500_00_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_00_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' onerror=\'$.wam.imgError(this,"small")\'>', 'imageURL': 
'https://www.alphabroder.com//prodimg/small/g500_00_g.jpg', 'sortOrder': 1, 'productURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=00', 'mainProdURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html'}, {'productID': 'G500', 'colorCode': '01', 'colorXref': 'Pink', 'description': 'AZALEA', 'hexColor': 'FF76A0', 'image': '<noscript><img src=\'https://www.alphabroder.com//prodimg/small/g500_01_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_01_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' onerror=\'$.wam.imgError(this,"small")\'>', 'imageURL': 'https://www.alphabroder.com//prodimg/small/g500_01_g.jpg', 'sortOrder': 2, 'productURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=01', 'mainProdURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html'}, {'productID': 'G500', 'colorCode': '05', 'colorXref': 'Yellow', 'description': 'YELLOW HAZE', 'hexColor': 'EEE8A0', 'image': '<noscript><img src=\'https://www.alphabroder.com//prodimg/small/g500_05_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_05_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' onerror=\'$.wam.imgError(this,"small")\'>', 'imageURL': 'https://www.alphabroder.com//prodimg/small/g500_05_g.jpg', 'sortOrder': 3, 'productURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=05', 'mainProdURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html'}, {'productID': 'G500', 'colorCode': '08', 'colorXref': 'Light Blue', 'description': 'INDIGO BLUE', 'hexColor': '34657f', 'image': '<noscript><img src=\'https://www.alphabroder.com//prodimg/small/g500_08_g.jpg\' 
alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_08_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' onerror=\'$.wam.imgError(this,"small")\'>', 'imageURL': 'https://www.alphabroder.com//prodimg/small/g500_08_g.jpg', 'sortOrder': 4, 'productURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=08', 'mainProdURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html'}, {'productID': 'G500', 'colorCode': '11', 'colorXref': 'Pink', 'description': 'LIGHT PINK', 'hexColor': 'FFE4E4', 'image': '<noscript><img src=\'https://www.alphabroder.com//prodimg/small/g500_11_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_11_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' onerror=\'$.wam.imgError(this,"small")\'>', 'imageURL': 'https://www.alphabroder.com//prodimg/small/g500_11_g.jpg', 'sortOrder': 5, 'productURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=11', 'mainProdURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html'}, {'productID': 'G500', 'colorCode': '12', 'colorXref': 'Orange', 'description': 'TANGERINE', 'hexColor': 'FF8A3D', 'image': '<noscript><img src=\'https://www.alphabroder.com//prodimg/small/g500_12_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_12_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' onerror=\'$.wam.imgError(this,"small")\'>', 'imageURL': 'https://www.alphabroder.com//prodimg/small/g500_12_g.jpg', 'sortOrder': 6, 'productURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=12', 'mainProdURL': 
'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html'}, {'productID': 'G500', 'colorCode': '18', 'colorXref': 'Tan', 'description': 'SAND', 'hexColor': 'c5b9ac', 'image': '<noscript><img src=\'https://www.alphabroder.com//prodimg/small/g500_18_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_18_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' onerror=\'$.wam.imgError(this,"small")\'>', 'imageURL': 'https://www.alphabroder.com//prodimg/small/g500_18_g.jpg', 'sortOrder': 7, 'productURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=18', 'mainProdURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html'}, {'productID': 'G500', 'colorCode': '20', 'colorXref': 'Tan', 'description': 'NATURAL', 'hexColor': 'F3E4C4', 'image': '<noscript><img src=\'https://www.alphabroder.com//prodimg/small/g500_20_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_20_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' onerror=\'$.wam.imgError(this,"small")\'>', 'imageURL': 'https://www.alphabroder.com//prodimg/small/g500_20_g.jpg', 'sortOrder': 8, 'productURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=20', 'mainProdURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html'}, {'productID': 'G500', 'colorCode': '21', 'colorXref': 'Yellow', 'description': 'DAISY', 'hexColor': 'F9F46F', 'image': '<noscript><img src=\'https://www.alphabroder.com//prodimg/small/g500_21_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_21_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' 
onerror=\'$.wam.imgError(this,"small")\'>', 'imageURL': 'https://www.alphabroder.com//prodimg/small/g500_21_g.jpg', 'sortOrder': 9, 'productURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=21', 'mainProdURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html'}, {'productID': 'G500', 'colorCode': '25', 'colorXref': 'Orange', 'description': 'TEXAS ORANGE', 'hexColor': 'af5c37', 'image': '<noscript><img src=\'https://www.alphabroder.com//prodimg/small/g500_25_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_25_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' onerror=\'$.wam.imgError(this,"small")\'>', 'imageURL': 'https://www.alphabroder.com//prodimg/small/g500_25_g.jpg', 'sortOrder': 10, 'productURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=25', 'mainProdURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html'}, {'productID': 'G500', 'colorCode': '27', 'colorXref': 'Red', 'description': 'GARNET', 'hexColor': '8B0000', 'image': '<noscript><img src=\'https://www.alphabroder.com//prodimg/small/g500_27_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_27_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' onerror=\'$.wam.imgError(this,"small")\'>', 'imageURL': 'https://www.alphabroder.com//prodimg/small/g500_27_g.jpg', 'sortOrder': 11, 'productURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=27', 'mainProdURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html'}, {'productID': 'G500', 'colorCode': '29', 'colorXref': 'Yellow', 'description': 'OLD GOLD', 'hexColor': 'e0b06e', 'image': '<noscript><img 
src=\'https://www.alphabroder.com//prodimg/small/g500_29_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_29_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' onerror=\'$.wam.imgError(this,"small")\'>', 'imageURL': 'https://www.alphabroder.com//prodimg/small/g500_29_g.jpg', 'sortOrder': 12, 'productURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=29', 'mainProdURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html'}, {'productID': 'G500', 'colorCode': '30', 'colorXref': 'Pink', 'description': 'HELICONIA', 'hexColor': 'FF00FF', 'image': '<noscript><img src=\'https://www.alphabroder.com//prodimg/small/g500_30_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_30_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' onerror=\'$.wam.imgError(this,"small")\'>', 'imageURL': 'https://www.alphabroder.com//prodimg/small/g500_30_g.jpg', 'sortOrder': 13, 'productURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=30', 'mainProdURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html'}, {'productID': 'G500', 'colorCode': '31', 'colorXref': 'Orange', 'description': 'TENNESSEE ORANGE', 'hexColor': 'EB9501', 'image': '<noscript><img src=\'https://www.alphabroder.com//prodimg/small/g500_31_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\'></noscript><img src=\'/img//lazy.png\' data-lazyload data-src=\'https://www.alphabroder.com//prodimg/small/g500_31_g.jpg\' alt=\'Gildan Adult Heavy Cotton\x99 T-Shirt\' onerror=\'$.wam.imgError(this,"small")\'>', 'imageURL': 'https://www.alphabroder.com//prodimg/small/g500_31_g.jpg', 'showMoreColors': True, 'sortOrder': 14, 'productURL': 
'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html?color=31', 'mainProdURL': 'https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html'}]},...]
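Since the question asks for product links, the records collected above could be reduced to their URLs. The following is only a minimal sketch: it assumes each record carries a 'prodURL' key, as in the output shown above, and the helper name extract_product_urls and the second sample record are hypothetical, made up for illustration.

```python
def extract_product_urls(records):
    """Return the unique 'prodURL' values from the records, in first-seen order."""
    seen = set()
    urls = []
    for record in records:
        url = record.get("prodURL")  # key name taken from the JSON output above
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Small stand-in for the `data` list built by the loop above.
g500_url = "https://www.alphabroder.com/product/g500/gildan-adult-heavy-cotton-t-shirt.html"
hypothetical_url = "https://example.com/product/hypothetical.html"  # made-up record
sample_records = [
    {"productID": "G500", "prodURL": g500_url},
    {"productID": "G500", "prodURL": g500_url},   # duplicate is dropped
    {"productID": "NOURL"},                       # record without a URL is skipped
    {"productID": "HYPO", "prodURL": hypothetical_url},
]

print(extract_product_urls(sample_records))
```

With the real response, calling extract_product_urls(data) should give one link per product, provided the endpoint keeps returning the same field names.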
Upvotes: 3
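If staying with Playwright is preferred, one possible direction is to use the snake_case method names of the Python API, select the option on the id that appears in the question's HTML (#prodPerPageSelTop), and then wait until enough product thumbnails are in the DOM. This is an untested sketch under several assumptions: that the option value "96" and the div.thumbnail class from the question are correct, that changing the select triggers the reload, and that the category holds at least 96 products. The helper names are hypothetical.

```python
import json


def thumbnail_count_expression(minimum, selector="div.thumbnail"):
    """Build a JavaScript predicate that is true once `minimum` elements match."""
    # json.dumps yields a safely quoted JavaScript string literal
    return f"document.querySelectorAll({json.dumps(selector)}).length >= {minimum}"


def fetch_category_html(url, per_page="96"):
    """Load the page, request `per_page` products, and return the rendered HTML."""
    # Imported here so the pure helper above can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as this_playwright:
        browser = this_playwright.chromium.launch()
        try:
            page = browser.new_page()
            page.goto(url)
            # Assumption: the id and option value come from the question's markup.
            page.select_option("#prodPerPageSelTop", per_page)
            # Raises a timeout error (30 s by default) if the count is never reached.
            page.wait_for_function(thumbnail_count_expression(int(per_page)))
            return page.content()
        finally:
            browser.close()


if __name__ == "__main__":
    html = fetch_category_html("https://www.alphabroder.com/category/t-shirts")
    print(len(html))
```

Letting the timeout error propagate, rather than wrapping everything in a bare except, makes it visible when the selector or the wait condition is wrong.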