Python Web Scraping Dynamic Content

Question

I've been trying to scrape kith.com search results but I get skeleton sample code. Tried to use scrapy, requests-html and selenium but I haven't managed to make them work.

Right now my code is:

from requests_html import HTMLSession

session = HTMLSession()
r = session.get("https://kith.com/pages/search-results-page?q=nike&tab=products&sort_by=created")

r.html.render()
print(r)

From what I've seen, render() should get the html code as it's seen in a browser but I still get the same "raw" code.

PD: kith.com is a shopify shop

Kasem Alsharaa · Accepted Answer

Selenium is suitable for a job like this

from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options = Options()
options.headless = True
driver = webdriver.Firefox(options=options)
driver.get('https://kith.com/pages/search-results-page?q=nike&tab=products&sort_by=created')


item_titles = driver.find_elements_by_class_name("snize-title")

print item_titles[0].text
#NIKE WMNS SHOX TL - NOVA WHITE / TEAM ORANGE / SPRUCE AURA

Edit:

If you want to capture all item info, the div elements with snize-overhidden class will be what you want to capture. Then you may iterate through them and their sub elements

Python Web Scraping Dynamic Content

Answers (1)

Related Questions