dStucky

Reputation: 78

Python Web Scraper - limited results per page defined by page JavaScript

I'm having trouble getting the full results of searches on this website: https://www.gasbuddy.com/home?search=67401&fuel=1 (that link is one of the searches I'm having trouble with). The problem is that it only displays the first 10 results. I know that's a common issue that has been described in multiple Stack Overflow threads, but the solutions found elsewhere haven't worked here. The page's HTML seems to be generated by a JavaScript function, which doesn't embed all of the results into the page. I've tried using a function to access the link provided in the "More [...] Gas Prices" button, but that doesn't yield the full results either. Is there a way to access this full list, or am I out of luck?

Here's the Python I'm using to get the information:

import requests
from bs4 import BeautifulSoup

# Gets the prices from GasBuddy based on the zip code.
def get_prices(zip_code, store):
    # Establishes the search params to be passed to the website.
    params = {'search': zip_code, 'fuel': 1}
    # Contacts the website and makes the search.
    r = requests.get('https://www.gasbuddy.com/home', params=params, cookies={'DISPLAYNUM': '100000000'})
    # Turns the result of the above into a BeautifulSoup object.
    soup = BeautifulSoup(r.text, 'html.parser')
    # Searches out the divs that contain the gas station information.
    results = soup.find_all('div', {'class': 'styles__stationListItem___xKFP_'})
    return results

Upvotes: 1

Views: 350

Answers (1)

Lorenzo M

Reputation: 96

Use selenium. It's a little bit of work to set up, but it sounds like it's what you need.
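The setup is roughly: install the package, then download a driver that matches your browser (the paths and versions below are placeholders, not part of the original answer):

```shell
pip install selenium
# Then download a ChromeDriver build matching your installed Chrome version
# and either put it on PATH or pass its path to webdriver.Chrome(),
# as the snippet in this answer does.
```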

Here's how I used it to click a website's "show more" button; see my project for the full code.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.common.exceptions import WebDriverException

url = 'https://www.gofundme.com/discover'
driver = webdriver.Chrome('C:/webdriver/chromedriver.exe')
driver.get(url)
for elem in driver.find_elements(By.LINK_TEXT, 'Show all categories'):
    try:
        elem.click()
        print('Successful click')
    except WebDriverException:
        print('Unsuccessful click')

source = driver.page_source

driver.close()

So basically you need to find the element that has to be clicked to reveal more info, or use the webdriver to scroll down the page until everything is loaded.
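Once the driver has loaded (and clicked through) the full results, you can hand `driver.page_source` to BeautifulSoup exactly as in the question. A minimal sketch of that parsing step, using a stand-in HTML snippet in place of the live page (the class name is the one from the question; the station markup inside the divs is invented for illustration and will differ on the real site):

```python
from bs4 import BeautifulSoup

# Stand-in for driver.page_source; the real GasBuddy markup will differ.
html = """
<div class="styles__stationListItem___xKFP_"><h3>Station A</h3><span>$2.39</span></div>
<div class="styles__stationListItem___xKFP_"><h3>Station B</h3><span>$2.45</span></div>
"""

soup = BeautifulSoup(html, 'html.parser')
# Same selector the question uses on requests output, now applied to
# the browser-rendered page source.
stations = soup.find_all('div', {'class': 'styles__stationListItem___xKFP_'})
for station in stations:
    print(station.get_text(' ', strip=True))
```

Because selenium renders the page with a real browser, this list can include the results that the plain `requests` fetch never receives.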

Upvotes: 2
