Reputation: 87
Imports
import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import requests
from time import sleep
Open the page
driver = webdriver.Chrome()
main_url = 'https://www.samsung.com/ph/storelocator/'
driver.get(main_url)
driver.execute_script("window.scrollTo(0, 500)")
sleep(1)
# dismiss the cookie/geo banner
driver.find_element(By.CLASS_NAME, 'cm-cookie-geo__close-cta').click()
If I just GET the Request URL shown by the red arrow and replace the parameters with my desired values (e.g. change nRadius to 7), plain HTML is returned.
How can I get it to instead update the listing in the left panel, the way it would if I clicked the 10km button (but for 7km)?
I have tried passing the browser cookies to a requests session, as suggested here, like this (without success):
# storing the cookies generated by the browser
request_cookies_browser = driver.get_cookies()

params = {
    'nRadius': 7,
    'latitude': 14.607538,
    'longitude': 121.020967,
    'searchFlag': 'search',
    'modelCode': '',
    'categorySubTypeCode': '',
    'localSearchCallYn': 'N'
}

s = requests.Session()
# passing the cookies generated from the browser to the session
for c in request_cookies_browser:
    s.cookies.set(c['name'], c['value'])

resp = s.post(main_url, params)  # I get a 200 status_code

# passing the cookies of the response back to the browser
dict_resp_cookies = resp.cookies.get_dict()
response_cookies_browser = [{'name': name, 'value': value} for name, value in dict_resp_cookies.items()]
for c in response_cookies_browser:
    driver.add_cookie(c)

driver.get(main_url)
Edit 1: I am trying to get the latitude and longitude, which aren't available through that GET URL. They can be found on the main page with:
soup = BeautifulSoup(driver.page_source, 'lxml')
# find_all() returns a list, so take a single <li> (or loop over them)
latitude = soup.find('ul', {'id': 'store-list'}).find('li').find('input', {'class': 'lat', 'type': 'hidden'})['value']
Upvotes: 0
Views: 852
Reputation: 1191
You can make a simple GET request with requests and then parse the response with BeautifulSoup. The reason your code in the edit isn't working is that the HTML returned by the GET request is formatted differently. The following worked for me:
import requests
from bs4 import BeautifulSoup

params = {
    'nRadius': 7,
    'latitude': 14.601026,
    'longitude': 120.984192,
    'searchFlag': 'search',
    'modelCode': None,
    'categorySubTypeCode': None,
    'localSearchCallYn': 'N',
}

url = 'https://www.samsung.com/ph/storelocator/_jcr_content/par.cm-g-store-locator-storelist/'
r = requests.get(url, params=params)
soup = BeautifulSoup(r.text, 'html.parser')

# each store <li> carries its name plus hidden lat/long inputs
for item_holder in soup.find_all('li'):
    name = item_holder.find('h2', {'class': 'store-name'}).text
    lat = item_holder.find('input', {'class': 'lat', 'type': 'hidden'})['value']
    long = item_holder.find('input', {'class': 'long', 'type': 'hidden'})['value']
    print('\n' + name)
    print(lat, long)
WESTERN APPLIANCE - RECTO
14.604366 120.97991
ANSONS - BINONDO
14.6015268 120.97605479999993
SM APPLIANCE CENTER INC. - LUCKY CHINA TOWN
14.6031205 120.9741785
SM APPLIANCE CENTER INC. - MANILA
14.5904064 120.9830574
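Since you already import pandas, you could also collect the rows into a DataFrame instead of printing them. A small sketch building on the snippet above (the column names are my own choice):

import pandas as pd
import requests
from bs4 import BeautifulSoup

params = {
    'nRadius': 7,
    'latitude': 14.601026,
    'longitude': 120.984192,
    'searchFlag': 'search',
    'localSearchCallYn': 'N',
}
url = 'https://www.samsung.com/ph/storelocator/_jcr_content/par.cm-g-store-locator-storelist/'
soup = BeautifulSoup(requests.get(url, params=params).text, 'html.parser')

# one row per store, parsed from the hidden lat/long inputs
rows = [{
    'name': li.find('h2', {'class': 'store-name'}).text,
    'lat': float(li.find('input', {'class': 'lat', 'type': 'hidden'})['value']),
    'long': float(li.find('input', {'class': 'long', 'type': 'hidden'})['value']),
} for li in soup.find_all('li')]

df = pd.DataFrame(rows)
print(df)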
Upvotes: 1
Reputation: 62
Looking at the page, it seems like you may be better off scraping the HTML for stores whose distance is less than or equal to 7 km. This is because the website only accepts specific values of nRadius when returning a search of stores on the map (i.e. it only allows 1, 2, 5, and 10 km).
The way it works is that it finds your location and fetches all stores less than 10 km away, regardless of the distance you have selected. It then displays locations on the map based on the nRadius you provided, but all of the stores within 10 km are still listed in the HTML, so you can filter them yourself (see the sketch below).
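If you go that route, a minimal sketch of the filtering idea, assuming the same endpoint as the other answer: request the widest radius (10 km) and keep only stores within 7 km. The exact distance attribute isn't shown here, so this version computes the distance from the hidden lat/long inputs with a haversine helper (my own addition):

import math
import requests
from bs4 import BeautifulSoup

def haversine_km(lat1, lon1, lat2, lon2):
    # great-circle distance between two points, in km
    r = 6371.0
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2)) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

center_lat, center_long = 14.601026, 120.984192
params = {
    'nRadius': 10,  # widest radius the site supports
    'latitude': center_lat,
    'longitude': center_long,
    'searchFlag': 'search',
    'localSearchCallYn': 'N',
}
url = 'https://www.samsung.com/ph/storelocator/_jcr_content/par.cm-g-store-locator-storelist/'
soup = BeautifulSoup(requests.get(url, params=params).text, 'html.parser')

for li in soup.find_all('li'):
    lat = float(li.find('input', {'class': 'lat', 'type': 'hidden'})['value'])
    long = float(li.find('input', {'class': 'long', 'type': 'hidden'})['value'])
    if haversine_km(center_lat, center_long, lat, long) <= 7:  # custom 7 km cutoff
        print(li.find('h2', {'class': 'store-name'}).text, lat, long)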
However, I've never done exactly what you're doing, so it could be something else. If you think it is the cookie/header handoff between Selenium and requests that is tripping you up, check out the selenium-requests Python package, which was developed to handle the needed cookies and request headers automatically.
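For what it's worth, a minimal selenium-requests sketch (untested against this site): the Chrome class behaves like a normal Selenium driver but adds a request() method that reuses the browser's cookies and headers.

# pip install selenium-requests
from seleniumrequests import Chrome

driver = Chrome()
driver.get('https://www.samsung.com/ph/storelocator/')

# request() mirrors the requests API but carries over the browser session
response = driver.request(
    'GET',
    'https://www.samsung.com/ph/storelocator/_jcr_content/par.cm-g-store-locator-storelist/',
    params={'nRadius': 7, 'latitude': 14.601026, 'longitude': 120.984192,
            'searchFlag': 'search', 'localSearchCallYn': 'N'},
)
print(response.status_code)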
Good Luck!
Upvotes: 1