Hadi Motlagh

Reputation: 11

Web scraping with Python, BeautifulSoup, and JavaScript

I want to get the product names from this web address: 'https://telenor.se/handla/mobiler/'. I am using Python and BeautifulSoup.

I tried this, but it couldn't capture the product list. It seems the products in the list are not being captured by BeautifulSoup:

import requests
from bs4 import BeautifulSoup

mobile_page_url = 'https://telenor.se/handla/mobiler/'
mobile_page_data = requests.get(mobile_page_url)
mobile_page_soup = BeautifulSoup(mobile_page_data.text, 'html.parser')
mobile_page_soup = mobile_page_soup.select('div.grid-items__item')

Upvotes: 1

Views: 44

Answers (1)

Andrej Kesely

Reputation: 195408

The data you see on the page is not in the initial HTML; it is loaded from an external URL via JavaScript. You can simulate this call with the requests and json modules:

import re
import json
import requests
from bs4 import BeautifulSoup

url = "https://telenor.se/handla/mobiler/"
items_url = "https://telenor.se/service/product-grid/get-component-data/{}"

soup = BeautifulSoup(requests.get(url).content, "html.parser")

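# read the ":data" attribute of the grid element and parse the JSON object inside it
data = soup.select_one("#ProductGridPage")[":data"]
data = json.loads(re.search(r"\{.*\}", data).group(0))

# request the product grid for this page ID from the JSON endpoint
currentPageId = data["currentPageId"]
data = requests.get(items_url.format(currentPageId)).json()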

# uncomment to print all data:
# print(json.dumps(data, indent=4))

for i in data["productGridPageJsonViewModel"]["gridItems"]:
    print(i.get("name"))

Prints:

iPhone 14
Galaxy S22 Ultra
iPhone 14 Plus
iPhone 14 Pro
12T Pro
Phone (1)
12 Pro

...
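
If you want to reuse the two parsing steps from the code above, here is a minimal sketch that wraps them in small functions and tolerates missing keys. The function names `parse_data_attribute` and `extract_product_names` and the sample values are hypothetical, made up for illustration; only the key names `currentPageId`, `productGridPageJsonViewModel`, `gridItems`, and `name` come from the code above. It runs offline, so you can check the logic without hitting the site.

```python
import json
import re


def parse_data_attribute(raw):
    """Return the JSON object embedded in an attribute string, or None."""
    # Same regex as above: grab everything from the first "{" to the last "}".
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None


def extract_product_names(payload):
    """Collect the non-empty product names from the grid payload."""
    grid = payload.get("productGridPageJsonViewModel") or {}
    items = grid.get("gridItems") or []
    return [item["name"] for item in items if item.get("name")]


# Hypothetical sample values (assumptions, not real responses from the site).
sample_attribute = 'init({"currentPageId": 123})'
sample_payload = {
    "productGridPageJsonViewModel": {
        "gridItems": [
            {"name": "iPhone 14"},
            {"name": None},
            {"id": 7},
            {"name": "Galaxy S22 Ultra"},
        ]
    }
}

print(parse_data_attribute(sample_attribute))  # {'currentPageId': 123}
print(extract_product_names(sample_payload))   # ['iPhone 14', 'Galaxy S22 Ultra']
```

Returning `None` or an empty list instead of raising makes it easier to notice when the site changes its markup or payload, which is a common failure mode when you rely on an undocumented endpoint.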

Upvotes: 1
