Reputation: 72
I'm trying to scrape data from the Elgiganten website: https://www.elgiganten.se/
I want to get each product's name and its URL. When I tried to get the a tag, I got an empty list, but I could get the span tag even though both are in the same div tag.
Here is the whole code:
from bs4 import BeautifulSoup
import requests
respons = requests.get("https://www.elgiganten.se")
soup = BeautifulSoup(respons.content, "lxml")
g_data = soup.find_all("div", {"class": "col-flex S-order-1"})
for item in g_data:
    print(item.contents[1].find_all("span")[0])
    print(item.contents[1].find_all("a", {"class": "product-name"}))
I hope someone can tell me why the a tag seems to be invisible and how to fix the issue.
Upvotes: 0
Views: 64
Reputation: 154
If you wish to stick to the way you started, the following is how you can achieve that:
import requests
from bs4 import BeautifulSoup
respons = requests.get("https://www.elgiganten.se")
soup = BeautifulSoup(respons.text, "lxml")
for item in soup.find_all(class_="mini-product-content"):
    product_name = item.find("span", class_="table-cell").text
    product_link = item.find("a", class_="product-name").get("href")
    print(product_name, product_link)
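One caveat with this approach: `find` returns `None` when a block lacks the expected tag, and calling `.text` or `.get` on `None` raises `AttributeError`. Below is a minimal sketch of a guarded version. It runs against a made-up HTML snippet (the markup is an assumption for illustration, not the site's real structure), uses the built-in `html.parser` so it works without lxml, and the helper name `extract_products` is hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, only loosely modelled on the class names used above.
SAMPLE_HTML = """
<div class="mini-product-content">
  <span class="table-cell">Laptop A</span>
  <a class="product-name" href="/product/laptop-a">Laptop A</a>
</div>
<div class="mini-product-content">
  <span class="table-cell">Banner without a link</span>
</div>
"""


def extract_products(html):
    """Return (name, link) pairs, skipping blocks that lack either tag."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for item in soup.find_all(class_="mini-product-content"):
        name_tag = item.find("span", class_="table-cell")
        link_tag = item.find("a", class_="product-name")
        if name_tag is None or link_tag is None:
            continue  # incomplete block: no name/link pair to report
        products.append((name_tag.get_text(strip=True), link_tag.get("href")))
    return products


print(extract_products(SAMPLE_HTML))
```

To use it on the live page, you could pass `respons.text` instead of `SAMPLE_HTML`, assuming the page still uses those class names.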
Upvotes: 1
Reputation: 9619
Go for the a tags directly. You can extract both the product name and the URL from that tag:
from bs4 import BeautifulSoup
import requests
respons = requests.get("https://www.elgiganten.se")
soup = BeautifulSoup(respons.content, "lxml")
g_data = soup.find_all("a", {"class": "product-name"}, href=True)
for item in g_data:
    print(item['title'], item['href'])
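Two things may be worth guarding against here: `item['title']` raises `KeyError` if a link has no `title` attribute, and `href` values are often relative paths. The sketch below falls back to the link text and resolves links with `urllib.parse.urljoin`. The sample markup and the helper name `product_links` are assumptions for illustration, not the site's actual HTML, and `html.parser` is used only so the snippet runs without lxml.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://www.elgiganten.se"

# Hypothetical markup for illustration; the real page may differ.
SAMPLE_HTML = """
<a class="product-name" title="Laptop A" href="/product/laptop-a">Laptop A</a>
<a class="product-name" href="https://www.elgiganten.se/product/phone-b">Phone B</a>
<a class="product-name" title="No link">No link</a>
"""


def product_links(html, base_url=BASE_URL):
    """Return (name, absolute_url) pairs for product links that have an href."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # href=True skips anchors that carry no href attribute at all.
    for a in soup.find_all("a", class_="product-name", href=True):
        # Prefer the title attribute, fall back to the visible link text.
        name = a.get("title") or a.get_text(strip=True)
        # urljoin leaves absolute URLs alone and resolves relative ones.
        results.append((name, urljoin(base_url, a["href"])))
    return results


for name, url in product_links(SAMPLE_HTML):
    print(name, url)
```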
Upvotes: 2