Reputation: 462
I'm learning web scraping through a small project. I'm trying to read the products on a webpage and print the number of times each one has been sold. My code:
from bs4 import BeautifulSoup as bs
import requests as req
SEARCH_QUERY = 'swimsuit'
url = f'https://www.aliexpress.com/premium/swimsuit.html?ltype=premium&d=y&CatId=0&SearchText='\
f'{SEARCH_QUERY}&trafficChannel=ppc&SortType=default&page=2'
original_website = req.get(url)
source = original_website.content
soup = bs(source, 'lxml')
links = soup.find_all()
for link in links:
    print(link.get('sale-value'))
I looked at the website, and the information I want is deep within the HTML, under a tag called sale-value. When I run the code, all that gets printed is a sequence of None. I suspect the request is landing on the wrong page, probably a default page. Any help would be appreciated!
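One way to test the wrong-page guess is to look at where the request actually ended up. The sketch below is only an illustration: describe_response and check_page are hypothetical helper names, the example.com URLs are placeholders, and it relies only on the status_code, url and history attributes that a requests response exposes.

```python
from types import SimpleNamespace


def describe_response(resp):
    """Summarise where a requests-style response ended up.

    Works with any object exposing .status_code, .url and .history,
    which requests.Response does.
    """
    hops = [(r.status_code, r.url) for r in resp.history]
    return {
        "status": resp.status_code,
        "final_url": resp.url,
        "redirects": hops,
        "was_redirected": bool(hops),
    }


def check_page(url):
    """Fetch url and report the final status, URL and redirect chain."""
    import requests as req  # lazy import: only needed for a live request

    resp = req.get(url, timeout=10)
    return describe_response(resp)


# Offline stand-in (placeholder URLs) for a response that was redirected once.
hop = SimpleNamespace(status_code=302, url="https://example.com/search")
fake = SimpleNamespace(status_code=200, url="https://example.com/login", history=[hop])
print(describe_response(fake)["was_redirected"])  # True
```

Calling check_page(url) with the URL from the code above would show whether final_url differs from the one requested. If it does, the None values may simply come from parsing a different page.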
Printing the source gives me:
Upvotes: 0
Views: 231
Reputation: 11
I have a nagging suspicion that this happens because AliExpress redirects you to the login page whenever you search for a particular product or type a query directly into the address bar instead of following the menu links. Selenium might be a better choice for this task.
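A rough sketch of what that could look like, in case it helps. It assumes Selenium 4 with a local Chrome install that supports the --headless=new flag. is_login_url and fetch_rendered_html are names made up for this example, and checking for "login" in the host or path is only a guess at how a login redirect would show up, not something confirmed for AliExpress.

```python
from urllib.parse import urlparse


def is_login_url(url):
    """Heuristic: treat a URL as a login page if its host or path mentions 'login'."""
    parts = urlparse(url)
    return "login" in parts.netloc.lower() or "login" in parts.path.lower()


def fetch_rendered_html(url):
    """Load url in headless Chrome and return (final_url, html)."""
    from selenium import webdriver  # lazy import: needs selenium and Chrome

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # assumption: a recent Chrome build
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # current_url reflects any redirect the browser followed.
        return driver.current_url, driver.page_source
    finally:
        driver.quit()
```

After final_url, html = fetch_rendered_html(url), you could pass html to BeautifulSoup as before, but only if is_login_url(final_url) is False. Otherwise you are parsing the login page rather than the search results.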
Upvotes: 1