Reputation: 5
I do not want to advertise any product, but the error is very specific and I do not know how else to ask.
I want to get the links in the menu on page A (the URL in the code below), but that page has another page associated with it, page B.
When I read the menu, I get the menu from page B instead, and I do not understand why.
In the HTML I receive, all the scripts and libraries are loaded from page B's domain.
Any suggestions?
from bs4 import BeautifulSoup
import http.cookiejar, urllib.request
mainurl = "http://uk.example.com"
# Keep cookies between requests, as a browser would.
cookijar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cookijar))
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
mainPage = opener.open(mainurl)
mainPageRequest = mainPage.read()
# Name the parser explicitly so the result does not depend on which parsers are installed.
mainPagesoup = BeautifulSoup(mainPageRequest, "html.parser")
menu = mainPagesoup.find("div", {"class": "mainNavigation_linkList_content"})
print(menu)
I want the menu from http://uk.example.com, but the program reads the menu from http://uk.example.co.uk/ instead.
Upvotes: 0
Views: 138
Reputation: 3059
urllib doesn't seem to handle the redirects the way the server is expecting.
First install requests:
pip install requests
Then try this:
import requests
from bs4 import BeautifulSoup
s = requests.Session()
mainPage = s.get("http://uk.accessorize.com")
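# Name the parser explicitly so the result does not depend on which parsers are installed.
mainPagesoup = BeautifulSoup(mainPage.text, "html.parser")
menu = mainPagesoup.find("div", {"class": "mainNavigation_linkList_content"})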
print(menu)
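To check whether a redirect really is what sends the request to the other domain, you could compare the URL you asked for with the URL that actually served the response. The sketch below is only an illustration: it uses a throwaway local server (the RedirectingHandler class, the /a and /b paths and the final_url helper are all made up for the demo) in place of the real site, and reads the final address with geturl() from urllib. With requests, the response's url and history attributes should give the same information.

```python
import http.server
import threading
import urllib.request


class RedirectingHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical stand-in for the real site: /a redirects to /b."""

    def do_GET(self):
        if self.path == "/a":
            # Page A answers with a redirect to page B.
            self.send_response(302)
            self.send_header("Location", "/b")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"<div class='mainNavigation_linkList_content'>menu of B</div>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def final_url(opener, url):
    """Return the URL that actually served the response, after any redirects."""
    with opener.open(url) as response:
        return response.geturl()


# Start the throwaway server on a free local port.
server = http.server.HTTPServer(("127.0.0.1", 0), RedirectingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_address[1]

# An empty ProxyHandler bypasses any system proxy for this local demo.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
requested = base + "/a"
served = final_url(opener, requested)

print("requested:", requested)
print("served:   ", served)
print("redirected:", served != requested)

server.shutdown()
server.server_close()
```

If the served URL differs from the requested one for the real site too, the menu you are parsing belongs to the redirect target rather than to the page you asked for.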
Upvotes: 1