Reputation: 574
I was trying to get links from this website, but noticed that the links I get from parsing are different from the ones showing in my browser. No links are missing, because both the browser and the parsed results show 14 hyperlinks (for series). But my browser shows some links that my results don't have, and my results show some links that my browser doesn't have.
For example, my results show a link like
"https://4anime.to/anime/one-piece-nenmatsu-tokubetsu-kikaku-mugiwara-no-luffy-oyabun-torimonochou"
but when I searched for the word "torimonochou" in the browser, I could not find any match.
I searched for the link in the page source (right-clicked the page and selected "View Page Source"), so I should not be missing anything. I also passed my browser's User-Agent header to requests.get(), so I should be getting the same HTML.
The code:
import requests
import bs4

# Send the same User-Agent string as my browser (Firefox 79 on Windows 10)
head = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:79.0) Gecko/20100101 Firefox/79.0'}
searchResObj = requests.get("https://4anime.to/?s=one+piece", headers=head)
soupObj = bs4.BeautifulSoup(searchResObj.text, features="html.parser")
I tried several different approaches to parsing the links. This is a simplified version that fetches every link on the page, so I am not missing any.
all_a = soupObj.select("a")
for links in all_a:
    print(links.get("href"))
I also viewed the HTML that my script received. The hyperlinks are indeed different from the ones showing in my browser:
print(searchResObj.text)
So what might be causing this?
Upvotes: 0
Views: 619
Reputation: 195418
Running this script prints 14 links, which show in the browser too (maybe you got a Captcha page?):
import requests
from bs4 import BeautifulSoup
searchResObj = requests.get("https://4anime.to/?s=one+piece")
soupObj = BeautifulSoup(searchResObj.text, features="html.parser")
for a in soupObj.select('#headerDIV_95 > a'):
    print(a['href'])
Prints:
https://4anime.to/anime/one-piece-nenmatsu-tokubetsu-kikaku-mugiwara-no-luffy-oyabun-torimonochou
https://4anime.to/anime/one-piece-straw-hat-theater
https://4anime.to/anime/one-piece-movie-14-stampede
https://4anime.to/anime/one-piece-yume-no-soccer-ou
https://4anime.to/anime/one-piece-mezase-kaizoku-yakyuu-ou
https://4anime.to/anime/one-piece-umi-no-heso-no-daibouken-hen
https://4anime.to/anime/one-piece-film-gold
https://4anime.to/anime/one-piece-heart-of-gold
https://4anime.to/anime/one-piece-episode-of-sorajima
https://4anime.to/anime/one-piece-episode-of-sabo
https://4anime.to/anime/one-piece-episode-of-nami
https://4anime.to/anime/one-piece-episode-of-merry
https://4anime.to/anime/one-piece-episode-of-luffy
https://4anime.to/anime/one-piece-episode-of-east-blue
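To check whether the script and the browser really receive different pages, one option is to save both copies of the HTML and compare the sets of links directly. Below is a minimal sketch of that idea using only the standard library. The names `LinkCollector`, `extract_links` and `compare_links` are made up for this illustration, and the two HTML strings are tiny stand-ins, not the real pages from the site.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.add(href)


def extract_links(html):
    """Return the set of href values found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def compare_links(script_html, browser_html):
    """Return (links only in the script's HTML, links only in the browser's HTML)."""
    script_links = extract_links(script_html)
    browser_links = extract_links(browser_html)
    return script_links - browser_links, browser_links - script_links


if __name__ == "__main__":
    # Stand-in documents; in practice these would be searchResObj.text and
    # the contents of a file saved from the browser's "View Page Source".
    script_html = '<a href="/anime/a">A</a><a href="/anime/b">B</a>'
    browser_html = '<a href="/anime/b">B</a><a href="/anime/c">C</a>'
    only_script, only_browser = compare_links(script_html, browser_html)
    print("Only in script:", sorted(only_script))
    print("Only in browser:", sorted(only_browser))
```

If both differences come back empty, the two pages contain the same links and the mismatch is probably in how the page is being searched or viewed, not in what the server returns.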
EDIT: Screenshot from "View Source Code":
Upvotes: 1