Edward Lin

Reputation: 609

Use BeautifulSoup to loop through and retrieve specific URLs

I want to use BeautifulSoup to repeatedly retrieve the URL at a specific position. You may imagine that there are 4 different URL lists, each containing 100 different URL links.

I always need to get and print the 3rd URL on every list, and the previously retrieved URL (e.g. the 3rd URL on the first list) leads to the 2nd list (where I then need to get and print the 3rd URL, and so on, until the 4th retrieval).

Yet my loop only produces the first result (the 3rd URL on list 1), and I don't know how to feed the new URL back into the while loop to continue the process.

Here is my code:

import urllib.request
import json
import ssl
from bs4 import BeautifulSoup


num=int(input('enter count times: ' ))
position=int(input('enter position: ' ))

url='https://pr4e.dr-chuck.com/tsugi/mod/python-data/data/known_by_Fikret.html'
print (url)

count=0
order=0
while count<num:
    context = ssl._create_unverified_context()
    htm=urllib.request.urlopen(url, context=context).read()
    soup=BeautifulSoup(htm)
    for i in soup.find_all('a'):
        order+=1
        if order ==position:
            x=i.get('href')
            print (x)
    count+=1
    url=x        
print ('done')

Upvotes: 1

Views: 962

Answers (2)

Tales Pádua

Reputation: 1461

This is a good problem for recursion. Try calling a recursive function to do this:

import requests
from bs4 import BeautifulSoup


def retrieve_urls_recur(url, position, index, deepness):
    # Stop once the link has been followed `deepness` times.
    if index >= deepness:
        return True
    else:
        plain_text = requests.get(url).text
        soup = BeautifulSoup(plain_text)
        links = soup.find_all('a')
        desired_link = links[position].get('href')
        print(desired_link)
        # Follow the retrieved link and repeat at the next depth.
        return retrieve_urls_recur(desired_link, position, index + 1, deepness)

and then call it with the desired parameters, in your case:

retrieve_urls_recur(url, 2, 0, 4)

2 is the URL's index in the list of links (the index is 0-based, so 2 is the 3rd link), 0 is the starting counter, and 4 is how deep you want to go recursively.

ps: I am using requests instead of urllib, and I didn't test this, although I recently used a very similar function with success
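
If it helps, here is one possible way to wire this up to the prompts from the question (untested, and assuming requests and bs4 are installed); since the list index is 0-based, the question's 1-based position needs 1 subtracted:

num = int(input('enter count times: '))
position = int(input('enter position: '))

# The question counts positions from 1 (the 3rd link), so subtract 1 for indexing.
url = 'https://pr4e.dr-chuck.com/tsugi/mod/python-data/data/known_by_Fikret.html'
retrieve_urls_recur(url, position - 1, 0, num)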

Upvotes: 1

alecxe

Reputation: 473763

Just get the link from find_all() by index:

while count < num:
    context = ssl._create_unverified_context()
    htm = urllib.request.urlopen(url, context=context).read()

    soup = BeautifulSoup(htm)
    url = soup.find_all('a')[position].get('href')

    count += 1
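
For reference, a self-contained version of that loop (untested) that also prints each URL, as the question asks; the input prompts and the print are additions to the snippet above:

import ssl
import urllib.request
from bs4 import BeautifulSoup

num = int(input('enter count times: '))
position = int(input('enter position: '))

url = 'https://pr4e.dr-chuck.com/tsugi/mod/python-data/data/known_by_Fikret.html'
print(url)

count = 0
while count < num:
    context = ssl._create_unverified_context()
    htm = urllib.request.urlopen(url, context=context).read()

    soup = BeautifulSoup(htm)
    # Index directly into the list of anchors; position is 0-based here,
    # so enter 2 to get the 3rd link on each page.
    url = soup.find_all('a')[position].get('href')
    print(url)

    count += 1

print('done')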

Upvotes: 0
