How to loop through 100 URLs and extract info from each one with Selenium

I have this list of URLs. What is the best way to extract a piece of info from each page and store it in another list? Assume the wanted info is the text of an element such as <a>hello world</a>.

def pagination():
    # Build the 100 page URLs; range(1, 101) covers pages 1 through 100.
    return [f"https://www.xx.xx{p}" for p in range(1, 101)]

Upvotes: 0

Answers (1)

αԋɱҽԃ αмєяιcαη

Reputation: 11515

Since you are dealing with a single host, start by creating a session object and reusing it for every request. The session keeps the same TCP connection open instead of opening, closing, and reopening a socket for each page, which is faster and makes it less likely that a site's firewall blocks you or flags the traffic as a DDoS attack.

After that, you can loop over the pagination parameter and extract the title from each page.

Below is an example of that.

import requests
from bs4 import BeautifulSoup


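def main(url):
    # One Session reuses the underlying connection for every request.
    with requests.Session() as req:
        # Pages 1 through 10; widen the range to cover more pages.
        for page in range(1, 11):
            r = req.get(url.format(page))
            soup = BeautifulSoup(r.content, 'html.parser')
            print(soup.title.text)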


main("https://www.example.com/page={}")
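
The example above prints each page title. To store the text of every <a> element in a list instead, as the question asks, something like the following might work. This is only a sketch: the names extract_link_texts and scrape, the optional session argument, and the 10-second timeout are my own assumptions, not part of the original code.

```python
import requests
from bs4 import BeautifulSoup


def extract_link_texts(html):
    """Return the stripped text of every <a> element in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return [a.get_text(strip=True) for a in soup.find_all("a")]


def scrape(url_template, pages, session=None):
    """Fetch url_template.format(page) for each page and collect <a> texts.

    `session` is a hypothetical hook: pass an existing requests.Session to
    reuse it, or leave it as None to have one created and closed here.
    """
    results = []
    own_session = session is None
    if own_session:
        session = requests.Session()
    try:
        for page in pages:
            # Assumed timeout so a stalled page does not hang the whole loop.
            response = session.get(url_template.format(page), timeout=10)
            # Raise on 4xx/5xx responses instead of parsing an error page.
            response.raise_for_status()
            results.extend(extract_link_texts(response.content))
    finally:
        # Only close the session if this function created it.
        if own_session:
            session.close()
    return results
```

Calling scrape("https://www.example.com/page={}", range(1, 101)) would then visit pages 1 through 100 and return one flat list of link texts. Narrow the find_all("a") call (for example by class or id) if only one specific link per page is wanted.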

Upvotes: 1
