kg5425

Reputation: 469

Python requests .get() from multiple pages?

I'm learning web scraping with Python, and I'm wondering if it's possible to grab two pages with requests.get() so that I don't have to make two separate calls and two separate variables. For example:

import requests
from bs4 import BeautifulSoup

r1 = requests.get("page1")
r2 = requests.get("page2")

pg1 = BeautifulSoup(r1.content, "html.parser")
pg2 = BeautifulSoup(r2.content, "html.parser")

As you can see there's repeated code. Any way around this? Thanks!

Upvotes: 4

Views: 9580

Answers (2)

Benedict Randall Shaw

Reputation: 738

You can use iterable unpacking with a list comprehension, although it isn't much shorter with only two pages.

pg1, pg2 = [BeautifulSoup(requests.get(page).content, "html.parser")
            for page in ["page1", "page2"]]
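To see the unpacking pattern in isolation, here's a minimal sketch that runs without network access; fetch_and_parse() is a hypothetical stand-in for the real BeautifulSoup(requests.get(url).content, "html.parser") call:

```python
def fetch_and_parse(url):
    # hypothetical stand-in for requests.get + BeautifulSoup
    return f"<soup for {url}>"

# The comprehension produces one parsed page per URL, and unpacking
# assigns them to named variables in order.
pg1, pg2 = [fetch_and_parse(url) for url in ["page1", "page2"]]
print(pg1)  # <soup for page1>
```

With more than a handful of pages, a dict comprehension keyed by URL (soups = {url: fetch_and_parse(url) for url in urls}) avoids numbered variables entirely.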

Upvotes: 6

Aaron Nelson

Reputation: 191

I like the grequests library for fetching multiple URLs at once, instead of requests, especially when dealing with a lot of URLs or a single site with many sub-pages.

import grequests

urls = ['http://google.com', 'http://yahoo.com', 'http://bing.com']
unsent_requests = (grequests.get(url) for url in urls)
results = grequests.map(unsent_requests)

After this, results can be processed however you need; grequests.map() returns the responses in the same order as the input, so results[0] holds the response for the first URL, results[1] the second, and so on. This works well with JSON data.
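If installing grequests isn't an option, the standard library's concurrent.futures gives a similar fetch-in-parallel, results-in-order shape. A minimal sketch, using a stand-in fetch() so it runs offline (in real code this would be requests.get):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # hypothetical stand-in for requests.get(url)
    return f"data from {url}"

urls = ['http://google.com', 'http://yahoo.com', 'http://bing.com']

# pool.map runs the fetches concurrently but yields results
# in input order, like grequests.map
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(fetch, urls))

print(results[0])  # data from http://google.com
```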

More on grequests can be found in its documentation.

Upvotes: 10

Related Questions