Reputation: 21
Very new to Python and struggling with this loop. I'm trying to pull the HTML attribute data-address from a list of static pages that I already have in list format. I've managed to use BS4 to pull the data from one page, but I cannot get the loop right to iterate through my list of URLs. Right now I am receiving this error (Invalid URL '0': No schema supplied. Perhaps you meant http://0?), but I checked the URLs in single pulls and they all work. Here is my working single-pull code:
import requests
from bs4 import BeautifulSoup
result = requests.get('https://www.coingecko.com/en/coins/0xcharts')
src = result.content
soup = BeautifulSoup(src, 'lxml')
contract_address = soup.find(
    'i', attrs={'data-title': 'Click to copy'})
print(contract_address.attrs['data-address'])
This is the loop I am working on:
import requests
from bs4 import BeautifulSoup
url_list = ['https://www.coingecko.com/en/coins/2goshi','https://www.coingecko.com/en/coins/0xcharts']
for link in range(len(url_list)):
    result = requests.get(link)
    src = result.content
    soup = BeautifulSoup(src, 'lxml')
    contract_address = soup.find(
        'i', attrs={'data-title': 'Click to copy'})
    print(contract_address.attrs['data-address'])
url_list.seek(0)
Upvotes: 2
Views: 4389
Reputation: 20008
You have misunderstood the usage of range(). Please read the docs.

When you do:

result = requests.get(link)

link is an int value coming from range(); see what happens when you print(link).

Instead, access the list url_list as follows:

result = requests.get(url_list[link])
Here's a full example:
import requests
from bs4 import BeautifulSoup
url_list = ['https://www.coingecko.com/en/coins/2goshi','https://www.coingecko.com/en/coins/0xcharts']
for link in range(len(url_list)):
    result = requests.get(url_list[link])
    src = result.content
    soup = BeautifulSoup(src, 'lxml')
    contract_address = soup.find(
        'i', attrs={'data-title': 'Click to copy'})
    print(contract_address.attrs['data-address'])
Output:
0x70e132641d6f1bd787b119a289fee544fbb2f316
0x86dd49963fe91f0e5bc95d171ff27ea996c0890c
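As a side note, iterating over the list directly (or with enumerate() if you also need the index) avoids the range(len(...)) indexing step entirely. A minimal sketch, using the same url_list and no network calls:

```python
url_list = ['https://www.coingecko.com/en/coins/2goshi',
            'https://www.coingecko.com/en/coins/0xcharts']

# Direct iteration: each item is the URL string itself,
# so it can be passed straight to requests.get(link)
for link in url_list:
    print(link)

# enumerate() yields (index, item) pairs when the position is also needed
for i, link in enumerate(url_list):
    print(i, link)
```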
Upvotes: 1
Reputation: 368
Try this:
import requests
from bs4 import BeautifulSoup
url_list = ['https://www.coingecko.com/en/coins/2goshi','https://www.coingecko.com/en/coins/0xcharts']
for link in url_list:
    result = requests.get(link)
    src = result.content
    soup = BeautifulSoup(src, 'lxml')
    contract_address = soup.find(
        'i', attrs={'data-title': 'Click to copy'})
    print(contract_address.attrs['data-address'])
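One caveat worth adding (my own suggestion, not part of either answer): soup.find() returns None when no matching tag exists, so contract_address.attrs would raise an AttributeError on any page missing that element. A small guard, sketched against an inline HTML string with the built-in html.parser so it runs without a network call:

```python
from bs4 import BeautifulSoup

# Stand-in for result.content; the data-address value here is made up
html = '<i data-title="Click to copy" data-address="0xabc123"></i>'
soup = BeautifulSoup(html, 'html.parser')

contract_address = soup.find('i', attrs={'data-title': 'Click to copy'})
if contract_address is not None:
    print(contract_address.attrs['data-address'])
else:
    print('no contract address found on this page')
```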
Upvotes: 1