Reputation: 1635
From this URL: http://vs-web-fs-1.oecd.org/piaac/puf-data/CSV
I want to download all the files and save each one under the text of its anchor tag. My main struggle right now is retrieving the text of the anchor tag:
from bs4 import BeautifulSoup
import requests
import urllib.request

url_base = "http://vs-web-fs-1.oecd.org"
url_dir = "http://vs-web-fs-1.oecd.org/piaac/puf-data/CSV"
r = requests.get(url_dir)
data = r.text
soup = BeautifulSoup(data, features="html5lib")
for link in soup.find_all('a'):
    if link.get('href').endswith(".csv"):
        print(link.find("a"))
        urllib.request.urlretrieve(url_base+link.get('href'), "test.csv")
The line print(link.find("a")) returns None. How can I retrieve the text?
Upvotes: 0
Views: 972
Reputation: 854
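link is already the <a> tag, and find("a") only searches its descendants, which is why it returns None. You can get the text by accessing the tag's contents, like this: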
link.contents[0]
or
link.string
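To make this concrete, here is a minimal sketch that pairs each .csv link with its anchor text. The sample markup, the file name example_a.csv, and the helper csv_links are made up for illustration and are not taken from the real page. It also uses get_text() rather than link.string, on the assumption that some anchors might contain nested tags, in which case link.string would be None.

```python
from bs4 import BeautifulSoup

# Hypothetical listing markup standing in for the real directory page.
SAMPLE_HTML = """
<a href="/piaac/puf-data/CSV/">[To Parent Directory]</a>
<a href="/piaac/puf-data/CSV/example_a.csv">example_a.csv</a>
<a>anchor without href</a>
"""


def csv_links(html):
    """Return (href, anchor_text) pairs for every link ending in .csv."""
    soup = BeautifulSoup(html, features="html.parser")
    pairs = []
    # href=True skips anchors that have no href attribute at all,
    # so endswith() is never called on None.
    for link in soup.find_all("a", href=True):
        href = link["href"]
        if href.endswith(".csv"):
            # get_text() concatenates all text inside the tag.
            pairs.append((href, link.get_text(strip=True)))
    return pairs


pairs = csv_links(SAMPLE_HTML)
print(pairs)
```

In the loop from the question, the anchor text from each pair could then replace the fixed "test.csv" argument of urllib.request.urlretrieve, so every file is saved under its own name.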
Upvotes: 1