Reputation: 171
I have this code:
import urllib
from bs4 import BeautifulSoup
url = "http://www.microsoft.com/en-us/download/confirmation.aspx?id=17851"
pageurl = urllib.urlopen(url)
soup = BeautifulSoup(pageurl)
for d in soup.select("p.start-download [href]"):
    print d['href']
When I run this code, it gives me many download links. How can I take only one of the download links it returns?
Upvotes: 0
Views: 96
Reputation: 82450
Your code only prints each link, so you cannot get hold of the links and use them later. Collect them in a list instead:
import urllib
from bs4 import BeautifulSoup
url = "http://www.microsoft.com/en-us/download/confirmation.aspx?id=17851"
pageurl = urllib.urlopen(url)
soup = BeautifulSoup(pageurl)
urls = []
for d in soup.select("p.start-download [href]"):
    urls.append(d.attrs['href'])
print urls[0]
With the above code, you can use the links themselves in other parts of the program. You can also do this with a list comprehension:
urls = [d['href'] for d in soup.select("p.start-download [href]")]
print urls[0]
You can then iterate over urls to get the URL you want, or just use an index to get your link. Either way, this is more flexible than just printing a link. For example, you might not want the full installation and instead want some other package, or perhaps the package for XP instead of the one for Vista, 7 and 8 (using your URLs here as an example).
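As a rough sketch of picking a link by something other than its position, the snippet below returns the first URL that contains a keyword. It is written for Python 3, and the example.com links and the pick_url helper are hypothetical stand-ins for the scraped list, not the real Microsoft URLs.

```python
# Hypothetical links standing in for the scraped results.
urls = [
    "http://example.com/downloads/full_installer.exe",
    "http://example.com/downloads/package_x86.exe",
    "http://example.com/downloads/package_x64.exe",
]


def pick_url(urls, keyword):
    """Return the first URL containing keyword, or None if nothing matches."""
    return next((u for u in urls if keyword in u), None)


print(pick_url(urls, "x64"))
```

Returning None when nothing matches avoids the IndexError that urls[0] would raise on an empty list.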
Upvotes: 2
Reputation: 28259
for d in soup.select("p.start-download [href]"):
    print d['href']
    break
This stops after the first link.
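If only the first match is ever needed, a possible alternative to the loop is select_one, which reasonably recent BeautifulSoup 4 releases provide. This is a Python 3 sketch that assumes such a version is installed; the inline HTML and example.com links are made up so that it runs without a network request.

```python
from bs4 import BeautifulSoup

# Made-up markup imitating the structure the selector expects.
html = """
<p class="start-download"><a href="http://example.com/a.exe">A</a></p>
<p class="start-download"><a href="http://example.com/b.exe">B</a></p>
"""

soup = BeautifulSoup(html, "html.parser")

# select_one returns the first matching tag, or None if there is no match.
first = soup.select_one("p.start-download [href]")
first_href = first["href"] if first is not None else None
print(first_href)
```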
Upvotes: 1