Allexj

Reputation: 1487

Create a script to collect links on a webpage with Python 3

I need to collect the links to all the topics on this page: https://www.inforge.net/xi/forums/liste-proxy.1118/

I tried this script:

import urllib.request
from bs4 import BeautifulSoup

url = (urllib.request.urlopen("https://www.inforge.net/xi/forums/liste-proxy.1118/"))
soup = BeautifulSoup(url, "lxml")

for link in soup.find_all('a'):
    print(link.get('href'))

However, it prints every link on the page, not just the topic links. Could you suggest a quick way to do this? I'm still a beginner and started learning Python only recently.

Upvotes: 0

Views: 57

Answers (1)

Aran-Fey

Reputation: 43246

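You can use BeautifulSoup to parse the HTML:

from bs4 import BeautifulSoup
from urllib.request import urlopen  # Python 3; urllib2 exists only in Python 2

url = 'https://www.inforge.net/xi/forums/liste-proxy.1118/'
# Name the parser explicitly (lxml, as in the question) to avoid a warning
soup = BeautifulSoup(urlopen(url), 'lxml')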

Then select only the topic links, which carry the class PreviewTooltip on this page, with

soup.find_all('a', {'class':'PreviewTooltip'})
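
Note that find_all returns tag objects, not URLs, so you still need to read each tag's href. Here is a minimal sketch of that step. It runs on a small made-up HTML snippet (SAMPLE_HTML is hypothetical and only imitates the kind of markup the forum page might contain) so it works without a network connection. The helper name topic_links is my own, and I use the built-in "html.parser" so lxml is not required:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the downloaded forum page.
SAMPLE_HTML = """
<div>
  <a href="/xi/">Home</a>
  <a class="PreviewTooltip" href="threads/example-topic-one.1/">Example topic one</a>
  <a class="PreviewTooltip" href="threads/example-topic-two.2/">Example topic two</a>
  <a href="members/someone.3/">someone</a>
</div>
"""


def topic_links(html):
    """Return the href of every <a> tag that has the class PreviewTooltip."""
    soup = BeautifulSoup(html, "html.parser")
    # class_ matches the CSS class; href=True skips anchors without an href.
    anchors = soup.find_all("a", class_="PreviewTooltip", href=True)
    return [a["href"] for a in anchors]


for link in topic_links(SAMPLE_HTML):
    print(link)
```

With the real page you would pass the downloaded HTML instead of SAMPLE_HTML. If the site changes its markup, the class name may need to be adjusted.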

Upvotes: 2
