user3415869

Reputation: 91

Extracting text and links from HTML is not working with bs4

I am struggling to extract the link "wikipedia.com" and the name "John Martin" from the HTML below using bs4. I am new to bs4.

<div class="section" qualifer="allnames">
  <div class="container container-2">
   <div class="title">
     <h1 class="title1">
       This is a test
     </h1>
   </div>
   <div class="tile3">
     <a class="title4" href="wikipedia.com" title="John Martin">

I tried this:

link = soup.find('div', class_='title4')
link = link.a.text()
print(link)

Can someone help? How do I get the links and the names from the above HTML?

Upvotes: 1

Views: 32

Answers (1)

Jack Fleeting

Reputation: 24930

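You're almost there. The title4 class is on the <a> tag itself, not on a <div>, so soup.find('div', class_='title4') returns None. The name and the link are also stored in the tag's title and href attributes rather than in its text, so read them with dictionary-style access. Try: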

link = soup.find_all('a', class_='title4')
for l in link:
    print(l['title'])
    print(l['href'])

Output:

John Martin
wikipedia.com
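
For completeness, here is a minimal self-contained sketch of the same approach that can be run as-is. It assumes the bs4 package is installed and uses Python's built-in "html.parser". The closing tags in the HTML string are my own assumption, since the snippet in the question is cut off after the <a> tag, and extract_links is a hypothetical helper name used only for illustration.

```python
from bs4 import BeautifulSoup

# HTML from the question. The closing tags are an assumption,
# because the original snippet ends right after the <a> tag.
html = """
<div class="section" qualifer="allnames">
  <div class="container container-2">
    <div class="title">
      <h1 class="title1">This is a test</h1>
    </div>
    <div class="tile3">
      <a class="title4" href="wikipedia.com" title="John Martin"></a>
    </div>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def extract_links(soup):
    """Hypothetical helper: return (title, href) pairs for every <a class="title4">."""
    results = []
    for a in soup.find_all("a", class_="title4"):
        # .get() returns None instead of raising KeyError
        # when a tag is missing the attribute.
        results.append((a.get("title"), a.get("href")))
    return results


pairs = extract_links(soup)
for title, href in pairs:
    print(title)
    print(href)
```

Using a.get('title') rather than a['title'] is a small design choice: it keeps the loop from failing if some matching tag lacks one of the attributes.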

Upvotes: 1
