jjnair

Reputation: 25

BeautifulSoup tag returns None even though it contains elements

I am trying to scrape the website below. When I extract the href from the a tags, I get None, even though those a tags contain elements. Here is the code:

from selenium import webdriver
import time
from bs4 import BeautifulSoup


driver = webdriver.Chrome(r"E:\chromedriver_win32\chromedriver.exe")
url= "https://www.adb.org/projects/tenders/sector/information-and-communication-technology-1066"
driver.get(url)

content = driver.page_source.encode('utf-8').strip()
soup = BeautifulSoup(content,"html.parser")
div_tags = soup.findAll("div",{"class":"item-title"})
for tags in div_tags:
    a_tag=tags.find('a')
    link=a_tag.get('href')
    print(link)

Printed output of div_tags:

[<div class="item-title">
<a href="/node/606921"><div class="item-title">Corporate Innovation Expert</div></a></div>, <div class="item-title">Corporate Innovation Expert</div>, <div class="item-title">
<a href="/node/605571"><div class="item-title">Communications Specialist</div></a></div>, <div class="item-title">Communications Specialist</div>, <div class="item-title">
<a href="/node/603231"><div class="item-title">Venture Partner Expert</div></a></div>, <div class="item-title">Venture Partner Expert</div>, <div class="item-title">
<a href="/node/457636"><div class="item-title">Partnerships &amp; Communications Manager</div></a></div>, <div class="item-title">Partnerships &amp; Communications Manager</div>, <div class="item-title">
<a href="/node/545151"><div class="item-title">Operations Associate</div></a></div>, <div class="item-title">Operations Associate</div>, <div class="item-title">

Upvotes: 0

Views: 385

Answers (1)

Osadhi Virochana

Reputation: 1302

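As your printed output shows, each outer div.item-title wraps an a tag, which in turn wraps a second, inner div.item-title. The inner divs are matched too, but they contain no a tag, so find('a') returns None for them. Skip those and keep only the divs that actually contain a link. Here is the code: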

from requests import get
from bs4 import BeautifulSoup

response = get('https://www.adb.org/projects/tenders/sector/information-and-communication-technology-1066')
soup = BeautifulSoup(response.text, 'html.parser')

div_tags = soup.find_all("div", class_="item-title")
link = {}
for tags in div_tags:
    # href=True matches only <a> tags that carry an href attribute
    a_tag = tags.find('a', href=True)
    # The inner div.item-title has no <a>, so find() returns None; skip it
    if a_tag is None:
        continue
    link[a_tag.text] = "https://www.adb.org" + a_tag.get('href')
# print(link)  # print all titles and links as a dict
for x, y in link.items():
    print(x, y)  # x is the title text, y is the full link

To get the titles and links together as a dictionary, use print(link). To get only the titles, use print(x) in the for loop; to get only the links, use print(y).

Run online: https://repl.it/repls/ScaredCommonParallelcomputing
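
As an alternative, here is a minimal sketch that avoids the None case entirely by selecting only a tags that are direct children of a div.item-title, using a CSS selector. It runs against a small sample copied from the printed output in the question rather than the live page, so the real markup may differ. The names extract_links, SAMPLE_HTML and BASE_URL are made up for this illustration, and urljoin is swapped in for the string concatenation above.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumed base URL, taken from the address in the question.
BASE_URL = "https://www.adb.org"

# Sample markup copied from the question's printed output (not fetched live).
SAMPLE_HTML = """
<div class="item-title">
<a href="/node/606921"><div class="item-title">Corporate Innovation Expert</div></a></div>
<div class="item-title">
<a href="/node/605571"><div class="item-title">Communications Specialist</div></a></div>
"""


def extract_links(html, base_url=BASE_URL):
    """Map each title to its absolute URL (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    links = {}
    # "div.item-title > a[href]" matches only <a> tags with an href that sit
    # directly inside a div.item-title, so the inner divs are never visited.
    for a_tag in soup.select("div.item-title > a[href]"):
        title = a_tag.get_text(strip=True)
        links[title] = urljoin(base_url, a_tag["href"])
    return links


links = extract_links(SAMPLE_HTML)
for title, url in links.items():
    print(title, url)
```

To use it on the real page, pass response.text (or driver.page_source) to extract_links instead of SAMPLE_HTML.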

Upvotes: 2
