Reputation: 1659

How to get the text inside a <strong> tag using Python

I have tried multiple methods to no avail.

I have this simple HTML, from which I want to extract the number 373 and then do some division.

<span id="ctl00_cph1_lblRecCount">Records Found: <strong> 373</strong></span>

I attempted to get the number with the Python script below:

import math
import re

import requests
import urllib3
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

# Excerpt from inside a loop (hence the `continue`);
# NSN, NSNdriver and date are presumably defined elsewhere.
    NSNpreviousAwardRef = "https://www.dibbs.bsm.dla.mil/Awards/AwdRecs.aspx?Category=nsn&TypeSrch=cq&Value="+NSN+"&Scope=all&Sort=nsn&EndDate=&StartDate=&lowCnt=&hiCnt="
    NSNdriver.get(NSNpreviousAwardRef)
    previousAwardSoup = BeautifulSoup(NSNdriver.page_source, "html5lib")

    # parsing of table
    try:
        totalPrevAward = previousAwardSoup.find("span", {"id": "ctl00_cph1_lblRecCount"}).strong.text
        awardpagetotala = float(totalPrevAward) / 50
        awardpagetotal = math.ceil(awardpagetotala) + 1
        print(date)
        print("total previous awards: " + str(totalPrevAward))
        print("page total : " + str(awardpagetotal))
    except Exception as e:
        print(e)
        continue

All I get is this error:

'NoneType' object has no attribute 'strong'

I tried parsing the HTML with lxml and still get the same error. What am I doing wrong, and how can I fix it?

Upvotes: 0

Answers (2)

Sowjanya R Bhat

Reputation: 1168

Print your previousAwardSoup and check if it has the span tag that you're searching for.
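If the printed page is too long to read by eye, a small helper can do the check for you. This is only a sketch: describe_span is a hypothetical name, and it uses the built-in "html.parser" instead of html5lib so that it needs no extra parser installed.

```python
from bs4 import BeautifulSoup


def describe_span(html, span_id="ctl00_cph1_lblRecCount"):
    """Report whether the target span exists and list the span ids found."""
    soup = BeautifulSoup(html, "html.parser")
    target = soup.find("span", {"id": span_id})
    # Collect the ids of every span that has one, to compare with the expected id.
    span_ids = [span["id"] for span in soup.find_all("span", id=True)]
    return target is not None, span_ids
```

In the question's code you would call it as describe_span(NSNdriver.page_source). If the first value is False, the list of ids shows what the page actually contains.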

Upvotes: 0

Right leg

Reputation: 16700

The code to access the strong tag, soup.find("span").strong, is correct. You can verify this by putting that HTML line in a variable and creating your BeautifulSoup object from that variable.
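A minimal sketch of that check, using the HTML line from the question. It assumes the built-in "html.parser" rather than html5lib, and the variable names are illustrative only.

```python
import math

from bs4 import BeautifulSoup

# The HTML line from the question, stored in a variable.
html = '<span id="ctl00_cph1_lblRecCount">Records Found: <strong> 373</strong></span>'

soup = BeautifulSoup(html, "html.parser")
span = soup.find("span", {"id": "ctl00_cph1_lblRecCount"})
total_prev_award = span.strong.text  # the text keeps its leading space

# Same arithmetic as in the question: divide by 50, round up, add one.
award_page_total = math.ceil(float(total_prev_award) / 50) + 1
print(total_prev_award.strip(), award_page_total)
```

Since this works on the literal HTML, the lookup itself is not the problem. What differs is the page source that Selenium hands to BeautifulSoup.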

Now, the error tells you that find() returned None, meaning no span tag with that id exists in the HTML you parsed. So here are some potential sources of the problem, off the top of my head:

Are you sure of the html input you feed into BeautifulSoup to create previousAwardSoup?
Are you sure that the id attribute is correct? More specifically, is it always the same and not randomized?
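If the id does turn out to vary, one option is to match only the part of it that stays the same. The sketch below assumes the id always ends in "lblRecCount" while the prefix changes. That is a guess, not something the page is known to guarantee, and find_record_count_span is a hypothetical helper name.

```python
import re

from bs4 import BeautifulSoup


def find_record_count_span(html, id_suffix="lblRecCount"):
    """Return the first span whose id ends with id_suffix, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # BeautifulSoup accepts a compiled regex as an attribute filter.
    return soup.find("span", id=re.compile(re.escape(id_suffix) + r"$"))
```

The function returns None when nothing matches, so check the result before accessing .strong on it.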

Upvotes: 1
