Amanda Maull

Reputation: 13

Python extract empty tag with beautifulsoup

I have the following loop, which extracts particular tags from every file in a directory and writes them to a .csv file. However, some files have empty tags, and for those I get the following error:

Traceback (most recent call last):  
  File "newsbank2.py", line 27, in <module>  
    author = fauthor.text  
AttributeError: 'NoneType' object has no attribute 'text'

How can I write a blank into the CSV file for these cases? My code is below.

import csv
import os

from bs4 import BeautifulSoup

path = "my directory"

for filename in os.listdir(path):

    if filename.endswith('.htm'):
        fname = os.path.join(path,filename)
        with open(fname, 'r') as f:
            soup = BeautifulSoup(f.read(),'html.parser')
            ftitle = soup.find("div", class_="title")
            title = ftitle.text
            fsource = soup.find("div", class_="source")
            source = fsource.text
            source = source.replace("Browse Issues", " ")
            publication = source.split("-")[0].strip()
            fauthor = soup.find("li", class_="author first")
            author = fauthor.text
            fbody = soup.find("div", class_="body")
            body = fbody.text
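            # Append this file's row; the with-block closes the CSV handle,
            # and a separate name avoids rebinding f.
            with open("testcsv", "a", newline="") as out:
                csv.writer(out).writerow([title, source, author, body])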

Upvotes: 1

Views: 634

Answers (1)

Günther Jena

Reputation: 3776

Use something like this, so that a missing tag yields an empty string instead of raising:

title = ftitle.text if hasattr(ftitle, 'text') else ''

Or the following should also work:

title = ftitle.text if ftitle else ''
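Since the traceback points at `fauthor`, the same guard would probably be needed for every field, not just the title. A minimal sketch of one way to avoid repeating it, assuming a small helper named `text_or_blank` (a hypothetical name, not part of BeautifulSoup). The `SimpleNamespace` object is only a stand-in for the tag that `soup.find(...)` returns, so the snippet runs without any HTML files:

```python
from types import SimpleNamespace


def text_or_blank(tag, default=""):
    """Return tag.text, or `default` when the lookup returned None."""
    # soup.find(...) returns None when no matching tag exists,
    # which is what triggers the AttributeError in the question.
    return tag.text if tag is not None else default


# Stand-ins for the two possible results of soup.find(...):
found = SimpleNamespace(text="Jane Doe")  # a tag-like object with .text
missing = None                            # no matching tag in the file

row = [text_or_blank(found), text_or_blank(missing)]
print(row)
```

In the loop from the question this would read `author = text_or_blank(soup.find("li", class_="author first"))`, and likewise for `title`, `source` and `body`. Comparing with `is not None` is a deliberate choice here: it states the intent directly and does not depend on how the tag object evaluates in a boolean context.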

Upvotes: 1
