Lucas

Reputation: 521

Python BeautifulSoup Extracting Data From Header

This is a follow-up from another question. Thanks for the help so far.

I've got some code that loops through a page and builds a dataframe. I'm trying to add a third piece of information, the level, but it sits inside a header: each level is in an h3 nested in a td. The code fails with "AttributeError: 'NoneType' object has no attribute 'text'". If I change level.h3.text to level.h3 it runs, but then the dataframe contains the full tags instead of just the number.

import urllib.request
import bs4 as bs
import pandas as pd
#import csv as csv

sauce = urllib.request.urlopen('https://us.diablo3.com/en/item/helm/').read()
soup = bs.BeautifulSoup(sauce, 'lxml')

item_details = soup.find('tbody')

names = item_details.find_all('div', class_='item-details')
types = item_details.find_all('ul', class_='item-type')
#levels = item_details.find_all('h3', class_='subheader-3')
levels = item_details.find_all('td', class_='column-level align-center')
print(levels)

mytable = []



for name, type, level in zip(names, types, levels):
    mytable.append((name.h3.a.text, type.span.text, level.h3.text))



export = pd.DataFrame(mytable, columns=('Item', 'Type','Level'))

Upvotes: 2

Views: 433

Answers (1)

Andersson

Reputation: 52685

Some of the level cells have no h3, so level.h3 is None for those rows and .text fails. Try modifying your loop as below:

for name, type, level in zip(names, types, levels):
    mytable.append((name.h3.a.text, type.span.text, level.h3.text if level.h3 else "No level"))

Now "No level" will be added as the third value whenever a row has no level (no h3 header in that cell). You can use "N/A", None, or whatever placeholder you prefer instead.
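To see the guard working without hitting the live site, here is a minimal sketch on a small inline snippet. The HTML below is made up for illustration (it only imitates the td/h3 structure described in the question), and text_or_default is a hypothetical helper name, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one cell with a level header, one cell without.
HTML = """
<table><tbody>
  <tr><td class="column-level align-center"><h3 class="subheader-3">70</h3></td></tr>
  <tr><td class="column-level align-center"></td></tr>
</tbody></table>
"""


def text_or_default(tag, default="No level"):
    """Return the stripped text of a tag, or a default when the tag is None."""
    return tag.get_text(strip=True) if tag is not None else default


soup = BeautifulSoup(HTML, "html.parser")
cells = soup.find_all("td", class_="column-level align-center")

# cell.h3 is None for the second cell, so the default is used there.
levels = [text_or_default(cell.h3) for cell in cells]
print(levels)
```

In the original loop the same helper could replace the inline conditional, e.g. text_or_default(level.h3).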

Upvotes: 2
