JonShish

Reputation: 71

How to get around 'NoneType' object has no attribute 'text'

I'm working on an eBay scraper and keep hitting "AttributeError: 'NoneType' object has no attribute 'text'".

Here is my code:

import requests
import pandas as pd
from bs4 import BeautifulSoup as bs

url = 'https://www.ebay.com/sch/i.html?_from=R40&_nkw=2017+patrick+mahomes+psa+10+auto&_sacat=0&rt=nc&LH_Sold=1&LH_Complete=1&_ipg=100'

def get_data(url):
    r = requests.get(url)
    soup = bs(r.text, 'html.parser')
    return soup

def parse(soup):
    productslist = []
    results = soup.find_all('div', {'class': 's-item__info clearfix'})
    for item in results:
        product = {
            'title': item.find('h3', class_='s-item__title s-item__title--has-tags').text,
            'soldprice': float(item.find('span', class_='s-item__price').text.replace('$', '').replace(',','').strip()),
            'solddate': item.find('span', class_='s-item__title--tagblock__COMPLETED').find('span', class_='POSITIVE').text,
            'bids': item.find('span', class_='s-item__bids s-item__bidCount').text,
            'link': item.find('a', class_='s-item__link')['href'],
            }
        productslist.append(product)
    return productslist

def output(productslist):
    productsdf = pd.DataFrame(productslist)
    productsdf.to_csv('2017_Patrick_Mahomes_Rookies.csv', index=False)
    print('Saved to CSV')
    return

soup = get_data(url)
productslist = parse(soup)
output(productslist)
print(parse(soup))

The line causing trouble is:

'bids': item.find('span', class_='s-item__bids s-item__bidCount').text,

It raises this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-48-0c9ceb760d75> in <module>
     27 
     28 soup = get_data(url)
---> 29 productslist = parse(soup)
     30 output(productslist)
     31 print(parse(soup))

<ipython-input-48-0c9ceb760d75> in parse(soup)
     14             'soldprice': float(item.find('span', class_='s-item__price').text.replace('$', '').replace(',','').strip()),
     15             'solddate': item.find('span', class_='s-item__title--tagblock__COMPLETED').find('span', class_='POSITIVE').text,
---> 16             'bids': item.find('span', class_='s-item__bids s-item__bidCount').text,
     17             'link': item.find('a', class_='s-item__link')['href'],
     18             }

AttributeError: 'NoneType' object has no attribute 'text'

When I remove .text, the code runs and writes a CSV file, but the "bids" column holds raw span HTML in most rows and is blank in the rest.

I want to keep every row in the "bids" column, strip the span HTML from the fields that have it, and fill the blank fields with "N/A". When I wrap the code in try/except, it drops every row without bids (e.g. it keeps only rows like '5 bids' and '89 bids' and deletes all the others).

Still a beginner, so I apologize for the poor explanation.

Upvotes: 1

Views: 555

Answers (2)

yf879

Reputation: 168

Do this before building the product dict:

try:
    bids = item.find('span', class_='s-item__bids s-item__bidCount').text
except AttributeError:
    bids = ''

Replace bids = '' with bids = 'N/A' if you want to write N/A when bids are not available.

Then use bids in the product dict:

'bids': bids,

Or, here is the full code:

import requests
import pandas as pd
from bs4 import BeautifulSoup as bs

url = 'https://www.ebay.com/sch/i.html?_from=R40&_nkw=2017+patrick+mahomes+psa+10+auto&_sacat=0&rt=nc&LH_Sold=1&LH_Complete=1&_ipg=100'

def get_data(url):
    r = requests.get(url)
    soup = bs(r.text, 'html.parser')
    return soup

def parse(soup):
    productslist = []
    results = soup.find_all('div', {'class': 's-item__info clearfix'})
    for item in results:
        
        try:
            bids = item.find('span', class_='s-item__bids s-item__bidCount').text
        except AttributeError:
            bids = ''

        product = {
            'title': item.find('h3', class_='s-item__title s-item__title--has-tags').text,
            'soldprice': float(item.find('span', class_='s-item__price').text.replace('$', '').replace(',','').strip()),
            'solddate': item.find('span', class_='s-item__title--tagblock__COMPLETED').find('span', class_='POSITIVE').text,
            'bids': bids,
            'link': item.find('a', class_='s-item__link')['href'],
            }
        productslist.append(product)
    return productslist

def output(productslist):
    productsdf = pd.DataFrame(productslist)
    productsdf.to_csv('2017_Patrick_Mahomes_Rookies.csv', index=False)
    print('Saved to CSV')
    return

soup = get_data(url)
productslist = parse(soup)
output(productslist)
print(parse(soup))

This works for me.
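
A note on why a broader try/except can drop rows: if the try block wraps the whole product dict (or the append call), any listing with one missing tag is likely skipped entirely. Wrapping only the single lookup avoids that. Below is a minimal sketch of the same idea as a reusable helper. The name field_text is hypothetical (it is not a BeautifulSoup function), and item is assumed to be a BeautifulSoup Tag.

```python
def field_text(item, name, class_, default="N/A"):
    """Return the text of the first matching child tag, or `default`.

    Hypothetical helper: `item` is assumed to be a BeautifulSoup Tag.
    """
    try:
        # find() returns None when nothing matches, so .text raises
        # AttributeError for listings that lack this tag.
        return item.find(name, class_=class_).text.strip()
    except AttributeError:
        # Only this one field falls back; the rest of the row is kept.
        return default
```

Inside the loop it would be used as 'bids': field_text(item, 'span', 's-item__bids s-item__bidCount').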

Upvotes: 0

Yevhen Bondar

Reputation: 4707

You should check whether the s-item__bids s-item__bidCount tag exists, like this:

bids_tag = item.find('span', class_='s-item__bids s-item__bidCount')
if bids_tag:
    bids = bids_tag.text
else:
    bids = ''

And then use the result in the product dict:

'bids': bids,
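
The solddate field in the question chains two find() calls, so it can fail the same way if either tag is missing. Here is a minimal sketch that applies the same None check at each step. The name nested_text is hypothetical (it is not part of BeautifulSoup), and item is assumed to be a BeautifulSoup Tag.

```python
def nested_text(item, lookups, default="N/A"):
    """Follow a chain of find() calls and return the final tag's text.

    `lookups` is a list of (tag_name, class_string) pairs applied in order.
    Returns `default` as soon as any step finds nothing.
    Hypothetical helper: `item` is assumed to be a BeautifulSoup Tag.
    """
    tag = item
    for name, class_ in lookups:
        tag = tag.find(name, class_=class_)
        if tag is None:
            # A missing tag anywhere in the chain means no value for this field.
            return default
    return tag.get_text(strip=True)
```

For the sold date this would be called as nested_text(item, [('span', 's-item__title--tagblock__COMPLETED'), ('span', 'POSITIVE')]).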

You can also check out my library: https://github.com/eugen1j/beautifulsoup4-helpers

Here is the same lookup using this library:

'bids': select_text_one(item, 'span.s-item__bids.s-item__bidCount')

Upvotes: 2
