Python BeautifulSoup 'NavigableString' object has no attribute 'get_text'

Question

This might seem simple, but I couldn't get it to work. I started learning web scraping recently and ran into this problem. The code seems to work when I try it in the Python REPL, but it fails when I run it as a script, and I'm not sure why.

My code is below. I'm trying to extract the title, link, and picture of each article for my program.

from urllib.request import urlopen
from bs4 import BeautifulSoup
import requests
import json

beauty_result=[]

def scrape_b2():
    soup = BeautifulSoup(urlopen('https://www.instyle.com/beauty'), 'lxml')
    url = 'https://www.instyle.com'
    for article in soup.find_all('article',class_='component tile media image-top type-article'):
        for img in article.find_all('div',class_='component lazy-image thumbnail'):
            for a in article.find('h3'):
                beauty_result.append(json.dumps({
                    'title':a.get_text(strip=True),
                    'link':url+article.find('a')['href'],
                    'image':img.get('data-src')
                }))
    print(beauty_result)

if __name__ == '__main__':
    scrape_b2()

This is the full traceback of the error I got:

D:\Coding\Python\webscrape env>python app.py
Traceback (most recent call last):
  File "app.py", line 37, in <module>
    scrape_b2()
  File "app.py", line 28, in scrape_b2
    'title':a.get_text(strip=True),
  File "D:\Coding\Tools\Anaconda3\envs\webscraper_practice\lib\site-packages\bs4\element.py", line 742, in __getattr__
    self.__class__.__name__, attr))
AttributeError: 'NavigableString' object has no attribute 'get_text'

This is how I solved it:

def scrape_b2():
    soup = BeautifulSoup(urlopen('https://www.instyle.com/beauty'), 'lxml')
    url = 'https://www.instyle.com'
    for article in soup.find_all('article',class_='component tile media image-top type-article'):
        for img in article.find_all('div',class_='component lazy-image thumbnail'):
            h3 = article.find('h3')
            a_link = h3.find('a')
            beauty_result.append(json.dumps({
                'title': a_link.get_text(strip=True),
                'link': url + a_link.get('href'),
                'image': img.get('data-src')
                }))
    print(beauty_result)

Maaz · Accepted Answer

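The error occurs because for a in article.find('h3') iterates over the direct children of the h3 tag. Those children include plain text nodes (NavigableString objects) as well as Tag objects. get_text() is a method of Tag objects, and in the bs4 version shown in your traceback a NavigableString does not have it, so the call fails as soon as the loop reaches a text node.

What you can do instead is look up the a tag inside the h3 directly: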

h3 = article.find('h3')
a_link = h3.find('a')
beauty_result.append(json.dumps({
    'title': a_link.get_text(strip=True),
    'link': url + a_link.get('href'),
    'image': img.get('data-src')
     }))

The code above replaces the loop for a in article.find('h3'): from your original version.
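
To see why the original loop failed, here is a minimal sketch that runs on a small inline HTML snippet instead of the live page. The markup, the href value, and the title text are made up for illustration and are not taken from instyle.com; html.parser is used only so the sketch does not depend on lxml being installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for one article tile; not the real page.
html = """
<article>
  <h3>
    <a href="/beauty/example-post">Example title</a>
  </h3>
</article>
"""

soup = BeautifulSoup(html, "html.parser")
h3 = soup.find("h3")

# Iterating over a Tag walks its direct children, text nodes included.
# The whitespace around the <a> tag shows up as NavigableString objects.
child_types = [type(child).__name__ for child in h3]

# Asking for the <a> tag directly skips those text nodes entirely.
a_link = h3.find("a")
title = a_link.get_text(strip=True)
href = a_link.get("href")

print(child_types)
print(title, href)
```

The printed list of child types should contain NavigableString entries next to the Tag entry, which is what the original loop tripped over. Calling h3.find('a') returns only the Tag, so get_text() and get('href') are safe to use on it.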
