get text content from p tag

Question

I am trying to get the description text of each block on this page:

https://twitter.com/search?q=data%20mining&src=typd&vertical=default&f=users.

The HTML for the p tag looks like:

http://DataMiningBlog.com  covers current challenges, interviews with leading actors and book reviews related to data mining, analytics and data science.

My code:

productDivs = soup.findAll('div', attrs={'class' : 'ProfileCard-content'})
for div in productDivs:
   print div.find('p', attrs={'class' : 'ProfileCard-bio u-dir'}).text

Is anything wrong here? I am getting this exception:

Traceback (most recent call last):
  File "twitter_user_scrapper.py", line 91, in getImageList
    print div.find('p', attrs={'class' : 'ProfileCard-bio u-dir'}).text
AttributeError: 'NoneType' object has no attribute 'text'

Anand S Kumar · Accepted Answer

The issue might be that some div elements with class ProfileCard-content do not have a child p element with the classes ProfileCard-bio u-dir. When that happens, the following returns None -

div.find('p', attrs={'class' : ['ProfileCard-bio', 'u-dir']})

And that is the reason you are getting the AttributeError. You should save the return value of the call above in a variable, check whether it is None, and read its text only if it is not None.

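Also, you should give class as a list of the classes, not a single string. A single string is compared against the whole class attribute value, so it only matches when the classes appear in exactly that order, whereas a list matches any tag that has at least one of the listed classes -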

attrs={'class' : ['ProfileCard-bio', 'u-dir']}
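To see how the forms of the class filter differ, here is a minimal sketch, assuming BeautifulSoup 4 (the bs4 package) and its built-in html.parser. The markup is made up for illustration and is not taken from the Twitter page. It compares a single string, a list, and a CSS selector via select() -

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; not the real page source.
html = """
<p class="ProfileCard-bio u-dir">both classes</p>
<p class="u-dir ProfileCard-bio">both classes, reversed order</p>
<p class="u-dir">only u-dir</p>
"""
soup = BeautifulSoup(html, "html.parser")

# A single string containing a space is compared against the whole
# class attribute value, so only the first p matches.
exact = soup.find_all("p", attrs={"class": "ProfileCard-bio u-dir"})

# A list matches any tag that has at least one of the listed classes,
# so all three p tags match.
any_of = soup.find_all("p", attrs={"class": ["ProfileCard-bio", "u-dir"]})

# A CSS selector requires both classes, in any order, so the first
# two p tags match.
both = soup.select("p.ProfileCard-bio.u-dir")

print(len(exact), len(any_of), len(both))
```

If you need tags that carry both classes regardless of order, the select() form is probably the closest fit. The list form is more lenient and may also pick up unrelated p tags that only share u-dir.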

Example -

productDivs = soup.findAll('div', attrs={'class' : 'ProfileCard-content'})
for div in productDivs:
    # find() returns None when the card has no matching p element
    elem = div.find('p', attrs={'class' : ['ProfileCard-bio', 'u-dir']})
    if elem is not None:
        print(elem.text)
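For a version you can run without fetching the page, here is a self-contained sketch of the same None check, assuming BeautifulSoup 4 with the built-in html.parser. SAMPLE_HTML and the helper name extract_bios are hypothetical and only serve the illustration. The second card deliberately has no bio paragraph -

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the second card has no bio paragraph.
SAMPLE_HTML = """
<div class="ProfileCard-content">
  <p class="ProfileCard-bio u-dir">Covers data mining and analytics.</p>
</div>
<div class="ProfileCard-content">
  <span>no bio here</span>
</div>
"""


def extract_bios(html):
    """Return the bio text of every card that has one, skipping the rest."""
    soup = BeautifulSoup(html, "html.parser")
    bios = []
    for div in soup.find_all("div", attrs={"class": "ProfileCard-content"}):
        elem = div.find("p", attrs={"class": ["ProfileCard-bio", "u-dir"]})
        if elem is not None:  # find() returns None when nothing matches
            bios.append(elem.get_text(strip=True))
    return bios


print(extract_bios(SAMPLE_HTML))
```

Collecting the text in a list instead of printing inside the loop makes it easy to see that the card without a bio is skipped rather than raising AttributeError.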
