Reputation: 1525
I have scraped a forum page and saved all the posts in a list called post_list, but I can't get any further and extract each post's author.
Here is what I get when I run the loop without trying to read the text:
for post in post_list:
    print post.findAll("span" , {"itemprop" : "name"})
This gives me:
[<span class="hide" itemprop="name">00Amin</span>]
[<span class="hide" itemprop="name">arminheidari</span>]
[<span class="hide" itemprop="name">Zapad</span>]
[<span class="hide" itemprop="name">iMosi</span>]
[<span class="hide" itemprop="name">arminheidari</span>]
[<span class="hide" itemprop="name">alen</span>]
[<span class="hide" itemprop="name">mahdavi3d</span>]
[<span class="hide" itemprop="name">arminheidari</span>]
[<span class="hide" itemprop="name">alen</span>]
[<span class="hide" itemprop="name">rezatizi</span>]
[<span class="hide" itemprop="name">Trooper</span>]
[<span class="hide" itemprop="name">rasoolmr</span>]
[<span class="hide" itemprop="name">arminheidari</span>]
[<span class="hide" itemprop="name">iMosi</span>]
[<span class="hide" itemprop="name">anybody</span>]
But if I try the same code with .text:
for post in post_list:
    print post.findAll("span" , {"itemprop" : "name"}).text
I get:
AttributeError: 'ResultSet' object has no attribute 'text'
If I cheat and save the loop results in a variable (or a list) and then try to get the text from there, I fail again:
posts = []
for post in post_list:
    posts.append(post.findAll("span", {"itemprop" : "name"}))
I get no error, but I still can't find a .text property on the results.
I have searched for and tested answers from other questions I found, but they don't work.
Upvotes: 2
Views: 1874
Reputation: 89285
As the error message clearly suggests, that's because findAll() returns a ResultSet, which doesn't have a text attribute. A ResultSet is a list-like collection of tags, and it is each individual tag that has .text. You need to iterate through the result, for example with a list comprehension:
for post in post_list:
    print [span.text for span in post.findAll("span" , {"itemprop" : "name"})]
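To try the iteration approach without the original page, here is a minimal sketch. It assumes BeautifulSoup 4 (where find_all is the current spelling of findAll) and uses made-up HTML; the div.post wrapper and the author_names helper are hypothetical names for illustration, not part of the asker's code.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the scraped forum page.
html = """
<div class="post"><span class="hide" itemprop="name">00Amin</span></div>
<div class="post"><span class="hide" itemprop="name">arminheidari</span></div>
<div class="post"><span class="hide" itemprop="name">Zapad</span></div>
"""

soup = BeautifulSoup(html, "html.parser")
# Assumption: each post lives in its own <div class="post">.
post_list = soup.find_all("div", {"class": "post"})


def author_names(posts):
    """Collect the text of every itemprop="name" span, flattened across posts."""
    return [
        span.text
        for post in posts
        # find_all returns a ResultSet; iterate it to reach the individual tags.
        for span in post.find_all("span", {"itemprop": "name"})
    ]


names = author_names(post_list)
print(names)
```

This flattens everything into one list of names rather than printing one small list per post, which is usually easier to work with afterwards.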
If there is always only one span element in each post (judging from the output of your first code snippet), then you should be able to use find() instead of findAll():
for post in post_list:
    print post.find("span" , {"itemprop" : "name"}).text
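One caveat with find(): it returns None when nothing matches, so calling .text on the result raises an AttributeError for any post that lacks the span. The sketch below guards against that. It again assumes BeautifulSoup 4 and invented HTML; the author_of helper and the "(unknown)" placeholder are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second post has no author span at all.
html = """
<div class="post"><span class="hide" itemprop="name">iMosi</span></div>
<div class="post"><p>no author markup here</p></div>
"""

soup = BeautifulSoup(html, "html.parser")
post_list = soup.find_all("div", {"class": "post"})


def author_of(post, default=None):
    """Return the author name for one post, or `default` if the span is missing."""
    span = post.find("span", {"itemprop": "name"})
    # find() gives back a single tag or None, never a ResultSet.
    return span.text if span is not None else default


authors = [author_of(post, default="(unknown)") for post in post_list]
print(authors)
```

If every post is guaranteed to contain the span, the plain find(...).text form above is enough; the guard only matters for pages with irregular markup.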
Upvotes: 3