BeautifulSoup attrs returning list instead of dictionary

Question

I'm trying to parse some HTML that I've scraped and am running into an odd issue. I need to find a <td> tag that contains an <a> tag with a certain name, and then I want to dump the contents of the entire <td> tag. For now I'm just trying to get it to print the contents of the "name" attribute of the <a> tag. My understanding is that if I have a specific element (as opposed to a list of elements), the "attrs" of that element should be a dictionary, and I should be able to pull out the value via a string key:

soup = BeautifulSoup(html)
for tdblock in soup.findAll('td'):
    try:
        for ablock in tdblock.findAll('a'):
            print ablock.attrs['name']
    except AttributeError:
        pass

(The try/except block is there because not all the <td> blocks in the HTML have <a> blocks.)

But it throws a TypeError:

Traceback (most recent call last):
  File "fetch_historic_nfl_odds.py", line 26, in <module>
    print ablock.attrs['name']
TypeError: list indices must be integers, not str

And if I modify the code to just print ablock.attrs, it's clearly a list, not a dictionary:

[(u'name', u'EMAIL')]
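
A minimal sketch of one way to work with that output, assuming attrs always comes back as a list of (name, value) pairs like the one printed above. The variable names attrs and attr_map are placeholders for illustration; the first line just reproduces the printed value rather than calling BeautifulSoup.

```python
# attrs as printed above: a list of (name, value) pairs rather than a dict.
attrs = [(u'name', u'EMAIL')]

# dict() accepts any iterable of 2-item pairs, so the list converts directly.
attr_map = dict(attrs)

print(attr_map['name'])      # lookup by string key now works
print(attr_map.get('href'))  # a missing attribute gives None instead of raising
```

This sidesteps the position problem, since the lookup no longer depends on where the pair sits in the list.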

I've seen some posts on Stack Overflow indicating that you'll get a list if you try to read the attributes of a findAll result, but I'm going element by element, so it's unclear why that would be the case here.

I've also tried modifying things so it uses find() to just get the first <a> element, but "attrs" is still a list.

Grabbing what I need by integer works, but I can't rely on the data I need always being at the same spot in the list. I know that I can use findAll to search for specific elements by the actual attribute, but I need to match only the first few words of the string in the name attribute, so I don't think that would work.

EDIT: Here's a snippet of the HTML code I'm trying to parse, via soup.prettify():

Closing Las Vegas NFL Odds From Week 1, 2006
Week One NFL Football Odds
Pro Football Game Odds 9/7 - 9/11, 2006

What I'm looking for is a way to check whether that first <a> tag has a "name" attribute that starts with "Closing NFL Odds", and if it does, return the whole <td> block for additional parsing.
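
The check described here could be sketched roughly as follows, assuming attrs keeps the list-of-pairs shape shown in the output earlier. name_starts_with is a hypothetical helper, not part of BeautifulSoup, and the sample lists are made-up values for illustration rather than data from the real page.

```python
def name_starts_with(attr_pairs, prefix):
    """Return True if the 'name' attribute in attr_pairs begins with prefix.

    attr_pairs is a list of (name, value) tuples, the shape that
    ablock.attrs has in the output shown earlier. Hypothetical helper,
    not part of BeautifulSoup.
    """
    value = dict(attr_pairs).get('name')
    return value is not None and value.startswith(prefix)


# Made-up sample attribute lists, for illustration only.
matching = [(u'name', u'Closing NFL Odds Week 1')]
other = [(u'name', u'EMAIL')]
no_name = [(u'href', u'#top')]

print(name_starts_with(matching, u'Closing NFL Odds'))
print(name_starts_with(other, u'Closing NFL Odds'))
print(name_starts_with(no_name, u'Closing NFL Odds'))
```

Inside the loop from the question, the call would be name_starts_with(ablock.attrs, u'Closing NFL Odds'), and when it returns True the enclosing tdblock is the block to keep for further parsing.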

Further Edit:
I'm using Python 2.7.12, and the non-bs4 BeautifulSoup, in case that's relevant.
