Mattia Rossi

Reputation: 175

BeautifulSoup 4 parsing attribute error

I am trying to parse an HTML document, but bs4 fails to parse the attributes of a specific tag:

<select class="inputNormal" id="TipoImmobileDaNonImportare" name="TipoImmobileDaNonImportare" style="width:100%">
            <option value=""></option>
            <option value="unità immobiliare urbana">unità immobiliare urbana</option>            
            <option value="particella terreni">particella terreni</option>
</select>

When I print the attributes, I get this error:

AttributeError: 'tuple' object has no attribute 'items'

The tag and attributes I print are: `select: (u'style', u'class', u'name')`
instead of (for example): `input: {u'type': u'hidden', u'name': u'Immobile_Note', u'value': u'Ubicazione occupazione', u'id': u'Immobile_Note'}`

UPDATE: if I try soup.find_all(attrs={'id': 'somevalue'}) it fails, because it tries to access the attributes of every tag in the tree!

If I try:

from bs4 import BeautifulSoup

s = BeautifulSoup("""<select class="inputNormal" id="TipoImmobileDaNonImportare" name="TipoImmobileDaNonImportare" style="width:100%">
<option value=""></option>
<option value="unità immobiliare urbana">unità immobiliare urbana</option>
<option value="particella terreni">particella terreni</option>
</select>""")

The parser detects it correctly:

select: {'id': 'TipoImmobileDaNonImportare', 'style': 'width:100%', 'class': ['inputNormal'], 'name': 'TipoImmobileDaNonImportare'}

I tried parsing it with both the lxml parser and the html5lib parser, but the result is the same.

Thanks for any replies.

EDIT: thanks to Amanda, but the error was in my own code: I was storing a tuple object in tag.attrs, because this code is being ported from bs3 to bs4! Thanks.
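
For anyone hitting the same error during a bs3-to-bs4 port: as far as I understand it, bs3 exposed tag.attrs as a list of (name, value) tuples, while bs4 expects a dict, so anything that calls .items() on it breaks once a tuple or list has been assigned. Below is a minimal sketch of a conversion step, not the actual fix used in my code. The helper name normalize_attrs is hypothetical and is not part of bs4; it only assumes the old code produces (name, value) pairs.

```python
def normalize_attrs(attrs):
    """Return attrs as a dict, accepting a dict or (name, value) pairs.

    Hypothetical helper for illustration; it is not part of bs4.
    """
    if isinstance(attrs, dict):
        # Already in the shape bs4 expects; return a shallow copy.
        return dict(attrs)
    pairs = list(attrs)
    for pair in pairs:
        # A bare name such as u'style' carries no value, so reject it
        # instead of silently building a wrong dict.
        if not (isinstance(pair, (tuple, list)) and len(pair) == 2):
            raise TypeError("expected (name, value) pairs, got %r" % (pair,))
    return dict(pairs)


# A bs3-style list of pairs becomes a dict that is safe to assign to tag.attrs.
old_style = [("id", "Immobile_Note"), ("type", "hidden")]
print(normalize_attrs(old_style))
```

With something like this, tag.attrs = normalize_attrs(old_attrs) keeps attrs a dict, and a tuple of bare names like the one in the error above raises a TypeError at the point of assignment rather than later inside find_all.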

Upvotes: 1

Views: 1343

Answers (1)

Amanda

Reputation: 12737

I'm not entirely sure what you're trying to access with Beautiful Soup here, but if you want to get at the attributes for the select or the options, you can do something like:

from bs4 import BeautifulSoup

html = """<select class="inputNormal" id="TipoImmobileDaNonImportare" name="TipoImmobileDaNonImportare" style="width:100%">
        <option value=""></option>
        <option value="unità immobiliare urbana">unità immobiliare urbana</option>
        <option value="particella terreni">particella terreni</option></select>"""

# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(html, "html.parser")

You can show the attributes of the first "select" with:

print(soup.find('select').attrs)

Or show the attributes of all the options with:

for option in soup.find_all('option'):
    print(option.attrs)

Or, if you're looking for the names of available items, use:

for option in soup.find_all('option'):
    print(option.text)

Or, if you want the option value rather than the displayed text, use:

for option in soup.find_all('option'):
    print(option['value'])

If that doesn't help, maybe you could give an example of the output you're expecting.

Upvotes: 1
