Reputation: 13
I am struggling to reach a nested table in the page at this URL:
view-source:http://taxweb.co.guilford.nc.us/CamaPublicAccess/PropertySummary.aspx?REID=0180721
Specifically, I want the data stored under "Owner's Mailing Address", where the new table starts on line 370 of the page source.
owner_fields = soup.find(text="Owner's Mailing Address").find('table'),
owner_address = owner_fields.find('td').get_text(),
owner_city = owner_fields.find('td')[2].get_text(),
owner_state_zip = owner_fields.find('td')[3].get_text(),
Am I way off here?
Upvotes: 0
Views: 355
Reputation: 2145
soup.findAll(attrs={"id":"ctl00_ContentPlaceHolder1_table3"})[0] locates and returns the table.
The additional .findAll('b') call locates the <b> elements that hold the address values.
The map() call goes over those <b> elements and returns the contents list of each one (unicode strings, plus any child tags).
address_contents = map(lambda value: value.contents, soup.findAll(attrs={"id":"ctl00_ContentPlaceHolder1_table3"})[0].findAll('b'))
In [56]: address_contents
Out[56]:
[[u'101 OAKHURST AVE'],
[u' '],
[u'HIGH POINT'],
[u'\n', <span id="ctl00_ContentPlaceHolder1_DetailsView4_Label1"></span>],
[u'NC'],
[u'27262']]
I will leave the assignment of the list values up to you.
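For what it's worth, that last step could look roughly like the sketch below. It assumes Python 3, where BeautifulSoup text nodes are str subclasses, and it assumes the page always yields the same four non-empty fields in this order. The helper name clean_address_fields and the split into owner_state and owner_zip are my own choices, not something the page or BeautifulSoup defines, and the sample list only mimics the output above, with object() standing in for the empty <span> tag.

```python
def clean_address_fields(address_contents):
    """Flatten per-<b> contents lists into non-empty, stripped strings."""
    fields = []
    for contents in address_contents:
        # Keep only text nodes; child tags such as the empty <span> are skipped.
        text = "".join(item for item in contents if isinstance(item, str)).strip()
        if text:
            fields.append(text)
    return fields


# Sample data mirroring the output above (hypothetical stand-in for real
# BeautifulSoup results); object() plays the role of the <span> Tag.
sample = [
    ["101 OAKHURST AVE"],
    [" "],
    ["HIGH POINT"],
    ["\n", object()],
    ["NC"],
    ["27262"],
]

# Unpacking raises ValueError if the page ever yields a different field count.
owner_address, owner_city, owner_state, owner_zip = clean_address_fields(sample)
```

With the real page you would pass address_contents in place of sample. If some records have a second address line, the four-way unpacking will fail, so check the length first.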
Upvotes: 1