Reputation: 167
I am trying to extract the address from this site, and the HTML looks like this:
<div class="col-xs-12 col-sm-6 col-address">
<div>ul. Małachowskiego 45<br />42-500 Będzin<br />woj. śląskie</div>
</div>
So far I use:
from bs4 import BeautifulSoup
soup = BeautifulSoup(firma, "lxml")
address = soup.find("div", class_="col-address")
if address:
    address_firmy = address.text
And I get: "ul. Małachowskiego 4542-500 Będzinwoj. śląskie"
So now two questions:
It is probably very simple but I am totally new to programming and Python... ;)
Upvotes: 0
Views: 33
Reputation: 12158
In [56]: soup.div.get_text(separator=',', strip=True)
Out[56]: 'ul. Małachowskiego 45,42-500 Będzin,woj. śląskie'
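You can specify a string to be used to join the bits of text together using the separator argument.
You can tell Beautiful Soup to strip whitespace from the beginning and end of each bit of text using strip=True.
Here soup.div is simply the first <div> in the document; the same get_text call works on the element returned by soup.find("div", class_="col-address").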
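As a minimal sketch of how this fits the code from the question, the snippet below applies get_text to the element found by class. The helper names extract_address and extract_address_parts are hypothetical, the HTML is copied from the question, and the built-in "html.parser" is used here only so the example does not depend on lxml being installed.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
HTML = """<div class="col-xs-12 col-sm-6 col-address">
<div>ul. Małachowskiego 45<br />42-500 Będzin<br />woj. śląskie</div>
</div>"""


def extract_address(html, separator=", "):
    """Return the address as one string, or None if the div is missing.

    Hypothetical helper: joins the text pieces with `separator`.
    """
    soup = BeautifulSoup(html, "html.parser")
    address = soup.find("div", class_="col-address")
    if address is None:
        return None
    # strip=True trims each piece and drops pieces that are only whitespace.
    return address.get_text(separator=separator, strip=True)


def extract_address_parts(html):
    """Return the address lines as a list (empty if the div is missing)."""
    soup = BeautifulSoup(html, "html.parser")
    address = soup.find("div", class_="col-address")
    if address is None:
        return []
    # stripped_strings yields each non-empty text piece, already stripped.
    return list(address.stripped_strings)


if __name__ == "__main__":
    print(extract_address(HTML))
    print(extract_address_parts(HTML))
```

If you need the street, postal code, and region separately, the list form is probably easier to work with than splitting the joined string again.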
Upvotes: 1