Reputation: 3196
html result is <div class="font-160 line-110" data-container=".snippet container" data-html="true" data-placement="top" data-template='<div class="tooltip infowin-tooltip" role="tooltip"><div class="tooltip-arrow"><div class="tooltip-arrow-inner"></div></div><div class="tooltip-inner" style="text-align: left"></div></div>' data-toggle="tooltip" title="XIAMEN [CN]">
How do I pull out "XIAMEN [CN]", which comes right after title? I tried find_all('title'), but that does not return a match. Nor can I call any form of siblings to traverse my way down the result. I couldn't even get find(text='XIAMEN [CN]') to return anything.
Upvotes: 1
Views: 1663
Reputation: 180411
You should use the class or some other attribute to select the div; calling find("div") would select the first div on the page. Also, title is an attribute, not a tag, so you need to access the title attribute once you have located the tag. A few examples of how to be specific and extract the attribute:
from bs4 import BeautifulSoup

html = """<div class="font-160 line-110" data-container=".snippet container" data-html="true" data-placement="top" data-template='<div class="tooltip infowin-tooltip" role="tooltip"><div class="tooltip-arrow"><div class="tooltip-arrow-inner"></div></div><div class="tooltip-inner" style="text-align: left"></div></div>' data-toggle="tooltip" title="XIAMEN [CN]">"""
soup = BeautifulSoup(html, "html.parser")
# use the css classes
print(soup.find("div", class_="font-160 line-110")["title"])
# use an attribute value
print(soup.find("div", {"data-container": ".snippet container"})["title"])
If only one div on the page has a title attribute, you can look for it by passing title=True:
print(soup.find("div", title=True)["title"])
You can also combine the steps, for example by filtering on the class and one or more attributes at the same time.
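As a rough sketch of combining filters, the snippet below matches on a CSS class and a data attribute together, first with find and then with an equivalent CSS selector. The markup is a trimmed-down stand-in for the HTML in the question (an assumption made for brevity), and the extra "other" div is hypothetical, added only to show that the filters skip it.

```python
from bs4 import BeautifulSoup

# Simplified stand-in for the question's markup (assumption: trimmed for brevity).
# The first div is hypothetical and exists only to show that it is not matched.
html = (
    '<div class="other" title="IGNORED"></div>'
    '<div class="font-160 line-110" data-toggle="tooltip" title="XIAMEN [CN]"></div>'
)
soup = BeautifulSoup(html, "html.parser")

# Filter on one CSS class plus a data-* attribute passed through attrs.
tag = soup.find("div", class_="font-160", attrs={"data-toggle": "tooltip"})
title_from_find = tag["title"]

# Equivalent CSS selector: both classes and the attribute must match.
tag = soup.select_one('div.font-160.line-110[data-toggle="tooltip"]')
title_from_select = tag["title"]

print(title_from_find, title_from_select)
```

The selector form does not depend on the order of the class names in the attribute, which can make it a bit more forgiving than matching the full class string.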
Upvotes: 1
Reputation: 46
A slightly safer approach than the other answer:
from bs4 import BeautifulSoup
myHTML = 'what you posted above'
soup = BeautifulSoup(myHTML, "html5lib")
div = soup.find('div')
title = div.get('title', '')  # returns '' instead of raising KeyError in case the div has no title attribute
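Note that find itself returns None when no div exists at all, so div.get would still fail in that case. A minimal sketch that guards against both situations follows; the helper name get_div_title is hypothetical, and it uses the built-in "html.parser" instead of html5lib only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup


def get_div_title(markup, default=""):
    """Return the title attribute of the first div, or default if unavailable.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    soup = BeautifulSoup(markup, "html.parser")
    div = soup.find("div")
    if div is None:
        # No div in the document at all.
        return default
    # .get falls back to default when the title attribute is missing.
    return div.get("title", default)


print(get_div_title('<div title="XIAMEN [CN]"></div>'))
```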
Upvotes: 0
Reputation: 3662
from bs4 import BeautifulSoup
myHTML = 'what you posted above'
soup = BeautifulSoup(myHTML, "html5lib")
title = soup.find('div')['title']
We're just searching for <div> tags here; in practice you'll probably want to be more specific.
Upvotes: 0