Reputation: 493
I am trying to use Python's BeautifulSoup to pull data from an HTML file. The following HTML is what I'm interested in:
<div class="myself" title="[email protected] [11:07:27 AM]">
<nobr>Name</nobr></div>
I want to extract the title (with the email and time stamp). I am able to access the class with...
find('div', attrs={'class': 'myself'})
From there I am able to print the entire contents of the div, or the info in tags within the div, but I can't figure out how to get the title, because it is an attribute of the div tag itself.
Upvotes: 4
Views: 3685
Reputation: 474151
Attributes can be retrieved in a dictionary-like manner:
A tag may have any number of attributes. You can access a tag’s attributes by treating the tag like a dictionary.
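from bs4 import BeautifulSoup

# `data` is the HTML source as a string; naming the parser avoids a bs4 warning
soup = BeautifulSoup(data, "html.parser")
# title=True matches only divs that actually carry a title attribute
div = soup.find("div", class_="myself", title=True)
print(div["title"])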
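For anyone who wants to try this without the original file, here is a minimal self-contained sketch. The markup is a stand-in for the question's HTML, and the address `user@example.com` is a made-up placeholder (the real one is obscured in the question). It also shows `Tag.get()`, which may be handy when the attribute might be absent.

```python
from bs4 import BeautifulSoup

# Stand-in markup; the e-mail address is a hypothetical placeholder.
html = (
    '<div class="myself" title="user@example.com [11:07:27 AM]">'
    "<nobr>Name</nobr></div>"
)

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="myself")

title = div["title"]     # dictionary-style access; raises KeyError if absent
missing = div.get("id")  # .get() returns None for an attribute that is absent
classes = div["class"]   # class is multi-valued, so bs4 returns a list

print(title)
```

Indexing is fine when the attribute is guaranteed to exist; `.get()` is probably the safer choice when scanning markup you don't control.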
Upvotes: 5
Reputation: 926
You may use this method:
>>> import bs4
>>> html_string = '''<div class="myself" title="[email protected] [11:07:27 AM]">
... <nobr>Name</nobr></div>'''
>>> title_string = bs4.BeautifulSoup(html_string, "html.parser").div.attrs['title']
>>> print(title_string)
[email protected] [11:07:27 AM]
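Note that `.div` only reaches the first div in the document. If the file holds several such divs, something along these lines might work instead. This is a sketch on made-up sample markup; the addresses and the second timestamp are placeholders, not data from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample with three matching divs, one of them without a title.
html = """
<div class="myself" title="a@example.com [11:07:27 AM]"><nobr>Name</nobr></div>
<div class="myself"><nobr>No title here</nobr></div>
<div class="myself" title="b@example.com [11:09:02 AM]"><nobr>Name</nobr></div>
"""

soup = BeautifulSoup(html, "html.parser")

# title=True skips divs lacking the attribute, so indexing below cannot fail.
titles = [d["title"] for d in soup.find_all("div", class_="myself", title=True)]

# .div is shorthand for the first <div> only.
first = soup.div.attrs["title"]

print(titles)
```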
Upvotes: 0