Reputation: 2041
I want to extract the text "12:25 AM - 30 Mar 2015" with Beautiful Soup from the HTML below. This is how the HTML looks after being processed by Beautiful Soup:
<span class="u-floatLeft"> · </span>
<span class="u-floatLeft">
<a class="ProfileTweet-timestamp js-permalink js-nav js-tooltip" href="/TBantl/status/582333634931126272" title="5:08 PM - 29 Mar 2015">
<span class="js-short-timestamp " data-aria-label-part="last" data-long-form="true" data-time="1427674132">
Mar 29
</span>
I have this code, but it doesn't work:
date = soup.find("a",attrs={"class":"ProfileTweet-timestamp js-permalink js-nav js-tooltip"})["title"]
Upvotes: 1
Views: 1098
Reputation: 102922
This works for me:
from bs4 import BeautifulSoup
html = """<span class="u-floatLeft"> · </span>
<span class="u-floatLeft">
<a class="ProfileTweet-timestamp js-permalink js-nav js-tooltip" href="/indoz1/status/582443448927543296" title="12:25 AM - 30 Mar 2015">
<span class="js-short-timestamp " data-aria-label-part="last" data-time="1427700314" data-long-form="true">
"""
soup = BeautifulSoup(html, "html.parser")  # name the parser explicitly to avoid the "no parser specified" warning
date = soup.find("a", attrs={"class": "ProfileTweet-timestamp js-permalink js-nav js-tooltip"})["title"]
>>> print(date)
12:25 AM - 30 Mar 2015
Without more information, I suspect that you didn't transform your HTML snippet into a BeautifulSoup object. In that case, you'd get a TypeError: find() takes no keyword arguments.
Or, as alexce points out in the comments above, the element you are looking for may not actually be present in the HTML you are parsing. In that case, find() returns None, and indexing it with ["title"] raises TypeError: 'NoneType' object is not subscriptable.
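A minimal sketch of guarding against a missing element, assuming the same markup as above. The helper name extract_title is hypothetical, and matching on the single class ProfileTweet-timestamp is my own choice because it is less brittle than matching the full class string:

```python
from bs4 import BeautifulSoup

html = """<span class="u-floatLeft">
<a class="ProfileTweet-timestamp js-permalink js-nav js-tooltip" href="/indoz1/status/582443448927543296" title="12:25 AM - 30 Mar 2015">
<span class="js-short-timestamp " data-aria-label-part="last" data-time="1427700314" data-long-form="true">
"""


def extract_title(markup):
    """Return the timestamp link's title attribute, or None if the link is absent."""
    soup = BeautifulSoup(markup, "html.parser")
    # class_ with a single name matches any tag that carries that class,
    # regardless of which other classes are listed alongside it.
    link = soup.find("a", class_="ProfileTweet-timestamp")
    if link is None:
        # find() found nothing; indexing None would raise a TypeError.
        return None
    # .get() returns None instead of raising KeyError if title is missing.
    return link.get("title")


found = extract_title(html)
missing = extract_title("<p>no tweets here</p>")
print(found)
print(missing)
```

If the second call is the case you are hitting, print the HTML you actually downloaded and check whether the link is in it at all.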
Finally, completely unrelated to the issues you're having above: if you're then going to parse date into a datetime object, there's an easier way to do it. Just grab the "data-time" attribute from <span class="js-short-timestamp " ... > and parse it using datetime.datetime.fromtimestamp:
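from datetime import datetime as dt
# get the "data-time" attribute value as a string named timestamp
timestamp = soup.find("span", class_="js-short-timestamp")["data-time"]
# fromtimestamp() without a tz argument converts to the machine's local time zone
data_time = dt.fromtimestamp(int(timestamp))
>>> print(data_time)
2015-03-30 03:25:14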
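Because fromtimestamp without a tz argument uses the local time zone of the machine running it, the printed value will differ from one machine to another. Here is a small sketch of a time-zone-aware conversion using only the standard library. The fixed UTC-7 offset is an assumption on my part, picked because it reproduces the title string from the example; it is not something the page states:

```python
from datetime import datetime, timedelta, timezone

# Value of the data-time attribute from the example markup.
timestamp = "1427700314"

# Interpret the epoch seconds as an aware datetime in UTC.
utc_time = datetime.fromtimestamp(int(timestamp), tz=timezone.utc)
print(utc_time.isoformat())  # 2015-03-30T07:25:14+00:00

# Assumption: the title attribute was rendered at a fixed UTC-7 offset.
assumed_zone = timezone(timedelta(hours=-7))
local_time = utc_time.astimezone(assumed_zone)

# Format the result the same way the title attribute is written.
formatted = local_time.strftime("%I:%M %p - %d %b %Y")
print(formatted)  # 12:25 AM - 30 Mar 2015
```

Passing tz explicitly keeps the result the same on every machine, which is usually what you want when storing scraped timestamps.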
Upvotes: 1