elksie5000

Reputation: 7752

How to extract datetime from chunk of HTML

I have a piece of HTML that includes a datetime, like this:

<time datetime="2023-01-06 05:00:00" data-format="article-display" data-show-date="always" data-show-time="today-only" data-timestamp="1672981200" itemprop="datePublished" class="author-details__timestamp formatTimeStampEs6" full-date="05.01.2023">6th January</time>

I used the copy option in Chrome's inspector, which returned this selector:

#article > div.mar-article > div > div.mar-article__timestamp > time

def extract_time(data):
    """Extract the time from the HTML of the article page."""
    soup = BeautifulSoup(data, 'html.parser')
    # Use the select_one() method to find the time element
    time_element = soup.find("time", class_="datetime")
    print(time_element)
    return time_element

Why does it return None?

I'm confused as I don't know how to return just the datetime.

Upvotes: 1

Views: 30

Answers (1)

HedgeHog

Reputation: 25048

The element does not have a class called datetime, which is why find("time", class_="datetime") returns None. You can instead select it by its datetime attribute (provided that the corresponding element is actually present in the soup):

soup.select_one('time[datetime]').get('datetime')

Example

from bs4 import BeautifulSoup

# Pass the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup('<time datetime="2023-01-06 05:00:00" data-format="article-display" data-show-date="always" data-show-time="today-only" data-timestamp="1672981200" itemprop="datePublished" class="author-details__timestamp formatTimeStampEs6" full-date="05.01.2023">6th January</time>', 'html.parser')

# Select the <time> element by its datetime attribute and read its value
print(soup.select_one('time[datetime]').get('datetime'))

Output

2023-01-06 05:00:00
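
Applied to the function from the question, this could look roughly like the sketch below. It assumes the attribute always uses the "YYYY-MM-DD HH:MM:SS" layout shown above, and it returns None when no matching element is found instead of raising. The sample HTML is a shortened, illustrative version of the element in the question.

```python
from datetime import datetime

from bs4 import BeautifulSoup


def extract_time(data):
    """Return the article's datetime attribute as a datetime, or None if absent."""
    soup = BeautifulSoup(data, "html.parser")
    # Match any <time> element that carries a datetime attribute
    time_element = soup.select_one("time[datetime]")
    if time_element is None:
        return None
    # Assumption: the value follows the "YYYY-MM-DD HH:MM:SS" layout from the question
    return datetime.strptime(time_element["datetime"], "%Y-%m-%d %H:%M:%S")


# Shortened sample markup, for illustration only
html = '<time datetime="2023-01-06 05:00:00" itemprop="datePublished">6th January</time>'
print(extract_time(html))               # 2023-01-06 05:00:00
print(extract_time("<p>no time</p>"))   # None
```

If only the raw string is needed, returning time_element["datetime"] directly avoids the parsing step and the format assumption.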

Upvotes: 1
