Proteeti Prova

Reputation: 1169

BeautifulSoup: How to get the publish datetime of a YouTube video in datetime format?

In one part of my crawler, I need to scrape the publish date and time of a YouTube video as a datetime. I am using bs4, and so far I can only get the publish date the way the YouTube GUI shows it, e.g. "Published on May 6, 2017". I cannot retrieve the actual datetime. How can I do this?

My code:

    video_obj["date_published"] = video_soup.find("strong", attrs={"class": "watch-time-text"}).text
    return video_obj["date_published"] 

The output:

Published on Feb 8, 2020

The way I want:

YYYY-MM-DD HH:MM:SS

Upvotes: 0

Views: 1103

Answers (2)

Ben

Reputation: 351

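You could use Python's datetime module to parse the string and format the output.

    import datetime

    pubstring = video_obj["date_published"]  # "Published on Feb 8, 2020"
    # pubstring[13:] drops the first 13 characters ("Published on ")
    dt = datetime.datetime.strptime(pubstring[13:], "%b %d, %Y")
    # "%F" is not supported on every platform, so the format is spelled out
    return dt.strftime("%Y-%m-%d %H:%M:%S")  # adjust the format as needed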
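For reference, the same idea as a self-contained function. This is a minimal sketch, assuming the scraped text always looks like "Published on Feb 8, 2020"; the names `PREFIX` and `parse_published` are made up for illustration. Because the scraped string has no time of day, the time part comes out as 00:00:00.

```python
from datetime import datetime

# Assumed label text, taken from the output shown in the question.
PREFIX = "Published on "


def parse_published(text: str) -> datetime:
    """Parse a string such as 'Published on Feb 8, 2020' into a datetime."""
    text = text.strip()
    # Remove the label only when it is really there, instead of slicing blindly.
    if text.startswith(PREFIX):
        text = text[len(PREFIX):]
    # %b matches abbreviated month names in the current locale (English here).
    return datetime.strptime(text, "%b %d, %Y")


dt = parse_published("Published on Feb 8, 2020")
print(dt.strftime("%Y-%m-%d %H:%M:%S"))  # 2020-02-08 00:00:00
```

If the label ever changes (for example "Premiered on ..."), `strptime` raises a `ValueError`, which is usually easier to notice than a silently wrong date.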

Upvotes: 1

Chinmay Atrawalkar

Reputation: 980

Once you get:

Published on Feb 8, 2020

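You can do the following to remove "Published on". Note that str.strip("Published on") treats its argument as a set of characters to strip from both ends rather than as a prefix, so replace is used here:

    date_string = soup_string.replace("Published on", "", 1).strip()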

To get this in the format YYYY-MM-DD HH:MM:SS, you can use the python-dateutil library. You can install it using:

    pip install python-dateutil

Code:

    from dateutil import parser
    formatted_date = parser.parse("Published on Feb 8, 2020", fuzzy=True)

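parser.parse returns a datetime object. Printing it shows the date as YYYY-MM-DD HH:MM:SS (here 2020-02-08 00:00:00, because the scraped string carries no time of day).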

You can read more about the parser in the python-dateutil documentation.
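Putting the steps together, here is a short sketch that parses the scraped text and formats the result explicitly. It assumes python-dateutil is installed; the variable names and the `default` value are illustrative choices. With `fuzzy=True` the parser skips tokens it does not recognise, such as "Published", so stray numbers elsewhere in the text could be misread as part of the date.

```python
from datetime import datetime

from dateutil import parser

raw = "Published on Feb 8, 2020"

# Fields missing from the text (here the time of day) are copied from `default`.
dt = parser.parse(raw, fuzzy=True, default=datetime(1900, 1, 1))

# Format explicitly rather than relying on str(dt).
formatted = dt.strftime("%Y-%m-%d %H:%M:%S")
print(formatted)  # 2020-02-08 00:00:00
```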

Upvotes: 1
