Muhumuza

Reputation: 49

How do I fix "ValueError: time data '\nJuly 4, 2022\n' does not match format '%B %d, %Y'"?

While scraping a site for data, I got this error. Some of the dates have a single-digit day (Month d, yyyy) while others have a two-digit day (Month dd, yyyy). I've read the documentation and tried different solutions on Stack Overflow, but nothing seems to work.

import requests
from bs4 import BeautifulSoup
from datetime import datetime

def jobScan(link):
     
    the_job = {}

    jobUrl = link['href']
    the_job['urlLink'] = jobUrl
   
    jobs = requests.get(jobUrl, headers = headers )
    jobC = jobs.content
    jobSoup = BeautifulSoup(jobC, "lxml")

    table = soup.find_all("a", attrs = {"class": "job-details-link"})

    postDate = jobSoup.find_all("span", {"class": "job-date__posted"})[0]
    postDate = postDate.text
    date_posted = datetime.strptime(postDate, '%B %d, %Y')
    the_job['date_posted'] = date_posted

    closeDate = jobSoup.find_all("span", {"class": "job-date__closing"})[0]
    closeDate = closeDate.text
    closing_date = datetime.strptime(closeDate, '%B %d, %Y')
    the_job['closing_date'] = closing_date
    
    return the_job

However, I get this error:

ValueError: time data '\nJuly 4, 2022\n' does not match format '%B %d, %Y'

When I try the other format, '%B %-d, %Y', I get this:

ValueError: '-' is a bad directive in format '%B %-d, %Y'

What am I doing wrong?

Upvotes: 0

Views: 585

Answers (1)

user19077881

Reputation: 5429

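The scraped text has a newline on each side of the date ('\nJuly 4, 2022\n'), and the format string does not account for them. Remove the newlines before parsing: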

date_posted = datetime.strptime(postDate.replace('\n',''), '%B %d, %Y')
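The same problem would affect closeDate, so it may be simpler to put the cleanup in one place. Below is a minimal sketch, assuming the only issue is whitespace around the date text; parse_job_date and DATE_FORMAT are hypothetical names, not part of the original code. It uses str.strip(), which removes leading and trailing whitespace, including newlines.

```python
from datetime import datetime

# Same format string as in the question.
DATE_FORMAT = "%B %d, %Y"


def parse_job_date(raw_text):
    """Hypothetical helper: trim surrounding whitespace, then parse the date."""
    # str.strip() drops leading/trailing whitespace such as '\n'.
    return datetime.strptime(raw_text.strip(), DATE_FORMAT)


# Both single-digit and two-digit days parse with the same format.
print(parse_job_date("\nJuly 4, 2022\n"))
print(parse_job_date("\nJuly 14, 2022\n"))
```

There should be no need for '%-d' here. It is a platform-specific extension for formatting output with strftime and is not accepted by strptime, which is why it raises the "bad directive" error. When parsing, '%d' already matches both "4" and "04".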

Upvotes: 1
