ahmedaao

Reputation: 397

How to extract an element from a JSON script with BeautifulSoup

I want to extract the value of the startDate key from the JSON inside the script tags.

Here is my code:

# Import libraries
import json
import requests
from bs4 import BeautifulSoup

# Request the website and download the HTML contents
url = 'https://www.coteur.com/cotes-foot.php'

#page = requests.get(url)
response = requests.get(url)

#soup = BeautifulSoup(page.text, 'html.parser')
soup = BeautifulSoup(response.text, 'html.parser')

s = soup.find("table", id="mediaTable").find_all('script', type='application/ld+json')
print(s)

Upvotes: 1

Views: 159

Answers (1)

baduker

Reputation: 20052

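Each script tag holds a JSON document, so you can take the text between the tags, parse it with json.loads, and read the startDate key. Try this: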

import json
import re

import requests
from bs4 import BeautifulSoup

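# Download the page and parse the HTML
soup = BeautifulSoup(requests.get('https://www.coteur.com/cotes-foot.php').text, 'html.parser')
# Collect every JSON-LD script tag inside the table
s = soup.find("table", id="mediaTable").find_all('script', type='application/ld+json')
# Grab the text between the tags with a regex, parse it as JSON, and read startDate
print([json.loads(re.search(r'>(.+)<', str(j), re.S).group(1))["startDate"] for j in s])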

Output:

['2021-01-28T12:00', '2021-01-28T13:00', '2021-01-28T15:30', '2021-01-28T16:00', '2021-01-28T16:15', '2021-01-28T18:00', '2021-01-28T18:00', '2021-01-28T18:45', '2021-01-28T18:45', '2021-01-28T19:00', '2021-01-28T19:00', '2021-01-28T19:15', '2021-01-28T20:30', '2021-01-28T20:30', '2021-01-28T21:00', '2021-01-28T21:00', '2021-01-28T21:00', '2021-01-28T21:00', '2021-01-28T21:00', '2021-01-28T22:15', '2021-01-28T23:00', '2021-01-29T00:00', '2021-01-29T00:00', '2021-01-29T04:00', '2021-01-29T09:05']
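As a possible alternative to the regex, the JSON text can usually be read straight from each tag with its .string attribute. The sketch below is only an illustration under a few assumptions: SAMPLE_HTML is made-up markup standing in for the real page (the live site may look different), extract_start_dates is a hypothetical helper name, and each script tag is assumed to contain a single JSON object. It also skips tags that are empty or do not parse, rather than raising.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page (an assumption,
# not copied from the site).
SAMPLE_HTML = """
<table id="mediaTable">
  <tr><td>
    <script type="application/ld+json">
    {"@type": "SportsEvent", "name": "Team A - Team B", "startDate": "2021-01-28T12:00"}
    </script>
  </td></tr>
  <tr><td>
    <script type="application/ld+json">
    {"@type": "SportsEvent", "name": "Team C - Team D", "startDate": "2021-01-28T13:00"}
    </script>
  </td></tr>
</table>
"""


def extract_start_dates(html):
    """Return the startDate value of every JSON-LD script inside #mediaTable."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="mediaTable")
    if table is None:
        # The table is missing, so there is nothing to extract.
        return []

    dates = []
    for tag in table.find_all("script", type="application/ld+json"):
        raw = tag.string  # text inside the script tag, or None if empty
        if not raw:
            continue
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            # Skip scripts whose content is not valid JSON.
            continue
        if isinstance(data, dict) and "startDate" in data:
            dates.append(data["startDate"])
    return dates


print(extract_start_dates(SAMPLE_HTML))
```

To run it against the live page, pass requests.get(url).text instead of SAMPLE_HTML. Reading .string avoids depending on how the tag is rendered by str(), which is what the regex version relies on.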

Upvotes: 1
