johnlemon9

Reputation: 37

Web Scraping trying to get specific text

I've been trying all day to scrape a piece of text from this website: 'https://bdif.amf-france.org/fr?typesInformation=DD'

I'm using requests and BeautifulSoup but I can't seem to find the correct class/id. My code is below:

import requests
from bs4 import BeautifulSoup

source = requests.get('https://bdif.amf-france.org/fr?typesInformation=DD').text
soup = BeautifulSoup(source, 'lxml')
article = soup.find('results ng-star-inserted')
print(article)

The text I'm trying to find is the name underneath "Déclaration des dirigeants". I always get a "None" result. Let me know if you know how to solve this or if you know what I'm doing wrong.

Upvotes: 0

Views: 159

Answers (1)

chitown88

Reputation: 28565

The page is dynamic, meaning that requests only returns the static HTML, and the results you see in the browser are not in it. You can either a) use something like Selenium, which lets the page render so that you can then parse the rendered HTML, or b) get the data directly from the API.
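As a side note, the find() call in the question would return None even on rendered HTML, because find() treats its first argument as a tag name, not a class. Here is a minimal sketch of both problems. The two HTML strings are made up for illustration (the real markup, and the app-root placeholder in particular, are assumptions), and it uses html.parser so that lxml isn't needed:

```python
from bs4 import BeautifulSoup

# Hypothetical markup, NOT copied from the site: a rendered result next to
# the kind of near-empty shell a JavaScript app serves before it runs.
rendered_html = '<div class="results ng-star-inserted"><span>ABC ARBITRAGE</span></div>'
static_html = "<html><body><app-root></app-root></body></html>"

rendered = BeautifulSoup(rendered_html, "html.parser")
static = BeautifulSoup(static_html, "html.parser")

# find() takes a tag name first, so this searches for an element literally
# named "results ng-star-inserted" and finds nothing.
by_tag_name = rendered.find("results ng-star-inserted")

# Matching on the class works when the element is actually present.
by_class = rendered.find("div", class_="results")

# In the static shell the element does not exist, so even the corrected
# lookup comes back empty.
in_static = static.find("div", class_="results")

print(by_tag_name)
print(by_class.get_text())
print(in_static)
```

So fixing the selector alone isn't enough here; the content has to come from a rendered page or from the API. For option b), the API can be queried directly: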

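import pandas as pd
import requests

# API endpoint; the query string is built from the payload below
url = 'https://bdif.amf-france.org/back/api/v1/informations'
payload = {
    'typesInformation': 'DD',  # same filter as in the page URL
    'from': '0',               # start from the first result
    'size': '10000',           # number of results to request
}

jsonData = requests.get(url, params=payload).json()
hits = jsonData['hits']['hits']

# One row per entry in each hit's _source -> societes list
df = pd.json_normalize(hits, record_path=['_source', 'societes'])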

Output:

print(df)
                  role     raisonSociale       jeton
0     SocieteConcernee     ABC ARBITRAGE  RS00003494
1     SocieteConcernee           ALBIOMA  RS00002125
2     SocieteConcernee           ALBIOMA  RS00002125
3     SocieteConcernee  THERMADOR GROUPE  RS00002078
4     SocieteConcernee          ENVEA SA  RS00004271
               ...               ...         ...
9995  SocieteConcernee        TOTAL S.A.  RS00003321
9996  SocieteConcernee          SEB S.A.  RS00002793
9997  SocieteConcernee     SOLOCAL GROUP  RS00004089
9998  SocieteConcernee         LATECOERE  RS00001460
9999  SocieteConcernee           EDENRED  RS00005100

[10000 rows x 3 columns]
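If it helps to see what json_normalize is doing with record_path, here is a small offline sketch. The sample dict is hand-written to mimic the nesting used above (hits -> hits -> _source -> societes) and reuses a few values from the output; it is not a real API response. I'm also assuming raisonSociale is the name you're after:

```python
import pandas as pd

# Hand-written sample mimicking the nesting indexed above; not real API output.
sample = {
    "hits": {
        "hits": [
            {"_source": {"societes": [
                {"role": "SocieteConcernee", "raisonSociale": "ABC ARBITRAGE", "jeton": "RS00003494"},
            ]}},
            {"_source": {"societes": [
                {"role": "SocieteConcernee", "raisonSociale": "ALBIOMA", "jeton": "RS00002125"},
            ]}},
            {"_source": {"societes": [
                {"role": "SocieteConcernee", "raisonSociale": "ALBIOMA", "jeton": "RS00002125"},
            ]}},
        ]
    }
}

# record_path walks into each hit's _source, then flattens its societes list
# into one row per company entry.
df = pd.json_normalize(sample["hits"]["hits"], record_path=["_source", "societes"])

# Keep just the company names, dropping repeats while preserving order.
names = df["raisonSociale"].drop_duplicates().tolist()
print(names)
```

The same last two lines should work on the real df if you only need the list of names.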

Upvotes: 4
