Scraping AJAX page with requests

Question

I would like to scrape the results of this booking flow.

By looking at the network tab I found that the data is retrieved with an AJAX GET request to this URL:

https://shop.caremar.it/main_acquista_1_corse_00_ajax.asp?l=it&data=24/02/2019&portoP=3&portoA=5&form_url=ticket_s1_2

I've built the URL by passing the parameters as follows:

params = urllib.parse.urlencode({
        'data': '24/02/2019',
        'portoP': '3' ,
        'portoA': '5',
        'form_url': 'ticket_s1_2',
    })

and make the request:

caremar_timetable_url = "https://shop.caremar.it/main_acquista_1_corse_00_ajax.asp?l=it&"
print(f"https://shop.caremar.it/main_acquista_1_corse_00_ajax.asp?l=it&{params}")
headers = {'user-agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.3'}
res = requests.get(caremar_timetable_url,headers=headers, params=params)
soup = BeautifulSoup(res.text,'html.parser')
print(soup.text)

Output

https://shop.caremar.it/main_acquista_1_corse_00_ajax.asp?l=it&data=24%2F02%2F2019&portoP=7&portoA=1&form_url=ticket_s1_2
Non Ã¨ stato possibile procedere con l'acquisto del biglietto online. Si prega di riprovare

The response is an error message from the site saying it can't complete the booking. If I copy and paste the URL I created into the browser, I get an unstyled HTML page with the data I need. Why does this happen, and how can I work around it?

QHarr · Accepted Answer

The data seems to come back fine with a plain requests call on the full URL:

import requests
from bs4 import BeautifulSoup as bs
url = 'https://shop.caremar.it/main_acquista_1_corse_00_ajax.asp?l=it&data=27/02/2019&portoP=1&portoA=4&form_url=ticket_s1_2'

res = requests.get(url)
soup = bs(res.content, 'lxml')
print(soup.select_one('html'))
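If you want to rule out URL construction as the difference, one option is to build the query string explicitly and compare it with the URL shown in the network tab. This is a minimal sketch that uses only the parameter values from the question; the names BASE_URL, query, encoded_url and literal_url are mine, and whether the server really treats %2F and / differently is an assumption I have not verified.

```python
from urllib.parse import urlencode

# Endpoint from the question, without any trailing "?l=it&" fragment.
BASE_URL = "https://shop.caremar.it/main_acquista_1_corse_00_ajax.asp"

# Query values taken from the question, including the "l" parameter.
query = {
    "l": "it",
    "data": "24/02/2019",
    "portoP": "3",
    "portoA": "5",
    "form_url": "ticket_s1_2",
}

# Default encoding: the slashes in the date become %2F.
encoded_url = f"{BASE_URL}?{urlencode(query)}"

# Keeping "/" literal reproduces the URL seen in the network tab.
literal_url = f"{BASE_URL}?{urlencode(query, safe='/')}"

print(encoded_url)
print(literal_url)
```

Either string can be passed to requests.get(url) as in the snippet above, so you can check which form the server accepts. Building the whole query in one place also avoids mixing a base URL that already ends in "?l=it&" with a separate params argument.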
