Reputation: 911
I would like to scrape the results of this booking flow.
By looking at the network tab I've found out that the data is retrieved with an AJIAX GET at this URL:
https://shop.caremar.it/main_acquista_1_corse_00_ajax.asp?l=it&data=24/02/2019&portoP=3&portoA=5&form_url=ticket_s1_2
I've build the URL passing the parameters as follows:
params = urllib.parse.urlencode({
'data': '24/02/2019',
'portoP': '3' ,
'portoA': '5',
'form_url': 'ticket_s1_2',
})
and make the request:
caremar_timetable_url = "https://shop.caremar.it/main_acquista_1_corse_00_ajax.asp?l=it&"
print(f"https://shop.caremar.it/main_acquista_1_corse_00_ajax.asp?l=it&{params}")
headers = {'user-agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.3'}
res = requests.get(caremar_timetable_url,headers=headers, params=params)
soup = BeautifulSoup(res.text,'html.parser')
print(soup.text)
Output
https://shop.caremar.it/main_acquista_1_corse_00_ajax.asp?l=it&data=24%2F02%2F2019&portoP=7&portoA=1&form_url=ticket_s1_2
Non è stato possibile procedere con l'acquisto del biglietto online. Si prega di riprovare
The response is an error message from the site which says it can't complete the booking. If I copy and paste the URL I created in the browser I get an unstyled HTML page with the data I need. Why is this and how can I overcome it?
Upvotes: 0
Views: 145
Reputation: 84465
Data seems to come back with requests
import requests
from bs4 import BeautifulSoup as bs
url = 'https://shop.caremar.it/main_acquista_1_corse_00_ajax.asp?l=it&data=27/02/2019&portoP=1&portoA=4&form_url=ticket_s1_2'
res = requests.get(url)
soup = bs(res.content, 'lxml')
print(soup.select_one('html'))
Upvotes: 1