Reputation: 35
I am trying to extract information for all the jobs on this website: https://www.americanmobile.com/travel-nursing-jobs/search/
Looking at the network activity tab, it looks like all the data I need comes from a POST request made here: https://jobs.amnhealthcare.com/api/jobs//search. I have attached an image that may help confirm exactly what I am referencing. example_1
I wrote the following code in Google Colab to try to at least get the first 10 results. Referencing python requests POST with header and parameters, I know a lot of the headers may not even be necessary. I have tried sending this request without any headers at all.
Is what I'm trying to do even possible? I have only gotten a 400 response code so far.
If it is possible to accomplish this, is it possible to extract this information for all 4k + jobs?
import requests
import re
payload = {
"PageNumber": "1",
"JobsPerPage": "10",
"Filters": {
"Locations": "[]",
"Skillset": "[]",
"StartDates": "[]",
"Shifts": "[]",
"Durations": "[]",
"IsCovid19Job": "false",
"Exclusive": "false",
"Skillsets": "[]",
"DivisionCompanyId": "2",
"LocationSearch": "",
"KeywordSearch": "",
"DaxtraJobIds": "[]"
},
"SortOrder": {
"Header": "MaxPayRate",
"SortDirection": "Descending"
}
}
headers = {
"Host": "jobs.amnhealthcare.com",
"Connection": "keep-alive",
"Content-Length": "315",
"Accept": "application/json, text/plain, */*",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36",
"Content-Type": "application/json;charset=UTF-8",
"Origin": "https://www.americanmobile.com",
"Sec-Fetch-Site": "cross-site",
"Sec-Fetch-Mode": "cors",
"Sec-Fetch-Dest": "empty",
"Referer": "https://www.americanmobile.com/",
"Accept-Encoding": "gzip, deflate, br",
"Accept-Language": "en-US,en;q=0.9"
}
response = requests.post('https://jobs.amnhealthcare.com/api/jobs//search' , data=p , headers=headers)
Thank you
Upvotes: 1
Views: 1527
Reputation: 9639
The formatting of your data
wasn't entirely correct. This should work:
import requests
headers = {
"Host": "jobs.amnhealthcare.com",
"Connection": "keep-alive",
"Content-Length": "315",
"Accept": "application/json, text/plain, */*",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36",
"Content-Type": "application/json;charset=UTF-8",
"Origin": "https://www.americanmobile.com",
"Sec-Fetch-Site": "cross-site",
"Sec-Fetch-Mode": "cors",
"Sec-Fetch-Dest": "empty",
"Referer": "https://www.americanmobile.com/",
"Accept-Encoding": "gzip, deflate, br",
"Accept-Language": "en-US,en;q=0.9"
}
data = '{"PageNumber":1,"JobsPerPage":10,"Filters":{"Locations":[],"Skillset":[],"StartDates":[],"Shifts":[],"Durations":[],"IsCovid19Job":false,"Exclusive":false,"Skillsets":[],"DivisionCompanyId":2,"LocationSearch":"","KeywordSearch":"","DaxtraJobIds":[]},"SortOrder":{"Header":"MaxPayRate","SortDirection":"Descending"}}'
response = requests.post('https://jobs.amnhealthcare.com/api/jobs//search', headers=headers, data=data)
You can now adapt JobsPerPage
and PageNumber
to retrieve all the posts you need in a for-loop.
Upvotes: 1