Reputation: 12679
I am collecting survey data from Jotform. My data includes audio recordings, and the URL for an audio file in the form is
'https://www.jotform.com/widget-uploads/voiceRecorder/201374133/981221_121.wav'
If I try to download this using Python, it gives an error, because the file can only be downloaded by a user who is logged in to a Jotform account.
Logging in is easy in a browser, but I am working on Google Cloud and trying to access this file from the terminal.
I checked their official API, but that repo was last updated 6 years ago.
I am trying to access the file using requests. I tried this:
import requests
s = requests.Session()
s.post('https://www.jotform.com/login/', data={'username': 'dummy_username', 'password': 'dummy_password'})
s.get( 'https://www.jotform.com/widget-uploads/voiceRecorder/201374133/981221_121.wav')
But it returns a <Response [404]> error.
I inspected the username and password fields:
Am I using the correct field names for username and password?
I also tried mechanize, but it gives the same error:
import mechanize
import http.cookiejar as cookielib
browser = mechanize.Browser()
cookiejar = cookielib.LWPCookieJar()
browser.set_cookiejar( cookiejar )
browser.open('https://www.jotform.com/login/')
browser.select_form(nr = 0)
browser.form['username'] = 'dummy_username'
browser.form['password'] = 'dummy_password'
result = browser.submit()
browser.retrieve('https://www.jotform.com/widget-uploads/voiceRecorder/201374133/981221_121.wav')
How can I download audio files using the requests module?
Reputation: 4861
Actually, you aren't using the form data properly. The name attribute of an input element is used to identify it. In your case, these are loPassword and loUsername. So what you would want to do is this:
import requests
sess = requests.Session()
# The keys must match the name attributes of the login form's input elements
payload = {
    'loPassword': 'dummy_password',
    'loUsername': 'dummy_username',
}
op = sess.post('https://www.jotform.com/login/', data=payload)
op.status_code
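If you are unsure which names a form expects, you can list them from the page's HTML instead of reading them off by hand. The following is only a sketch using the standard library's html.parser; the sample markup is made up for illustration and the real login page will look different. InputNameCollector and list_form_fields are hypothetical helper names, not part of any library.

```python
from html.parser import HTMLParser


class InputNameCollector(HTMLParser):
    """Collect the name and value attributes of every <input> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            # Hidden inputs (such as a CSRF token) usually carry a preset
            # value; text and password inputs are usually empty.
            self.fields[name] = attributes.get("value") or ""


def list_form_fields(html):
    """Return a dict mapping each input name found in html to its value."""
    parser = InputNameCollector()
    parser.feed(html)
    return parser.fields


# Made-up markup for illustration only; it is not the real login page.
sample_html = """
<form method="post">
  <input type="text" name="loUsername">
  <input type="password" name="loPassword">
  <input type="hidden" name="csrf-token" value="abc123">
</form>
"""

fields = list_form_fields(sample_html)
print(sorted(fields))
```

In practice you would pass the text of the fetched login page to list_form_fields and use the resulting keys in your payload.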
EDIT: I am also seeing a CSRF token on the website. You'll have to scrape the login page first for the csrf-token value, and then include it in your payload.
from bs4 import BeautifulSoup
import requests
# Use one session for both requests, so the cookies set by the login page
# are sent back together with the token in the login POST
sess = requests.Session()
page = sess.get('https://www.jotform.com/login/')
soup = BeautifulSoup(page.text, 'lxml')
csrf = soup.find('input', {'name': 'csrf-token'})['value']
# Now create the payload with this CSRF token
payload = {
    'csrf-token': csrf,
    'loUsername': 'dummy_username',
    'loPassword': 'dummy_password',
}
op = sess.post('https://www.jotform.com/login/', data=payload)
op.status_code
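Once the login succeeds, the same session can fetch the audio file. Below is a minimal sketch of that step, assuming sess is the logged-in requests.Session from the code above; download_file is a hypothetical helper name, and I have not verified that Jotform accepts this login flow. It streams the response to disk so a large recording is not held in memory.

```python
def download_file(session, url, dest_path, chunk_size=8192):
    """Stream url to dest_path using an already authenticated session.

    session is expected to behave like requests.Session: session.get()
    must return a response with raise_for_status() and iter_content().
    Returns the number of bytes written.
    """
    response = session.get(url, stream=True)
    # Raise an exception on 4xx/5xx instead of saving an error page as audio.
    response.raise_for_status()
    bytes_written = 0
    with open(dest_path, "wb") as out_file:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                out_file.write(chunk)
                bytes_written += len(chunk)
    return bytes_written


# Usage, assuming `sess` is the logged-in session:
# download_file(
#     sess,
#     'https://www.jotform.com/widget-uploads/voiceRecorder/201374133/981221_121.wav',
#     '981221_121.wav',
# )
```

If the call still raises an HTTP error for a 404, the session is most likely not authenticated, so check the login response before retrying the download.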