Aaditya Ura

Reputation: 12679

How to download data from Jotform using python?

I am collecting survey data with Jotform. The data includes audio recordings, and the URL for an audio file in the form looks like this:

'https://www.jotform.com/widget-uploads/voiceRecorder/201374133/981221_121.wav'

If I try to download this using Python, it fails because the file can only be downloaded by a user who is logged in to the Jotform account.

Logging in is easy in a browser, but I am working on Google Cloud and trying to access this file from the terminal.

I checked their official API, but that repo was last updated 6 years ago.

I am trying to access the file using requests. I tried this:

import requests

s = requests.Session()
s.post('https://www.jotform.com/login/', data={'username': 'dummy_username', 'password': 'dummy_password'})

s.get( 'https://www.jotform.com/widget-uploads/voiceRecorder/201374133/981221_121.wav')

But it returns <Response [404]>.

I inspected the username and password fields in the login form. Am I using the correct field names for the username and password?

I also tried mechanize, but it gives the same error:

import mechanize

import http.cookiejar as cookielib

browser = mechanize.Browser()

cookiejar = cookielib.LWPCookieJar() 
browser.set_cookiejar( cookiejar ) 


browser.open('https://www.jotform.com/login/')
browser.select_form(nr = 0)

browser.form['username'] = 'dummy_username'
browser.form['password'] = 'dummy_password'
result = browser.submit()
browser.retrieve('https://www.jotform.com/widget-uploads/voiceRecorder/201374133/981221_121.wav')

How can I download audio files using the requests module?

Upvotes: 0

Views: 550

Answers (1)

Mooncrater

Reputation: 4861

Actually, you aren't sending the form data properly. An input element is identified by its name attribute, and in your case the names are loUsername and loPassword. So what you would want to do is this:

import requests

sess = requests.Session()

# Keys must match the name attributes of the login form's input elements
payload = {
    'loUsername': 'dummy_username',
    'loPassword': 'dummy_password',
}
op = sess.post('https://www.jotform.com/login/', data=payload)

op.status_code
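
The field names above come from the form's markup. One way to check them without the browser inspector is to list the name attribute of every input element in the page. This is only a minimal sketch using the standard library: the sample HTML is made up for illustration and is not Jotform's actual markup, and InputNameCollector and list_input_fields are hypothetical helper names.

```python
from html.parser import HTMLParser


class InputNameCollector(HTMLParser):
    """Collect the name/value pairs of every <input> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            # Inputs without a value attribute (e.g. empty text fields) map to "".
            self.fields[name] = attributes.get("value") or ""


def list_input_fields(html):
    """Return a dict mapping each input's name to its default value."""
    collector = InputNameCollector()
    collector.feed(html)
    return collector.fields


# Made-up sample markup, for illustration only.
sample_html = """
<form action="/login/" method="post">
  <input type="hidden" name="csrf-token" value="abc123">
  <input type="text" name="loUsername">
  <input type="password" name="loPassword">
</form>
"""

fields = list_input_fields(sample_html)
print(fields)
```

In practice you would pass the text of the real login page instead of sample_html. Hidden inputs that show up in the result usually have to be sent back along with the credentials.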

EDIT: I am also seeing a CSRF token on the website. You'll have to scrape the login page first for the csrf-token value, and then include it in your payload.

from bs4 import BeautifulSoup
import requests

# Use one session for both requests so cookies set by the login page are sent with the POST
sess = requests.Session()
page = sess.get('https://www.jotform.com/login/')
soup = BeautifulSoup(page.text, 'lxml')
csrf = soup.find('input', {'name': 'csrf-token'})['value']

# Now create the payload with this CSRF token
payload = {
    'csrf-token': csrf,
    'loUsername': 'dummy_username',
    'loPassword': 'dummy_password',
}
op = sess.post('https://www.jotform.com/login/', data=payload)
op.status_code
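
If the login succeeds, the same session can be reused to fetch the audio file from the question. The following is a rough sketch rather than a tested solution against Jotform: download_file is a hypothetical helper, and it assumes the session object behaves like requests.Session (get() with stream=True, and a response offering raise_for_status() and iter_content()). I have not verified that Jotform accepts this login flow.

```python
def download_file(session, url, destination):
    """Stream `url` to `destination` using an already logged-in session.

    `session` is assumed to behave like requests.Session.
    """
    response = session.get(url, stream=True)
    # Fail loudly on 401/403/404 instead of saving an error page as a .wav file.
    response.raise_for_status()
    with open(destination, "wb") as out_file:
        # Write the body in chunks so large recordings are not held in memory.
        for chunk in response.iter_content(chunk_size=8192):
            out_file.write(chunk)
    return destination


# Example usage with the logged-in `sess` from the login code:
# download_file(
#     sess,
#     'https://www.jotform.com/widget-uploads/voiceRecorder/201374133/981221_121.wav',
#     '981221_121.wav',
# )
```

Calling raise_for_status() surfaces a failed login as an exception, which is easier to debug than a file that silently contains HTML.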

Upvotes: 0
