Reputation: 29
I would like to scrape the following link from the page's source code: https://secure.ewaypayments.com/sharedpage/sharedpayment?AccessCode=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==
I am currently using json and bs4 with Python.
Full page source: https://pastebin.com/iU5c9GBF
<div class="Actions">
    <input class="action" type="submit" id="submit-button" value="Place Order" title="Place Order" onclick="return showModal()" disabled="disabled" />
    <input type="hidden" id="EWAY_TransactionID" name="EWAY_TransactionID" value="" />
    <script src="https://secure.ewaypayments.com/scripts/eCrypt.js"> </script>
    <script type="text/javascript">
        var eWAYConfig = {
            sharedPaymentUrl: "https://secure.ewaypayments.com/sharedpage/sharedpayment?AccessCode=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx=="
        };
        function showModal()
        {
            // verify captcha
            // show modal
            return eCrypt.showModalPayment(eWAYConfig, resultCallback);
        }
        function resultCallback(result, transactionID, errors) {
            if (result == "Complete") {
                document.getElementById("EWAY_TransactionID").value = transactionID;
                document.getElementById("Form_PaymentForm").submit();
                //Please wait until we process your order, James at 9/10/2017
                document.getElementById("overlay").style.display = "block";
            }
            else if (errors != "")
            {
                alert("There was a problem completing the payment: " + errors);
            }
        }
    </script>
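Previously used code:
import requests
from bs4 import BeautifulSoup as bs
s = requests.session()
orderurl = s.get('https://www.supplystore.com.au/shop/checkout/submit.aspx')
soup = bs(orderurl.text, 'html.parser')
# find() returns a single tag, so it cannot be indexed; find_all() returns a list
find = soup.find("div", {"class": "Actions"}).find_all("script")[1]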
Upvotes: 0
Views: 728
Reputation: 84465
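Another way, using bs4 4.7.1 or later: select the script with the :contains pseudo-class, then split its text. (Newer Soup Sieve releases deprecate :contains in favor of :-soup-contains.)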
from bs4 import BeautifulSoup as bs
# html would be the response body, e.g. r = requests.get(url); soup = bs(r.content, 'lxml')
html = '''
<div class="Actions">
<input class="action" type="submit" id="submit-button" value="Place Order" title="Place Order" onclick="return showModal()" disabled="disabled" />
<input type="hidden" id="EWAY_TransactionID" name="EWAY_TransactionID" value="" />
<script src="https://secure.ewaypayments.com/scripts/eCrypt.js"> </script>
<script type="text/javascript">
var eWAYConfig = {
sharedPaymentUrl: "https://secure.ewaypayments.com/sharedpage/sharedpayment?AccessCode=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx=="
};
function showModal()
{
// verify captcha
// show modal
return eCrypt.showModalPayment(eWAYConfig, resultCallback);
}
function resultCallback(result, transactionID, errors) {
if (result == "Complete") {
document.getElementById("EWAY_TransactionID").value = transactionID;
document.getElementById("Form_PaymentForm").submit();
//Please wait until we process your order, James at 9/10/2017
document.getElementById("overlay").style.display = "block";
}
else if (errors != "")
{
alert("There was a problem completing the payment: " + errors);
}
}
</script>
'''
soup = bs(html, 'lxml')
target = 'sharedPaymentUrl: '
script = soup.select_one('.Actions script:contains("' + target + '")')
if script is None:
    url = 'N/A'
else:
    # text after the target up to the end of the line, minus the JavaScript quotes
    url = script.text.split(target)[1].split('\n')[0].strip().strip('"')
print(url)
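A split-based extraction is sensitive to whatever else sits on the same line as the value: quotes, a trailing comma if the config object has more properties, or a carriage return. Below is a minimal sketch of a more defensive version of just that step, working on the script text as a plain string. The helper name `value_after` and the sample text (including its shortened URL and the extra `other` property) are hypothetical and only for illustration.

```python
def value_after(script_text, target):
    """Return the JavaScript string value that follows `target`, or None."""
    if target not in script_text:
        return None
    # Take the rest of the line after the first occurrence of the target.
    rest = script_text.split(target, 1)[1].split('\n', 1)[0]
    # Drop whitespace, a trailing comma or semicolon, then the quotes.
    return rest.strip().rstrip(',;').strip().strip('"\'')


# Hypothetical sample: a config object with a second property after the URL.
sample = '''
var eWAYConfig = {
    sharedPaymentUrl: "https://example.com/pay?AccessCode=abc==",
    other: 1
};
'''
print(value_after(sample, 'sharedPaymentUrl: '))
print(value_after(sample, 'missingKey: '))
```

Returning None instead of 'N/A' lets the caller tell a missing key apart from a real value.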
Upvotes: 0
Reputation: 195553
BeautifulSoup cannot parse JavaScript, but you can use it to locate the script tag and then pull the value out with the re module (data is your HTML code):
import re
from bs4 import BeautifulSoup
soup = BeautifulSoup(data, 'lxml')
txt = soup.select('.Actions script')[1].text
print(re.search(r'sharedPaymentUrl:\s*"(.*?)"', txt)[1])
Prints:
https://secure.ewaypayments.com/sharedpage/sharedpayment?AccessCode=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==
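Picking the script with [1] depends on the page keeping its current layout. As a hedged alternative, the sketch below runs the same pattern over the raw HTML and guards against a missing match. The function name `find_shared_payment_url` and the sample markup are assumptions for illustration, and the shortened AccessCode is a placeholder.

```python
import re

# Same pattern as above: capture the quoted value after the property name.
PATTERN = re.compile(r'sharedPaymentUrl:\s*"(.*?)"')


def find_shared_payment_url(html):
    """Return the first sharedPaymentUrl value found in `html`, or None."""
    match = PATTERN.search(html)
    return match.group(1) if match else None


# Hypothetical, shortened sample of the page source.
sample = '''
<div class="Actions">
<script type="text/javascript">
var eWAYConfig = {
sharedPaymentUrl: "https://secure.ewaypayments.com/sharedpage/sharedpayment?AccessCode=xxxx=="
};
</script>
</div>
'''
print(find_shared_payment_url(sample))
print(find_shared_payment_url('<p>no script here</p>'))
```

This skips HTML parsing entirely, which is fine when the property name appears only once in the page. If it could appear elsewhere, narrowing to the `.Actions` script first is the safer choice.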
Upvotes: 1