Reputation: 581
I'm using PhantomJS with Selenium and BeautifulSoup to print a page's source, but over HTTPS it only returns blank HTML. Over HTTP the page source comes back as expected. I've read a lot of material such as this and this, but nothing has helped.
from selenium import webdriver
import urllib.request as urllib2
import requests
import urllib
from bs4 import BeautifulSoup
import csv
import time
browser = webdriver.PhantomJS(service_args=['--ignore-ssl-errors=true', '--ssl-protocol=any'])
browser.get('https://google.com')
browser.set_window_size(2000, 1500)
soup = BeautifulSoup(browser.page_source, "html.parser")
print(soup)
browser.quit()
Result
<html><head></head><body></body></html>
Complete
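To make the symptom easier to catch in a script, here is a minimal sketch that checks whether a page source is just the empty skeleton shown above. It uses only the standard library's html.parser. The names is_blank_page and _BodyContentProbe are made up for this illustration and are not part of Selenium or BeautifulSoup.

```python
from html.parser import HTMLParser


class _BodyContentProbe(HTMLParser):
    """Records whether any tag or non-whitespace text appears inside <body>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.has_content = False

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif self.in_body:
            # Any element nested in <body> counts as content.
            self.has_content = True

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False

    def handle_data(self, data):
        # Ignore pure whitespace between tags.
        if self.in_body and data.strip():
            self.has_content = True


def is_blank_page(page_source):
    """Return True if <body> is missing or contains nothing."""
    probe = _BodyContentProbe()
    probe.feed(page_source)
    probe.close()
    return not probe.has_content


blank = "<html><head></head><body></body></html>"
print(is_blank_page(blank))
```

In the script above this could be called as is_blank_page(browser.page_source) right after browser.get(...), so a failed HTTPS load is reported instead of silently printing an empty document.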
Upvotes: 2
Views: 851
Reputation: 581
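# Raw strings (r'...') keep the backslashes literal; in a normal string, '\t' in 'C:\tmp' is read as a tab.
browser = webdriver.PhantomJS(service_args=[
    '--ignore-ssl-errors=true',
    r'--ssl-client-certificate-file=C:\tmp\clientcert.cer',
    r'--ssl-client-key-file=C:\tmp\clientcert.key',
    '--ssl-client-key-passphrase=1111',
])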
I had to point PhantomJS at local SSL client certificate and key files.
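For reuse, the argument list can be built by a small helper. This is only a sketch: build_phantomjs_service_args is a hypothetical name, and the flags are the ones already used in this thread. The paths are passed as raw strings because in a regular Python string literal the \t in C:\tmp is interpreted as a tab character. Note also that webdriver.PhantomJS only exists in older Selenium releases, so the driver call is left as a comment.

```python
def build_phantomjs_service_args(cert_file, key_file, passphrase):
    """Build the PhantomJS command-line flags for a client certificate.

    Hypothetical helper; it only assembles the flags used in this thread.
    """
    return [
        '--ignore-ssl-errors=true',
        '--ssl-client-certificate-file=' + cert_file,
        '--ssl-client-key-file=' + key_file,
        '--ssl-client-key-passphrase=' + passphrase,
    ]


# Raw strings keep Windows backslashes literal ('\t' would otherwise be a tab).
service_args = build_phantomjs_service_args(
    r'C:\tmp\clientcert.cer',
    r'C:\tmp\clientcert.key',
    '1111',
)

for arg in service_args:
    print(arg)

# With an older Selenium release that still ships the PhantomJS driver:
# from selenium import webdriver
# browser = webdriver.PhantomJS(service_args=service_args)
```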
Upvotes: 1