Reputation: 3
I am a new python user. I'm getting the following error when trying to run my code in PythonAnywhere, despite it working fine on my local PC.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/zachfeatherstone/imputations.py", line 7, in <module>
html = urlopen(url).read()
File "/usr/local/lib/python3.9/urllib/request.py", line 214, in urlopen
return opener.open(url, data, timeout)
File "/usr/local/lib/python3.9/urllib/request.py", line 517, in open
response = self._open(req, data)
File "/usr/local/lib/python3.9/urllib/request.py", line 534, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "/usr/local/lib/python3.9/urllib/request.py", line 494, in _call_chain
result = func(*args)
File "/usr/local/lib/python3.9/urllib/request.py", line 1389, in https_open
return self.do_open(http.client.HTTPSConnection, req,
File "/usr/local/lib/python3.9/urllib/request.py", line 1349, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error Tunnel connection failed: 403 Forbidden>
It's similar to: urllib.request.urlopen: ValueError: unknown url type.
CODE
from urllib.request import urlopen
from bs4 import BeautifulSoup
import re
import os.path
url = input("Enter the URL you want to analyse: ")
html = urlopen(url).read()
soup = BeautifulSoup(html, features="html.parser")
# kill all script and style elements
for script in soup(["script", "style"]):
script.extract() # rip it out
# get text
text = soup.get_text()
# break into lines and remove leading and trailing space on each
lines = (line.strip() for line in text.splitlines())
# break multi-headlines into a line each
chunks = (phrase.strip() for line in lines for phrase in line.split(" "))
# drop blank lines
text = '\n'.join(chunk for chunk in chunks if chunk)
wordList = text.split()
#imputations
imputations = []
if "unprofessional" in wordList:
unprofessional_imputation = "That our Client is unprofessional."
imputations.append(unprofessional_imputation)
print(imputations)
#print(wordList)
#print(text)
#saving file
save_path = 'C:/Users/team/downloads'
name_of_file = input("What do you want to save the file as? ")
completeName = os.path.join(save_path, name_of_file+".txt")
f = open(completeName, "w")
# traverse paragraphs from soup
for words in imputations:
f.write(words)
f.write("\n")
My apologies if this has been answered before. How do I manage to run this in PythonAnywhere so that I can deploy over the web?
Upvotes: 0
Views: 481
Reputation: 391
It COULD be helpful to send header info along with your request. You can pass a Request object into your request like this:
url = input("Enter the URL you want to analyse: ")
header = {'User-Agent': 'Gandalf'}
req = urllib.request.Request(url, None, header)
html = urllib.request.urlopen(req)
html = html.read()
soup = BeautifulSoup(html, features="html.parser")
Upvotes: 1