Reputation: 350
I have a .html file downloaded and want to send a request to this file to grab it's content.
However, if I do the following:
import requests
html_file = "/user/some_html.html"
r = requests.get(html_file)
Gives the following error:
Invalid URL 'some_html.html': No schema supplied.
If I add a schema I get the following error:
HTTPConnectionPool(host='some_html.html', port=80): Max retries exceeded with url:
I want to know how to specifically send a request to a html file when it's downloaded.
Upvotes: 0
Views: 3203
Reputation: 1
As other users said, the requests.get()
method requires http connection to fetch an html file from a website not from a local directory. In order to request an html file saved locally on your computer, use urlopen()
function of urllib.request
module as follows:
from urllib.request import urlopen
response = urlopen('file:example.html')
byts = response.read()
html = byts.decode()
print(html)
Assuming that example.html is saved in the current working directory. Reading the response will return a bytes object and decoding the bytes object will return the html code of the requested html file.
Upvotes: 0
Reputation: 1
You can try this:
from requests_html import HTML
with open("htmlfile.html") as htmlfile:
sourcecode = htmlfile.read()
parsedHtml = HTML(html=sourcecode)
print(parsedHtml)
Upvotes: 0
Reputation: 56
You can do it by setting up a local server to your html file. If you use Visual Studio Code, you can install Live Server by Ritwick Dey.
Then you do as follows:
1 - Make the first request and save the html content into a .html file:
import requests
file_path = './'
file_name = 'my_file'
url = "https://www.qwant.com/"
response = requests.request("GET", url)
w = open(file_path + file_name + '.html', 'w')
w.write(response.text)
2 - With Live Server installed on Visual Studio Code, click on my_file.html and then click on Go Live.
and
3 - Now you can make a request to your local http schema:
import requests
url = "http://127.0.0.1:5500/my_file.html"
response = requests.request("GET", url)
print(response.text)
And, tcharan!! do what you need to do.
On a crawler work, I had one situation where there was a difference between the content displayed on the website and the content retrieved with the response.text so the xpaths did not were the same as on the website, so I needed to download the content, making a local html file, and get the new ones xpaths to get the info that I needed.
Upvotes: 1
Reputation: 83527
You don't "send a request to a html file". Instead, you can send a request to a HTTP server on the internet which will return a response with the contents of a html file.
The file itself knows nothing about "requests". If you have the file stored locally and want to do something with it, then you can open it just like any other file.
If you are interested in learning more about the request and response model, I suggest you try a something like
response = requests.get("http://stackoverflow.com")
You should also read about HTTP and requests and responses to better understand how this works.
Upvotes: 0
Reputation: 2623
You are accessing html file from local directory. get()
method uses HTTPConnection
and port 80 to access data from website not a local directory. To access file from local directory using get()
method use Xampp or Wampp.
for accessing file from local directory you can use open()
while requests.get()
is for accessing file from Port 80
using http Connection in simple word from internet not local directory
import requests
html_file = "/user/some_html.html"
t=open(html_file, "r")
for v in t.readlines():
print(v)
Upvotes: 1