Reputation: 2457
I am using 'urllib.request.urlopen' to read the content of an HTML page. Afterwards, I want to print the content to my local file and then do a certain operation (e.g. construct a parser on that page e.g. Beautiful Soup).
The problem After reading the content for the first time (and writing it into a file), I can't read the content for the second time in order to do something with it (e.g. construct a parser on it). It is just empty and I can't move the cursor(seek(0)) back to the beginning.
import urllib.request
response = urllib.request.urlopen("http://finance.yahoo.com")
file = open( "myTestFile.html", "w")
file.write( response.read() ) # Tried response.readlines(), but that did not help me
#Tried: response.seek() but that did not work
print( response.read() ) # Actually, I want something done here... e.g. construct a parser:
# BeautifulSoup(response).
# Anyway this is an empty result
file.close()
How can I fix it?
Upvotes: 1
Views: 944
Reputation: 362517
You can not read the response twice. But you can easily reuse the saved content:
content = response.read()
file.write(content)
print(content)
Upvotes: 8