Urllib html not showing

Question

When I use the urllib module, I can read, print, and search the HTML of a website the first time, but when I try again it is gone. How can I keep the HTML available throughout the program?

For example, when I try:

import re
import urllib.request

html = urllib.request.urlopen('http://www.bing.com/search?q=Mike&go=&qs=n&form=QBLH&filt=all&pq=mike&sc=8-2&sp=-1&sk=')
search = re.findall(r'Mike', str(html.read()))

search

I get:

['Mike','Mike','Mike','Mike']

But then when I try to do this a second time like so:

results = re.findall(r'Mike', str(html.read()))

I get:

[]

when I evaluate results.

Why does this happen, and how can I fix it?

rvalvik · Accepted Answer

I'm not very well versed in Python, but html.read() reads the HTTP response stream to the end, so when you call it a second time there is nothing left to read and you get an empty result.
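The same behavior can be reproduced without any network access. The sketch below uses io.BytesIO as a stand-in for the HTTP response (an assumption for illustration; it is not the class urlopen returns, but it behaves the same way when read twice):

```python
import io
import re

# Hypothetical stand-in for the HTTP response body.
stream = io.BytesIO(b"Mike met Mike")

first = stream.read()   # consumes the whole stream
second = stream.read()  # nothing left, so this is empty bytes

matches_first = re.findall(r"Mike", first.decode("utf-8"))
matches_second = re.findall(r"Mike", second.decode("utf-8"))

print(matches_first)
print(matches_second)
```

The second search comes back empty because the second read() returned no data, not because the pattern stopped matching.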

Try:

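import re
import urllib.request

html = urllib.request.urlopen('http://www.bing.com/search?q=Mike&go=&qs=n&form=QBLH&filt=all&pq=mike&sc=8-2&sp=-1&sk=')
# Read the response body once and keep it in a variable.
data = html.read().decode('utf-8', errors='replace')
search = re.findall(r'Mike', data)
search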

Then reuse data for any later search instead of calling html.read() again:

results = re.findall(r'Mike', data)
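If the page is needed in several places, the read-once idea can be wrapped in a small helper. This is only a sketch: fetch_text and its default_charset parameter are hypothetical names, and falling back to UTF-8 when the server declares no charset is an assumption.

```python
import re
import urllib.request


def fetch_text(url, default_charset="utf-8"):
    """Download url once and return the body as a str (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        raw = response.read()  # the only read of the stream
        # Use the charset from the Content-Type header when one is given.
        charset = response.headers.get_content_charset() or default_charset
    return raw.decode(charset, errors="replace")


# Example usage (performs a real request, so it is left commented out):
# data = fetch_text("http://www.bing.com/search?q=Mike")
# search = re.findall(r"Mike", data)
# results = re.findall(r"Mike", data)  # same matches, no second download
```

Because the helper returns an ordinary string, it can be searched as many times as needed.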
