Matt Biggs

Reputation: 177

urllib urlopen not fetching second time around

After opening a URL and reading from it once, trying to read it again doesn't actually fetch anything the second time around.

Any idea why?

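import re

def Titles():
    # titlerequest: the urlopen() response, opened elsewhere (not shown)
    titleread = titlerequest.read()
    Headlines = '<title>.+</title>'
    NewsHeadlines = re.findall(Headlines, titleread)

    # Strip the surrounding <title> tags from each match
    Headlines = [T.replace('<title>', '') for T in NewsHeadlines]
    Headlines = [T.replace('</title>', '') for T in Headlines]
    return Headlines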

Upvotes: 0

Views: 78

Answers (1)

Jon Clements

Reputation: 142206

Once you've read from a server and it has delivered its response, it normally has nothing more to say to you, so a second read() on the same response comes back empty. You could re-open the connection and read it again, but that is inefficient unless you expect the response to have changed.

Instead, open the URL once, read the data, and reuse that data each time, e.g.:

import urllib2

# Fetch and read once, then reuse the resulting string
url_data = urllib2.urlopen('http://example.com').read()
do_something_with(url_data)
do_something_else_with(url_data)
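
To see why the second read comes back empty, here is a minimal Python 3 sketch. It assumes the response behaves like any other file-like stream, and uses io.BytesIO as a stand-in for the object urlopen returns so it runs without a network connection; the sample markup and variable names are made up for illustration.

```python
import io

# Stand-in for a urlopen() response (an assumption for this sketch):
# a file-like object whose contents can only be consumed once.
response = io.BytesIO(b"<title>First</title>\n<title>Second</title>\n")

first = response.read()   # consumes the whole stream
second = response.read()  # the stream is exhausted, nothing left to return

# Keep the bytes from the first read and reuse them as often as needed.
body = first
title_count = body.count(b"<title>")
line_count = len(body.splitlines())
```

Here second is empty while body can be passed to as many functions as you like, which is the same pattern as url_data above.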

As a note: using regular expressions to extract data from HTML is at best a nightmare. Look at a proper HTML parsing library such as Beautiful Soup.
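
As a rough illustration of the parser approach, the sketch below collects the text of every title element. It uses the standard library's html.parser rather than Beautiful Soup, purely so it runs in Python 3 without installing anything; TitleCollector, extract_titles, and the sample markup are hypothetical names and data, not part of the question's code.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collects the text content of every <title> element it sees."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
            self.titles.append("")  # start a new, empty title

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current title.
        if self.in_title:
            self.titles[-1] += data


def extract_titles(markup):
    parser = TitleCollector()
    parser.feed(markup)
    parser.close()
    return [title.strip() for title in parser.titles]


# Made-up sample with two titles on a single line.
sample = ("<rss><item><title>First headline</title></item>"
          "<item><title>Second headline</title></item></rss>")
headlines = extract_titles(sample)
```

Unlike the greedy '<title>.+</title>' pattern, which would match from the first opening tag to the last closing tag when several titles share a line, this returns each title separately and needs no tag stripping afterwards.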

Upvotes: 1
