Ben Morris

Reputation: 626

Using show() with twill spams the console with HTML

I've been using the function twill.commands.show() to get the raw HTML of a page. I run it about every 5 seconds. Every time the function runs, it spams the console with the page's raw HTML. I need the console for debugging, and with it constantly filled with HTML, that is impossible. Since show() is written to both print the HTML and return it as a string, I would have to edit twill itself, which is well beyond my skill set and would make the program incompatible with other machines. Saving the page to a file and reading it back might work, but that seems impractical to do every 5 seconds.

Code:

from twill.commands import go, show

go('http://google.com/')
html = show()  # returns the HTML, but also prints it to the console

twill also has save_html, which could be used to save the page to a file, but I'm doing this every 5 seconds and it could slow down the program or the computer, especially on older hardware.

Thanks!

Upvotes: 1

Views: 1434

Answers (2)

Nizam Mohamed

Reputation: 9220

Capture the output in a string, then use a regex to replace every tag with an empty string so that only the text remains.

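import re
from StringIO import StringIO  # Python 2; on Python 3 use: from io import StringIO

import twill
from twill.commands import show

sio = StringIO()
twill.set_output(sio)  # send twill's output to the buffer instead of the console
show()
# strip anything that looks like a tag, leaving only the text
print(re.sub(r'<.*?>', '', sio.getvalue(), flags=re.DOTALL))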
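To see what that substitution does without involving twill, here is a minimal sketch run on a made-up HTML string (sample_html is purely illustrative, not real twill output). Note that this is a crude approach: it only deletes anything between < and >, so the contents of script or style elements and HTML entities are left untouched.

```python
import re

# Illustrative input only; in the answer above this would be sio.getvalue().
sample_html = (
    '<html><body><h1>Title</h1>\n'
    '<p class="x">Some <b>bold</b> text</p></body></html>'
)

# Non-greedy match of anything between '<' and '>'; DOTALL lets a tag
# span several lines.
text = re.sub(r'<.*?>', '', sample_html, flags=re.DOTALL)
print(text)
```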

Upvotes: 1

tynn

Reputation: 39853

Twill writes to stdout by default.

You can use twill.set_output(fp) to redirect that output. There are several ways to do this:

Write to a StringIO:

import twill
from twill.commands import show
from StringIO import StringIO  # Python 2; on Python 3 use: from io import StringIO
sio = StringIO()
twill.set_output(sio)
html = show() # html+'\n' == sio.getvalue()

or to /dev/null:

import os
null = open(os.devnull, 'w')
twill.set_output(null)
html = show() # writing to /dev/null or nul
null.close()

or to nothing at all:

class DevNull(object):
    # minimal file-like object: accepts writes and discards them
    def write(self, text):
        pass
twill.set_output(DevNull())
html = show()

or to any other writable file-like python object of your liking.
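As one illustration of such a custom object, here is a rough sketch of a sink that hands back whatever was written since the last read and then empties itself. That may matter when polling every 5 seconds, because a single StringIO reused across calls keeps accumulating every page. The names LastPageSink, take, and fake_show are hypothetical; fake_show only stands in for twill's show() so the snippet runs without twill installed. With twill you would pass the sink to twill.set_output() instead.

```python
class LastPageSink(object):
    """Hypothetical file-like object: buffers writes until take() is called."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)

    def flush(self):
        # Nothing to flush; present only because some writers call it.
        pass

    def take(self):
        """Return everything written since the last take() and reset."""
        data = "".join(self.chunks)
        self.chunks = []
        return data


def fake_show(out):
    """Stand-in for show(): writes the page to `out` and also returns it."""
    html = "<html><body>Hello</body></html>"
    out.write(html)
    out.write("\n")
    return html


sink = LastPageSink()
for _ in range(3):           # simulate three polling rounds
    html = fake_show(sink)
    captured = sink.take()   # only this round's output, nothing older
```

After each round the sink is empty again, so the console stays clean and memory use does not grow with the number of polls.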

Upvotes: 2
