Reputation: 21
I'm trying to edit the readability.js file from http://code.google.com/p/arc90labs-readability/.
It's a bookmarklet that "cleans" the current page by stripping everything except for the web page/web article title and body.
However, I'd like to edit the script so that, when the bookmarklet is activated, the current page is left untouched and the "cleaned" HTML is written to a file in a specified local directory instead.
Can anyone help? Thank you!
Note: The cleaned HTML is available as 'document.body.innerHTML'.
Upvotes: 2
Views: 128
Reputation: 66191
To begin with, it can't be done without touching the original page. The way the script works, it edits the current page in place (so image URLs continue to work, etc.). The best you could do is store the innerHTML of the root html element and then restore it after you have grabbed the content (or store the head and body separately). It would look something like this:
1. Store the innerHTML of the html element.
2. Run Readability.
3. Grab readability-content or the whole document and store it in a variable.
4. Restore the original innerHTML.
At this point, depending on your browser, you could either try to use a data URI, or dynamically add a reference to the Downloadify library, its images, etc. and add the download button to the page. Finally, when the "Download" button is clicked, you could pre-supply the filename and the data stored in step 3, but the save location would have to be selected every time.
Sorry this is so hypothetical, but it would take quite a bit of work to put this together.
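For what it's worth, the store / run / grab / restore sequence and the data URI idea could be sketched roughly like this. It is only a sketch under assumptions: captureCleanedHtml and toDataUri are made-up helper names, runReadability is a hypothetical callback standing in for however you trigger the Readability script, and it is assumed to finish synchronously (the real bookmarklet may not).

```javascript
// Hypothetical helper: snapshot the page, run Readability, grab the cleaned
// markup, then put the original markup back.
// `doc` is the browser's `document` (or any object shaped like it).
// `runReadability` is an assumed synchronous callback, not part of Readability's API.
function captureCleanedHtml(doc, runReadability) {
  var root = doc.documentElement;
  var original = root.innerHTML; // store the original page markup
  var cleaned;
  try {
    runReadability(); // let Readability rewrite the page
    var content = doc.getElementById("readability-content");
    // Prefer the Readability container; fall back to the whole document.
    cleaned = content ? content.innerHTML : root.innerHTML;
  } finally {
    root.innerHTML = original; // restore the original markup
  }
  return cleaned;
}

// Hypothetical helper: wrap the captured HTML in a data URI that can be
// opened in a new tab or used as a link target.
function toDataUri(html) {
  return "data:text/html;charset=utf-8," + encodeURIComponent(html);
}

// Possible use inside the bookmarklet (browser only):
//   var html = captureCleanedHtml(document, function () { /* trigger Readability */ });
//   window.open(toDataUri(html));
```

Keep in mind that restoring innerHTML rebuilds the DOM nodes, so event listeners the page attached with addEventListener will not come back.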
Upvotes: 1
Reputation: 268324
You don't really need to modify the readability code. Just pull the contents of:
document.getElementById("readability-content");
You can then pass that on to a local script to be saved.
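A rough sketch of what that could look like, with everything beyond the readability-content id being an assumption: getReadabilityHtml and sendToLocalScript are made-up names, the http://localhost:8000/save endpoint is hypothetical (you would have to write that local script yourself), and send is any fetch-like function, which keeps the sketch independent of one browser API.

```javascript
// Hypothetical helper: read the markup Readability produced, or null if the
// container is not on the page.
function getReadabilityHtml(doc) {
  var el = doc.getElementById("readability-content");
  return el ? el.innerHTML : null;
}

// Hypothetical helper: POST the markup to a local script.
// `send` is assumed to have the same call shape as the browser's fetch(url, options).
function sendToLocalScript(html, url, send) {
  return send(url, {
    method: "POST",
    headers: { "Content-Type": "text/plain;charset=UTF-8" },
    body: html
  });
}

// Possible use in the browser (the endpoint is an assumption):
//   var html = getReadabilityHtml(document);
//   if (html !== null) {
//     sendToLocalScript(html, "http://localhost:8000/save", window.fetch.bind(window));
//   }
```

Since the request goes from the article's site to localhost, it is cross-origin, so the local script would have to be set up to accept that.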
Upvotes: 0