Reputation: 601
I have an HTML file with \n characters separating each element tag. We'll call this HTML file results_cache.html. I'd like to read results_cache.html with Python and then write its contents into another file, hopeful.html.
However, when writing the contents, I'd like to start a new line in hopeful.html each time a \n pops up. I was under the impression that Python would naturally do this; unfortunately, the entire HTML prints on one line only.
Here is my code:
lines = [str(line.rstrip('\n')) for line in open('results_cache.html')]
final_cache = open('hopeful.html','w')
for line in lines:
    final_cache.write(str(line))
final_cache.close()
This is a snapshot of what hopeful.html looks like:
'<table>\n <!-- ngRepeat: attempt in vm.getdate() --> <tr ng-repeat="attemp...
...with nothing else below it.
One thing I would like to point out is that the entire line is wrapped in single quotes. I don't know whether this affects the outcome.
The HTML was scraped off a website using Selenium Webdriver.
Upvotes: 0
Views: 48
Reputation: 3287
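Iterating over open('results_cache.html') does give you one line at a time, but rstrip('\n') strips the newline from each line and write() never adds it back, so everything lands on one line in hopeful.html. Either skip the rstrip or write line + '\n'. It is also cleaner to read the file inside a with block: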
with open('results_cache.html') as readfile:
    htmlfile = readfile.readlines()
lines = [line.rstrip('\n') for line in htmlfile]
Or you could do it down and dirty:
lines = [line.rstrip('\n') for line in open('results_cache.html').readlines()]
The "with" statement is preferable, though, because it closes the file properly even if an exception occurs during the file operations.
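The snapshot in the question shows the text wrapped in single quotes with a visible \n, which suggests (this is an assumption, not something the question confirms) that the file holds the two-character sequence backslash + n rather than real newline characters. If so, no amount of line-by-line reading will split it, because the whole file is a single line. A minimal sketch of one way to handle that case, using a hypothetical helper named rewrite_with_newlines:

```python
def rewrite_with_newlines(src, dst):
    """Copy src to dst, turning literal backslash-n sequences into real newlines.

    Hypothetical helper for illustration; it assumes the source file
    contains the two characters '\\' and 'n' where line breaks belong.
    """
    with open(src, encoding="utf-8") as readfile:
        text = readfile.read().strip()

    # Assumption: the text may be wrapped in single quotes, as in the
    # snapshot from the question. Drop them if both are present.
    if len(text) >= 2 and text[0] == "'" and text[-1] == "'":
        text = text[1:-1]

    # "\\n" is the two-character sequence; "\n" is a real newline.
    text = text.replace("\\n", "\n")

    with open(dst, "w", encoding="utf-8") as final_cache:
        final_cache.write(text)


# Example call, using the file names from the question:
# rewrite_with_newlines("results_cache.html", "hopeful.html")
```

If the file already contains real newlines, the replace call changes nothing and the contents are copied through unchanged, apart from the stripped outer whitespace and quotes.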
Upvotes: 2