Reputation: 21
I'm looking for the best way to reformat a string read from a text file in Python so that each line is at most a given length, without breaking up words. I have used textwrap.wrap. It works fine in all cases except when the text being read in contains line breaks, i.e. it contains paragraphs. textwrap.wrap does not preserve these line breaks, which is a problem. Below is my code:
import textwrap

f = open(inFile, 'r')  # read in text file
lines = f.read()
f.close()
paragraph = textwrap.wrap(lines, width=wid)  # format paragraph
f = open(outFile, 'w')  # write to file
for i in paragraph:
    print(i, file=f)
f.close()
One idea I have is to write the formatted text to the output file one line at a time. The only problem is that I don't know how to test whether a line is a line break.
Any suggestions would be highly appreciated.
Update: After using Ooga's suggestion, the line breaks are being preserved correctly, but this has left me with one final problem: the text that ends up on each output line is not what I expect. Compare the input, the expected output and the actual output to see what I mean.
INPUT:
log2(N) is about the expected number of probes in an average
successful search, and the worst case is log2(N), just one more
probe. If the list is empty, no probes at all are made. Thus binary
search is a logarithmic algorithm and executes in O(logN) time. In
most cases it is considerably faster than a linear search. It can
be implemented using iteration, or recursion. In some languages it
is more elegantly expressed recursively; however, in some C-based
languages tail recursion is not eliminated and the recursive
version requires more stack space.
EXPECTED OUTPUT:
log2(N) is about the expected number of
probes in an average successful search,
and the worst case is log2(N), just one
more probe. If the list is empty, no
probes at all are made. Thus binary
search is a logarithmic algorithm and
executes in O(logN) time. In most cases
it is considerably faster than a linear
search. It can be implemented using
iteration, or recursion. In some
languages it is more elegantly expressed
recursively; however, in some C-based
languages tail recursion is not
eliminated and the recursive version
requires more stack space.
ACTUAL OUTPUT:
log2(N) is about the expected number of
probes in an average
successful search, and the worst case is
log2(N), just one more
probe. If the list is empty, no probes
at all are made. Thus binary
search is a logarithmic algorithm and
executes in O(logN) time. In
most cases it is considerably faster
than a linear search. It can
be implemented using iteration, or
recursion. In some languages it
is more elegantly expressed recursively;
however, in some C-based
languages tail recursion is not
eliminated and the recursive
version requires more stack space.
Just to confirm, this is only one paragraph; the line breaks inside it are now being preserved. How would I get my output to match the expected output?
Upvotes: 2
Views: 3135
Reputation: 15501
You could read the file a line at a time.
import textwrap

inFile = 'testIn.txt'
outFile = 'testOut.txt'
wid = 20

fin = open(inFile, 'r')
fout = open(outFile, 'w')
for lineIn in fin:
    paragraph = textwrap.wrap(lineIn, width=wid)
    if paragraph:
        for lineOut in paragraph:
            print(lineOut, file=fout)
    else:
        # wrap() returns [] for a blank line, so write the blank line explicitly
        print('', file=fout)
fout.close()
fin.close()
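Note that the loop above wraps every physical line of the input on its own, which is what produces the short leftover lines in the ACTUAL OUTPUT of the question's update. To get the EXPECTED OUTPUT, the lines of a paragraph have to be joined before wrapping. Here is a minimal sketch of that idea, assuming paragraphs are separated by one or more blank lines; the function name rewrap is made up for illustration and is not part of textwrap.

```python
import re
import textwrap


def rewrap(text, width):
    """Re-wrap each blank-line-separated paragraph to at most `width` columns."""
    # Split on blank lines (empty or whitespace-only lines between paragraphs).
    paragraphs = re.split(r"\n\s*\n", text.strip())
    wrapped = []
    for para in paragraphs:
        # Collapse the original line breaks inside the paragraph into single
        # spaces, so the old line boundaries no longer influence the wrapping.
        joined = " ".join(para.split())
        wrapped.append(textwrap.fill(joined, width=width))
    # Put one blank line back between paragraphs.
    return "\n\n".join(wrapped)
```

With the names from the question, this could be used by reading the whole file with f.read(), calling rewrap(lines, wid), and writing the returned string to outFile. Runs of several blank lines collapse to a single blank line in this sketch.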
Upvotes: 0
Reputation: 56634
from textwrap import wrap

with open(inFile) as inf:
    lines = [line for para in inf for line in wrap(para, wid)]
with open(outFile, "w") as outf:
    outf.write("\n".join(lines))
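One caveat with the comprehension above: wrap returns an empty list for a blank or whitespace-only line, so the blank lines between paragraphs disappear from the output. If they should be kept, something along these lines might work. It is only a sketch, and the helper name wrap_keep_blank is hypothetical.

```python
from textwrap import wrap


def wrap_keep_blank(lines, width):
    """Wrap each input line; emit one empty string for each blank input line."""
    out = []
    for line in lines:
        # wrap() returns [] for blank input, so fall back to a single "".
        out.extend(wrap(line, width) or [""])
    return out
```

The result can be joined with "\n" and written out exactly as in the code above. Like that code, it still wraps each input line separately rather than reflowing whole paragraphs.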
Upvotes: 1