Tom Anonymous

Reputation: 173

removing blank lines from text file output python 3

I wrote a program in Python 3 that edits a text file and writes the edited version to a new text file. But the new file contains blank lines that I can't have, and I can't figure out how to get rid of them.

Thanks in advance.

import sys

newData = ""
i=0
run=1
j=0
k=1
seqFile = open('temp100.txt', 'r')
seqData = seqFile.readlines()

while i < 26:
    sLine = seqData[j]
    editLine = seqData[k]
    tempLine = editLine[0:20]
    newLine = editLine.replace(editLine, tempLine)
    newData = newData+sLine+'\n'+newLine+'\n'
    i=i+1
    j=j+2
    k=k+2
    run=run+1

seqFile.close()

new100 = open("new100a.fastq", "w")
sys.stdout = new100
print(newData)

Upvotes: 0

Views: 796

Answers (2)

Robert Kajic

Reputation: 9077

sLine already ends with a newline. newLine will also end with a newline if editLine is 20 characters long or shorter (counting the newline itself). You can change

newData = newData+sLine+'\n'+newLine+'\n'

to

newData = newData+sLine+newLine

In cases where editLine is longer than 20 characters, the trailing newline will be cut off when you do tempLine = editLine[0:20] and you will need to append a newline to newData yourself.
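A small sketch of that slicing behaviour. The two sample strings are made up for illustration and are not taken from the original file:

```python
# Hypothetical sample lines, shaped like what readlines() returns.
short_line = "ACGT\n"                       # 5 characters including the newline
long_line = "ACGTACGTACGTACGTACGTACGT\n"    # 25 characters including the newline

short_cut = short_line[0:20]   # the whole line fits, so the newline survives
long_cut = long_line[0:20]     # the newline falls outside the slice and is lost

print(repr(short_cut))
print(repr(long_cut))
```

So whether newLine ends with a newline depends on the length of the line it was cut from.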

According to the Python documentation on readline (which is used by readlines), trailing newlines are kept in each line:

Read one entire line from the file. A trailing newline character is kept in the string (but may be absent when a file ends with an incomplete line). If the size argument is present and non-negative, it is a maximum byte count (including the trailing newline) and an incomplete line may be returned. When size is not 0, an empty string is returned only when EOF is encountered immediately.

In general, you can often get a long way in debugging a program by printing the values of your variables when you get unexpected behaviour. For instance, printing sLine with print(repr(sLine)) would have shown you that there was a trailing newline in there.
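As a rough illustration of that debugging step, here is a sketch that uses io.StringIO as a stand-in for the real file. The file contents are invented for the example:

```python
import io

# io.StringIO stands in for the real file here; the contents are made up.
fake_file = io.StringIO("@read1\nACGTACGT\n")
lines = fake_file.readlines()

for line in lines:
    # repr() shows the trailing '\n' as text instead of printing a line break.
    print(repr(line))
```

Each printed value ends in \n inside the quotes, which makes the extra newline easy to spot.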

Upvotes: 1

StasH

Reputation: 86

The problem is on this line:

newData = newData+sLine+'\n'+newLine+'\n'

sLine already contains a newline character, so you should remove the first '\n'. If editLine is 20 characters long or shorter (counting its newline), newLine also contains the newline. Otherwise the slice cuts it off and you should add a newline yourself.

Try this:

newData = newData + sLine + newLine
if len(seqData[k]) > 20:
    newData += '\n'
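An alternative sketch, assuming it is acceptable to strip the newline from every line first and then add exactly one back. The function name trim_pairs and the sample data are hypothetical. Unlike the original loop, which stops after 26 pairs, this version processes every line it is given:

```python
def trim_pairs(lines, width=20):
    """Keep even-indexed lines whole and cut odd-indexed lines to `width` characters."""
    out = []
    for index, line in enumerate(lines):
        text = line.rstrip("\n")     # drop the newline kept by readlines()
        if index % 2 == 1:
            text = text[:width]      # trim every second line
        out.append(text + "\n")      # add back exactly one newline
    return "".join(out)


# Made-up sample input; the last line deliberately has no trailing newline.
sample = ["@read1\n", "ACGTACGTACGTACGTACGTACGT\n", "@read2\n", "ACGT"]
result = trim_pairs(sample)
print(result, end="")
```

Writing the result with new100.write(result) rather than print(result) also avoids the one extra newline that print appends at the end of the output.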

Upvotes: 1
