Reputation: 83
I have many text files, and each of them has an empty line at the end. My script does not seem to remove them. Can anyone help, please?
# python 2.7
import os
import sys
import re

filedir = 'F:/WF/'
dir = os.listdir(filedir)

for filename in dir:
    if 'ABC' in filename:
        filepath = os.path.join(filedir,filename)
        all_file = open(filepath,'r')
        lines = all_file.readlines()
        output = 'F:/WF/new/' + filename

        # Read in each row and parse out components
        for line in lines:
            # Weed out blank lines
            line = filter(lambda x: not x.isspace(), lines)

            # Write to the new directory
            f = open(output,'w')
            f.writelines(line)
            f.close()
Upvotes: 7
Views: 12660
Reputation: 36623
While this question has been answered, a solution for large files or many small files should not read the entire file into memory only to write the entire contents back out.
Instead, we can use the file pointer to seek to the end of the file and check backwards for the trailing characters. Below are two functions: one for standard "read-all write-all" and a second one that just truncates the file at the correct location.
import os
import string

def trim_file_read(file):
    """
    Reads the file, rstrips whitespace, and writes out.
    """
    with open(file, 'rb') as fp:
        text = fp.read().rstrip()
    with open(file, 'wb') as fp:
        fp.write(text)

def trim_reverse_seek(file):
    """
    Trims whitespace by truncating file at start of the
    trailing whitespace characters.
    """
    ws = string.whitespace.encode()
    with open(file, "rb+") as fp:
        # Start at the last byte of the file.
        fp.seek(-1, os.SEEK_END)
        c = fp.read(1)
        # Step back one byte at a time while the byte is whitespace.
        while c in ws:
            fp.seek(-2, os.SEEK_CUR)
            c = fp.read(1)
        # The pointer now sits just past the last non-whitespace byte.
        fp.truncate(fp.tell())
I tested these on a collection of random text files ending with newline characters. The collection was 10 x 100 MB, 30 x 10 MB, 100 x 1 MB, and 300 x 1 KB files. The first method took 6.2 seconds; the latter took 1.4 seconds.
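One caveat with the seek-based approach: seeking backwards raises an error on an empty file, and also on a file that contains nothing but whitespace. A minimal sketch of a variant that tracks the position explicitly and stops at the start of the file is shown here. The name trim_reverse_seek_safe is hypothetical, and it was not part of the timing test above.

```python
import os
import string

# ASCII whitespace as bytes: space, tab, newline, carriage return, VT, FF.
WHITESPACE = string.whitespace.encode()

def trim_reverse_seek_safe(path):
    """Truncate trailing ASCII whitespace in place.

    Hypothetical variant of trim_reverse_seek that also tolerates
    empty files and files containing only whitespace.
    """
    with open(path, "rb+") as fp:
        # Seeking to the end returns the file size in bytes.
        pos = fp.seek(0, os.SEEK_END)
        # Walk backwards one byte at a time while the byte is whitespace,
        # never moving before the start of the file.
        while pos > 0:
            fp.seek(pos - 1)
            if fp.read(1) not in WHITESPACE:
                break
            pos -= 1
        # pos is now the offset just past the last non-whitespace byte.
        fp.truncate(pos)
```

Like the original, this only touches the tail of the file, so the cost depends on the amount of trailing whitespace and not on the file size.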
Upvotes: 0
Reputation: 146
You can remove the last blank line with the following code. This worked for me:
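# Split into lines first; iterating over the raw string would yield characters.
with open(file_path_src, 'r') as file:
    lines = file.read().splitlines()

with open(file_path_dst, 'w') as f:
    for indx, line in enumerate(lines):
        f.write(line)
        # Write a newline after every line except the last one.
        if indx != len(lines) - 1:
            f.write('\n')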
Upvotes: 1
Reputation: 46759
You can use Python's rstrip() string method to do this as follows:
filename = "test.txt"

with open(filename) as f_input:
    data = f_input.read().rstrip('\n')

with open(filename, 'w') as f_output:
    f_output.write(data)
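This removes all trailing newline characters, and therefore all empty lines, from the end of the file. Note that it also removes the final line terminator, so the file is left unchanged only if it does not end with a newline.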
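To make the behaviour of rstrip() concrete, here is a small sketch on in-memory strings. The sample strings and variable names are made up for illustration.

```python
text = "first\nsecond\n\n\n"

# rstrip('\n') strips only newline characters from the right-hand end.
only_newlines = text.rstrip('\n')

# Trailing spaces are kept, because ' ' is not in the set of characters to strip.
keeps_spaces = "first\nsecond \n".rstrip('\n')

# rstrip() with no argument strips any trailing whitespace.
any_whitespace = "first\nsecond  \n \n".rstrip()

# To keep exactly one final newline, add it back after stripping.
with_final_newline = text.rstrip('\n') + '\n'

print(repr(only_newlines))
print(repr(keeps_spaces))
print(repr(any_whitespace))
print(repr(with_final_newline))
```

If the trailing "empty" lines may contain spaces or tabs, rstrip() without an argument is probably the better fit. If other tools expect the file to end with a newline, the last form keeps one.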
Upvotes: 6
Reputation: 71451
You can try this without using the re module:
import os

filedir = 'F:/WF/'
dir = os.listdir(filedir)

for filename in dir:
    if 'ABC' in filename:
        filepath = os.path.join(filedir,filename)
        # Read all lines before reopening the file for writing.
        f = open(filepath).readlines()
        new_file = open(filepath, 'w')
        new_file.write('')
        # Write back everything except the last line.
        for i in f[:-1]:
            new_file.write(i)
        new_file.close()
For each matching file, the code reads the contents line by line, reopens the file for writing (which overwrites it), and then writes back every element of f except the last one, which is the empty line.
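Because f[:-1] drops the last line unconditionally, a file that does not end with an empty line would lose real content. A minimal sketch of a guard, assuming the lines come from readlines(), is shown here. The helper name drop_last_line_if_blank is hypothetical.

```python
def drop_last_line_if_blank(lines):
    """Return lines without the final element when it is empty or
    whitespace-only; otherwise return the list unchanged."""
    if lines and not lines[-1].strip():
        return lines[:-1]
    return lines

# Example with lists shaped like the output of readlines().
print(drop_last_line_if_blank(["a\n", "b\n", "\n"]))
print(drop_last_line_if_blank(["a\n", "b\n"]))
```

In the loop above, iterating over drop_last_line_if_blank(f) in place of f[:-1] would leave files without a trailing empty line untouched.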
Upvotes: 1
Reputation: 5613
You can remove the last empty line by using:
with open(filepath, 'r') as f:
    data = f.read()

with open(output, 'w') as w:
    # Drop the final character (the trailing newline).
    w.write(data[:-1])
Upvotes: 3