Reputation: 2141
Assuming I have a file that contains the following:
Assume <tab> is actually a tab and <space> is actually a space (ignore the quotes).
"
<tab><tab>
<space>
<tab>
The clothes at
the superstore are
at a discount today.
"
Assume this is in a text file. How do I remove all the leading whitespace so that the resulting text file is (ignore the quotes):
"
The clothes at
the superstore are
at a discount today.
"
Upvotes: 1
Views: 7142
Reputation: 1414
If you want to preserve indentation and trailing space on the lines in your output file, test the stripped line, but write the raw line.
This also uses context managers, and works in Python 2.7:
with open('EXISTINGFILE', 'r') as fin, open('NEWFILE', 'w') as fout:
    for line in fin:
        # Test the stripped line, but write the raw line
        if line.strip():
            fout.write(line)
If you want to do other processing, I'd suggest defining that in its own function body, and calling that function:
def process_line(line):
    # for example
    return ''.join(('Payload:\t', line.strip().upper(), '\tEnd Payload\n'))

with open('EXISTINGFILE', 'r') as fin, open('NEWFILE', 'w') as fout:
    for line in fin:
        if line.strip():
            fout.write(process_line(line))
Rereading your question, I see that you only asked about removing whitespace at the beginning of your file. If you want to get EVERY line of your source file after a certain condition is met, you can set a flag for that condition, and switch your output based on the flag.
For example, if you want to drop the initial whitespace-only lines, process the non-whitespace lines, and pass through unchanged any whitespace-only lines that appear after the first line of data, you could do this:
def process_line(line):
    # for example
    return ''.join(('Payload:\t', line.strip().upper(), '\tEnd Payload\n'))

with open('EXISTINGFILE', 'r') as fin, open('NEWFILE', 'w') as fout:
    have_paydata = False
    for line in fin:
        if line.strip():
            # The first non-blank line sets the flag; later assignments are harmless
            have_paydata = True
            fout.write(process_line(line))
        elif have_paydata:
            # Blank line after data has started: write it unchanged
            fout.write(line)
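The flag logic above can also be expressed with itertools.dropwhile from the standard library. This is only a sketch of the same idea, not a replacement for the answer's code: the helper name drop_leading_blank_lines and the sample list are made up for illustration, and it leaves out the process_line step.

```python
from itertools import dropwhile

def drop_leading_blank_lines(lines):
    """Skip whitespace-only lines until the first line with content,
    then yield every remaining line unchanged (blank ones included)."""
    return dropwhile(lambda line: not line.strip(), lines)

# Hypothetical sample standing in for the lines of a file
sample = ["\t\t\n", " \n", "\t\n", "The clothes at\n", "\n", "the superstore are\n"]
result = list(drop_leading_blank_lines(sample))
print(result)
```

Because an open file object is itself an iterable of lines, the same helper should work as fout.writelines(drop_leading_blank_lines(fin)) inside the with block.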
Upvotes: 1
Reputation: 45672
Something like this perhaps (don't know if you need a python solution or if cmdline-tools are ok):
$ cat -t INPUT
^I^I
^I^I
"^I
^I^I^I
^I ghi
"
$ sed '/^[[:space:]]*$/d' INPUT
"
ghi
"
I.e., remove lines containing only spaces and/or tabs, as well as empty lines.
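If a Python solution is needed after all, roughly the same filter can be written with the re module. This is a sketch under the assumption that "blank" means only spaces and tabs; the function name delete_blank_lines and the sample text are illustrative. Like the sed command, it deletes blank lines everywhere in the input, not only the leading ones.

```python
import re

# A line is blank if it holds nothing but spaces and tabs (or nothing at all)
BLANK_LINE = re.compile(r'^[ \t]*$')

def delete_blank_lines(text):
    """Return text with every empty or space/tab-only line removed."""
    kept = [
        line
        for line in text.splitlines(True)  # True keeps the line endings
        if not BLANK_LINE.match(line.rstrip('\r\n'))
    ]
    return ''.join(kept)

# Hypothetical input resembling the cat -t listing above
sample = '\t\t\n \n"\t\n\t ghi\n"\n'
cleaned = delete_blank_lines(sample)
print(repr(cleaned))
```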
Upvotes: 1
Reputation: 177901
lstrip will remove all whitespace from the beginning of a string. If you need to keep the leading whitespace on the first text line, use a regex instead:
import re
data = '''\
\t\t
\t
The clothes at
the superstore are
at a discount today.
'''
# Remove ALL whitespace from the start of string
print(data.lstrip())
# Remove all whitespace from start of string up to and including a newline
print(re.sub(r'^\s*\n',r'',data))
Output:
The clothes at
the superstore are
at a discount today.
The clothes at
the superstore are
at a discount today.
To modify a file this way:
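import re
# A with statement closes the file on exit from the block
with open('data.txt') as f:
    data = f.read()
# Drop the leading whitespace up to and including the last newline before the first text line
data = re.sub(r'^\s*\n', r'', data)
with open('data.txt', 'w') as f:
    f.write(data)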
Upvotes: 0
Reputation: 8013
Try this, assuming you don't want to overwrite the old file. Easy to adapt if you do:
oldfile = open("EXISTINGFILENAME", "r")
data = oldfile.read()
oldfile.close()
stripped_data = data.lstrip()
newfile = open("NEWFILENAME", "w")
newfile.write(stripped_data)
newfile.close()
Note that this only removes leading whitespace. To remove trailing whitespace as well, use strip in place of lstrip.
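To make the difference concrete, here is a small sketch comparing the three string methods on an in-memory string. The sample data is invented for illustration. Note that lstrip also removes the tab that indents the first text line.

```python
# Hypothetical sample: blank lines, an indented first text line, trailing blanks
data = "\t\t\n \n\tThe clothes at\nthe superstore are\n  \n"

left = data.lstrip()   # leading whitespace removed, trailing kept
right = data.rstrip()  # trailing whitespace removed, leading kept
both = data.strip()    # both ends removed

print(repr(left))   # 'The clothes at\nthe superstore are\n  \n'
print(repr(right))  # '\t\t\n \n\tThe clothes at\nthe superstore are'
print(repr(both))   # 'The clothes at\nthe superstore are'
```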
Upvotes: 1
Reputation: 7329
strip() removes all leading and trailing whitespace; after stripping, test whether any characters are left in the line:
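with open("file.txt", "r") as f:
    for line in f:
        # An all-whitespace line strips down to an empty string
        if len(line.strip()):
            # Drop the line's own newline so print does not double it
            print(line.rstrip("\n"))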
Upvotes: 0