Reputation: 2141

How to remove all whitespace and newlines?

Assuming I have a file that contains the following:

Assume <tab> is actually a tab and <space> is actually a space. (ignore quotes)

"

    <tab><tab>

    <space>
    <tab>
    The clothes at
    the superstore are
    at a discount today.
"

Assume this is in a text file. How do I remove all the whitespace such that the resulting text file is (ignore the quotes):

"
    The clothes at
    the superstore are
    at a discount today.
"

Upvotes: 1

Answers (5)

pcurry

Reputation: 1414

If you want to preserve indentation and trailing space on the lines in your output file, test the stripped line, but write the raw line.

This also uses context managers, and works in Python 2.7:

with open('EXISTINGFILE', 'r') as fin, open('NEWFILE', 'w') as fout:
    for line in fin:
        if line.strip():
            fout.write(line)

If you want to do other processing, I'd suggest defining that in its own function body, and calling that function:

def process_line(line):
    # for example
    return ''.join(('Payload:\t', line.strip().upper(), '\tEnd Payload\n'))

with open('EXISTINGFILE', 'r') as fin, open('NEWFILE', 'w') as fout:
    for line in fin:
        if line.strip():
            fout.write(process_line(line))

Rereading your question, I see that you only asked about removing whitespace at the beginning of your file. If you want to get EVERY line of your source file after a certain condition is met, you can set a flag for that condition, and switch your output based on the flag.

For example, if you want to remove initial whitespace-only lines, process non-whitespace lines, and keep whitespace-only lines unchanged once you have at least one line of data, you could do this:

def process_line(line):
    # for example
    return ''.join(('Payload:\t', line.strip().upper(), '\tEnd Payload\n'))

with open('EXISTINGFILE', 'r') as fin, open('NEWFILE', 'w') as fout:
    have_paydata = False
    for line in fin:
        if line.strip():
            # First non-blank line seen: from here on, blank lines are kept
            have_paydata = True
            fout.write(process_line(line))
        elif have_paydata:
            fout.write(line)
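
If no per-line processing is needed, the "skip only the leading blank lines" case can also be sketched with `itertools.dropwhile`, which discards items while a test holds and then passes everything else through untouched. This is only an illustrative alternative; the helper name `drop_leading_blank_lines` and the sample data are assumptions, not part of the code above.

```python
import itertools


def drop_leading_blank_lines(lines):
    """Return an iterator over lines, skipping whitespace-only lines at the start."""
    # dropwhile stops testing after the first line with real content,
    # so blank lines that appear later are kept as-is.
    return itertools.dropwhile(lambda line: not line.strip(), lines)


sample = ["\t\t\n", " \n", "    The clothes at\n", "\n", "    the superstore are\n"]
result = list(drop_leading_blank_lines(sample))
print(result)
```

Since a file object iterates over its lines, something like `fout.writelines(drop_leading_blank_lines(fin))` should fit inside the same `with` block.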

Upvotes: 1

Fredrik Pihl

Reputation: 45672

Something like this perhaps (don't know if you need a python solution or if cmdline-tools are ok):

$ cat -t INPUT
   ^I^I
^I^I
"^I
^I^I^I
^I  ghi
"

$ sed '/^[      ]*$/d' INPUT
"   
      ghi
"

I.e. remove lines containing only spaces and/or tabs, as well as empty lines.
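
If a Python solution is needed after all, a rough counterpart to the `sed` filter might look like the sketch below. Like the `sed` pattern, it assumes that only spaces and tabs count as blank; the function name `delete_blank_lines` is made up for illustration, and `splitlines(keepends=True)` requires Python 3.

```python
import re

# Matches a line made up only of spaces and/or tabs (or nothing at all).
BLANK = re.compile(r"[ \t]*")


def delete_blank_lines(text):
    """Return text without lines that are empty or contain only spaces/tabs."""
    kept = [
        line
        for line in text.splitlines(keepends=True)
        # Compare without the line terminator, as sed does.
        if not BLANK.fullmatch(line.rstrip("\r\n"))
    ]
    return "".join(kept)


# Same content as the INPUT file shown by `cat -t` (^I is a tab).
text = "   \t\t\n\t\t\n\"\t\n\t\t\t\n\t  ghi\n\"\n"
print(delete_blank_lines(text))
```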

Upvotes: 1

Mark Tolonen

Reputation: 177901

lstrip will remove all whitespace from the beginning of a string. If you need to keep the leading whitespace on the first text line, use a regex instead:

import re

data = '''\

    \t\t


    \t
    The clothes at
    the superstore are
    at a discount today.
'''

# Remove ALL whitespace from the start of string
print(data.lstrip())
# Remove all whitespace from start of string up to and including a newline
print(re.sub(r'^\s*\n',r'',data))

Output:

The clothes at
    the superstore are
    at a discount today.

    The clothes at
    the superstore are
    at a discount today.

To modify a file this way:

# A with statement closes the file on exit from the block
with open('data.txt') as f:
    data = f.read()
data = re.sub(r'^\s*\n',r'',data)
with open('data.txt','w') as f:
    f.write(data)

Upvotes: 0

richsilv

Reputation: 8013

Try this, assuming you don't want to overwrite the old file. Easy to adapt if you do:

oldfile = open("EXISTINGFILENAME", "r")
data = oldfile.read()
oldfile.close()
stripped_data = data.lstrip()
newfile = open("NEWFILENAME", "w")
newfile.write(stripped_data)
newfile.close()

Note that this will only remove leading whitespace. To remove any trailing whitespace as well, use strip in place of lstrip.
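
To make the difference concrete, here is a small sketch on an in-memory string (the sample text is an assumption modelled on the question). Note that `lstrip` also removes the indentation in front of the first text line.

```python
data = "\n\t\t\n \n\t\n    The clothes at\n    the superstore are\n"

# lstrip() drops every leading whitespace character, including the
# indentation in front of the first word.
left = data.lstrip()

# strip() additionally drops the trailing newline.
both = data.strip()

print(repr(left))
print(repr(both))
```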

Upvotes: 1

qwwqwwq

Reputation: 7329

strip() removes all leading and trailing whitespace; we then test whether any characters are left in the line:

with open("file.txt", "r") as f:
    for line in f:
        if len(line.strip()):
            # Drop the line's own newline so print does not double it
            print(line.rstrip('\n'))

Upvotes: 0
