CiaranWelsh

Reputation: 7681

Removing all but one newline character from text file using python

I have some data printed out by some software and it has given me too many extra new lines. I'm trying to remove all extra new line characters whilst maintaining the column format of the following data:

[atRA]_0    [Cyp26A1_mRNA]_0    
1   0   0

1.999   0   0

2.998   0   0

3.997   0   0

4.996   0   0

This code simply doesn't work (the blank lines between the rows are left in place):

def remove_newline_from_copasi_report(self,copasi_data):
    with open(copasi_data) as f:
        lines=[]
        data = f.read()
        return data.rstrip()

Whereas this code removes all new lines and ruins the format:

def remove_newline_from_copasi_report(self,copasi_data):
    with open(copasi_data) as f:
        lines=[]
        data = f.read()
        return data.replace('\n','')

Does anybody know how to remove all but one newline character from each line of my text file?

Thanks

Upvotes: 1

Views: 281

Answers (3)

Padraic Cunningham

Reputation: 180391

You can iterate over the file object and keep only the lines for which line.strip() is truthy. There is no need to read all the content into memory and then try to replace; just filter as you iterate:

with open(copasi_data) as f:
    lines = "".join([line for line in f if line.strip()])
print(lines)

[atRA]_0    [Cyp26A1_mRNA]_0    
1   0   0
1.999   0   0
2.998   0   0
3.997   0   0
4.996   0   0

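To store only one line at a time, iterate in a loop applying the same logic, or turn the list comprehension into a generator expression and iterate over that:

with open(copasi_data) as f:
    for line in f:
        if line.strip():
            # line already ends with '\n'; strip it so print() does not add a blank line
            print(line.rstrip('\n'))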
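
Wrapped into a function shaped like the one in the question, the line-by-line filter might look like the sketch below. The name `remove_blank_lines`, the `path` parameter, and the temporary file used for the demonstration are illustrative assumptions, not part of the original code.

```python
import os
import tempfile


def remove_blank_lines(path):
    """Return the file's text with empty and whitespace-only lines dropped."""
    with open(path) as f:
        # line.strip() is an empty (falsy) string for blank lines,
        # so only lines with visible content are kept.
        return "".join(line for line in f if line.strip())


# Small demonstration with a temporary file (sample rows from the question).
sample = "[atRA]_0    [Cyp26A1_mRNA]_0    \n1   0   0\n\n1.999   0   0\n\n2.998   0   0\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)

cleaned = remove_blank_lines(tmp.name)
os.remove(tmp.name)
print(cleaned, end="")
```

Each kept line retains its own trailing newline, so the column layout and the single line break per row are preserved.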

Upvotes: 2

John Coleman

Reputation: 51998

lines = data.split('\n')
data = '\n'.join(line for line in lines if len(line) > 0)

should work
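
One caveat: the filter above only drops lines that are completely empty, so a line containing just spaces or tabs would survive. If the report may contain such lines, a variant along these lines could help. This is a sketch, not the answer's original method: `drop_blank_lines` is a hypothetical helper name, and it swaps in `str.splitlines()` and `str.strip()` for the plain split on '\n'.

```python
def drop_blank_lines(data):
    """Join the non-blank lines of `data` with single newlines."""
    # splitlines() copes with both '\n' and '\r\n' line endings;
    # strip() makes whitespace-only lines count as blank.
    return "\n".join(line for line in data.splitlines() if line.strip())


data = "a   b\n1   0   0\n\n   \n1.999   0   0\r\n\r\n2.998   0   0\n"
print(drop_blank_lines(data))
```

Note that the result has no trailing newline; append one if the file is written back to disk and needs it.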

Upvotes: 3

Finwood

Reputation: 3981

Simply look for double new lines and replace them with single new lines:

In [1]: data = """[atRA]_0    [Cyp26A1_mRNA]_0    
   ...: 1   0   0
   ...: 
   ...: 1.999   0   0
   ...: 
   ...: 2.998   0   0
   ...: 
   ...: 3.997   0   0
   ...: 
   ...: 4.996   0   0"""

In [2]: print(data.replace('\n\n', '\n'))
[atRA]_0    [Cyp26A1_mRNA]_0    
1   0   0
1.999   0   0
2.998   0   0
3.997   0   0
4.996   0   0
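
Keep in mind that a single `replace('\n\n', '\n')` makes one pass over the string, so a run of three or more newlines is shortened but not fully collapsed. A regular expression handles runs of any length. The sketch below assumes the blank lines contain no spaces; `collapse_newlines` is a made-up helper name for illustration.

```python
import re


def collapse_newlines(data):
    """Replace every run of two or more newlines with a single newline."""
    return re.sub(r"\n{2,}", "\n", data)


data = "1   0   0\n\n\n1.999   0   0\n\n2.998   0   0"
print(data.replace("\n\n", "\n"))  # a blank line remains after the first row
print(collapse_newlines(data))     # no blank lines remain
```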

Upvotes: 2
