Reputation: 143
I have a file with data like:
1xxy
(1gmh)
[white line]
ahdkfkbbmhkkkkkyllllkkjdttyshhaggdtdyrrrutituy
[white line]
__________________________________________________
Intra Chain:
A 32
__________________________________________________
PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32
PAIR 1MNS UR 11 NM A ARG 33 OX2 3.21 12
PAIR IMNS UK 32 NH A ASN 43 OZ1 5.21 22
...
__________________________________________________
Now I want to make it look like:
PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32
PAIR 1MNS UR 11 NM A ARG 33 OX2 3.21 12
PAIR IMNS UK 32 NH A ASN 43 OZ1 5.21 22
...
i.e. remove everything except the PAIR lines. I tried using:
inp = open('c:/users/rox/desktop/1UMG.out','r')
for line in inp:
    if not line.strip():  # skip blank lines
        continue
    else:
        z = line.strip().replace('\t',' ')
        if z.startswith('PAIR'):
            print(z)
inp.close()
But this code gives me no output. I can't figure out why z.startswith('PAIR')
is not working; everything up to the previous line works fine.
Upvotes: 0
Views: 470
Reputation: 143037
Looks like you are looking only at lines that start with PAIR, so why not something simple like this:
with open('data.txt') as infp:
    for line in infp:
        line = line.strip()
        if line.startswith('PAIR'):
            print(line)
This will give:
PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32
PAIR 1MNS UR 11 NM A ARG 33 OX2 3.21 12
PAIR IMNS UK 32 NH A ASN 43 OZ1 5.21 22
This output removes the leading 3 spaces; it would be trivial to add them back in if needed.
Note: using with will automatically close the file for you when you are done, or when an exception is encountered.
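If the leading spaces matter, one option is to test a stripped copy of each line but emit the line itself with only the newline removed. This is a minimal sketch, not a drop-in fix: the helper name pair_lines, its keep_indent flag, and the in-memory sample text are made up for illustration.

```python
import io

def pair_lines(lines, keep_indent=True):
    """Yield the lines whose first non-blank text is 'PAIR'.

    `lines` is any iterable of strings, e.g. an open file object.
    """
    for line in lines:
        # Test a stripped copy so indentation and blank lines do not matter.
        if line.strip().startswith('PAIR'):
            # Keep the original indentation, or drop it, as requested.
            yield line.rstrip('\n') if keep_indent else line.strip()

# Illustrative sample standing in for the real file (assumed layout).
sample = io.StringIO(
    "Intra Chain:\n"
    "A 32\n"
    "   PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32\n"
    "\n"
    "   PAIR 1MNS UR 11 NM A ARG 33 OX2 3.21 12\n"
)
for line in pair_lines(sample):
    print(line)
```

With a real file you would pass the file object opened in the with block instead of the sample.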
Upvotes: 6
Reputation: 10528
In addition to @Levon's explanation, since the file object supports the iterator protocol, and depending on the size of the file, a list comprehension can be used:
[l for l in open('test.txt') if l.startswith('PAIR')]
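One caveat: if the PAIR lines are indented in the file, a bare l.startswith('PAIR') will not match them, and the one-liner never closes the file explicitly. A hedged variant that combines the list comprehension with the with statement might look like this; the function name read_pair_lines is hypothetical.

```python
def read_pair_lines(path):
    """Return the stripped lines of the file at `path` that start with 'PAIR'."""
    # `with` closes the file even if an exception is raised while reading.
    with open(path) as infp:
        # lstrip() before the test so indented PAIR lines are matched too.
        return [line.strip() for line in infp
                if line.lstrip().startswith('PAIR')]
```

Because this builds the whole list in memory, it is best suited to files of modest size; for very large files the line-by-line loop is the safer choice.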
Upvotes: 0