Ovisek

Reputation: 143

Read specific lines from a file using Python

I have a file with data like:

   1xxy
   (1gmh)

[white line]
ahdkfkbbmhkkkkkyllllkkjdttyshhaggdtdyrrrutituy
[white line]  
   __________________________________________________
   Intra Chain:
   A 32
   __________________________________________________
   PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32
   PAIR 1MNS UR 11 NM A ARG 33 OX2 3.21 12
   PAIR IMNS UK 32 NH A ASN 43 OZ1 5.21 22
   ...
   __________________________________________________

Now I want to make it look like:

   PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32
   PAIR 1MNS UR 11 NM A ARG 33 OX2 3.21 12
   PAIR IMNS UK 32 NH A ASN 43 OZ1 5.21 22
   ...

i.e. remove everything except the PAIR lines. I tried using:

inp = open('c:/users/rox/desktop/1UMG.out','r')
for line in inp:
    if not line.strip():      # to remove excess white lines
       continue
    else:
       z = line.strip().replace('\t',' ')
       if z.startswith('PAIR'):
          print z
inp.close()

But this code gives me no output. I can't figure out why z.startswith('PAIR') is not working; everything up to the previous line works fine.

Upvotes: 0

Views: 470

Answers (2)

Levon

Reputation: 143037

Looks like you are looking only at lines that start with PAIR, so why not something simple like this:

with open('data.txt') as infp:
    for line in infp:
        line = line.strip()
        if line.startswith('PAIR'):
            print(line)

will give:

PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32
PAIR 1MNS UR 11 NM A ARG 33 OX2 3.21 12
PAIR IMNS UK 32 NH A ASN 43 OZ1 5.21 22

This output drops the three leading spaces; it would be trivial to add them back if needed.

Note: using with automatically closes the file for you when you are done, or when an exception is encountered.
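
If the original indentation matters, one possible variant tests a stripped copy of each line but keeps the original text. This is only a sketch: the function name pair_lines and the in-memory sample (standing in for the real file) are illustrative assumptions, not part of the question.

```python
import io


def pair_lines(lines):
    """Yield lines whose first non-blank text is PAIR, keeping their indentation."""
    for line in lines:
        # Test a stripped copy, but keep the original text minus the line ending.
        if line.strip().startswith('PAIR'):
            yield line.rstrip('\r\n')


# Hypothetical sample standing in for the real input file.
sample = io.StringIO(
    "   Intra Chain:\n"
    "   A 32\n"
    "   PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32\n"
    "\n"
    "   PAIR 1MNS UR 11 NM A ARG 33 OX2 3.21 12\n"
)

kept = list(pair_lines(sample))
for line in kept:
    print(line)
```

Because pair_lines accepts any iterable of lines, an open file object can be passed in place of the sample.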

Upvotes: 6

Vidul

Reputation: 10528

In addition to @Levon's explanation: since the file object supports the iterator protocol, a list comprehension can be used, provided the file is small enough for the matching lines to fit in memory:

with open('test.txt') as f:
    pairs = [l.strip() for l in f if l.strip().startswith('PAIR')]
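
For a file too large to collect into a list, a streaming variant may be preferable. The sketch below writes each matching line to a second file as it is read. The helper name copy_pair_lines, the file names, and the temporary demo data are all hypothetical.

```python
import os
import tempfile


def copy_pair_lines(src_path, dst_path):
    """Stream PAIR lines from src_path into dst_path; return the number written."""
    count = 0
    with open(src_path) as src, open(dst_path, 'w') as dst:
        for line in src:
            stripped = line.strip()
            # Only one line is held in memory at a time.
            if stripped.startswith('PAIR'):
                dst.write(stripped + '\n')
                count += 1
    return count


# Demo on a throwaway file (hypothetical data modelled on the question).
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, 'test.txt')
dst_path = os.path.join(tmp_dir, 'pairs.txt')
with open(src_path, 'w') as f:
    f.write(
        "   A 32\n"
        "   PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32\n"
        "\n"
        "   PAIR IMNS UK 32 NH A ASN 43 OZ1 5.21 22\n"
    )

written = copy_pair_lines(src_path, dst_path)
with open(dst_path) as f:
    result = f.read()
print(written)
```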

Upvotes: 0
