D.H.N

Reputation: 67

Get specific lines and their coordinates in Python

I have a file that contains two atoms named "CA", but I need only the "CA" belonging to 1GLY, plus all of the "OW" atoms.

Here is my file:

Generated by trjconv :

1GLY      N    1   1.081   1.128   1.298
1GLY     H2    2   1.095   1.126   1.401
1GLY     H3    3   1.020   1.211   1.285
1GLY     CA    4   1.204   1.158   1.219
1GLY      O    5   1.322   1.290   1.382
2GLY      N    6   1.265   1.392   1.193
2GLY     CA    7   1.324   1.520   1.234
2GLY    HA1    8   1.417   1.511   1.288
2GLY    HA2    9   1.334   1.573   1.141  
3SOL     OW   10   1.351   1.298   2.103
3SOL    HW1   11   1.375   1.395   2.102
3SOL    HW2   12   1.274   1.282   2.041
4SOL     OW   13   1.568   0.586   2.355
4SOL    HW1   14   1.643   0.623   2.410
4SOL    HW2   15   1.513   0.661   2.319
5SOL     OW   16   2.107   1.692   1.802
5SOL    HW1   17   2.064   1.627   1.740
5SOL    HW2   18   2.074   1.784   1.781 
and so on..

Here is my Python code:

import re

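k = 0  # frame counter, incremented at each "Generated by" title line

# Open both files in one with-statement so they are closed automatically
with open('abc.dat') as F, open('def.dat', 'w') as A:
    for x in F:
        line = x.strip()

        if line.startswith("Generated by"):
            k = k + 1

        if re.search('CA|OW', line):
            A.write(str(k) + '\t')

            # Copy the fixed-width coordinate columns character by character
            for i in range(22, 44):
                A.write(x[i])

            A.write('\n')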

But I am getting output (the x, y, z coordinates) for both CA atoms as well as the OW atoms, i.e.:

1    1.204   1.158   1.219    
1    1.324   1.520   1.234  
1    1.351   1.298   2.103   
1    1.568   0.586   2.355

1) I don't want the coordinates of the 2GLY CA (i.e., 1 1.324 1.520 1.234).

2) How can I multiply the values by 10 in the code, so that the output is in Angstroms? (The values in the file are in nanometers.)

How do I fix this? Any suggestions are appreciated.

Upvotes: 1

Views: 315

Answers (1)

krock1516

Reputation: 461

This is pretty simple. I saved your file as dhn.txt. To get only the CA of 1GLY, you can do the following:

with open('dhn.txt', mode='rt', encoding='utf-8') as f:
    for line in f:
        if line.startswith("1GLY"):
            if "CA" in line:
                print(line)

$ ./dhn.py
1GLY     CA    4   1.204   1.158   1.219

A second, quick-and-dirty approach that also picks up the OW lines:

with open('dhn.txt', mode='rt', encoding='utf-8') as f:
    for line in f:
        if (line.startswith("1GLY")) or ('OW' in line):
            if "CA" in line or "OW" in line:
                print(line)

The results are as expected (the blank lines appear because print adds a newline to a line that already ends with one):

$ ./dhn.py
1GLY     CA    4   1.204   1.158   1.219

3SOL     OW   10   1.351   1.298   2.103

4SOL     OW   13   1.568   0.586   2.355

5SOL     OW   16   2.107   1.692   1.802
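
The snippets above print the matching lines but do not cover the second part of the question, the conversion from nanometers to Angstroms. Here is a minimal sketch of one way to do both at once. It assumes the columns are separated by whitespace, as in the sample shown in the question; in large coordinate files the columns can run together, in which case fixed-width slicing would be needed instead. The function name extract_coords and the scale parameter are my own, not part of any library.

```python
import io


def extract_coords(lines, scale=10.0):
    """Yield (frame, x, y, z) for the CA of 1GLY and for every OW atom.

    Coordinates are multiplied by `scale` (10.0 converts nm to Angstroms).
    Assumes whitespace-separated columns: residue, atom, index, x, y, z.
    """
    frame = 0
    for raw in lines:
        # Each frame starts with a "Generated by ..." title line
        if raw.startswith("Generated by"):
            frame += 1
            continue

        fields = raw.split()
        if len(fields) < 6:
            continue  # skip blank lines and other non-atom lines

        residue, atom = fields[0], fields[1]
        # Compare whole fields rather than substrings of the line
        if (residue == "1GLY" and atom == "CA") or atom == "OW":
            try:
                x, y, z = (float(v) * scale for v in fields[3:6])
            except ValueError:
                continue  # columns are not numeric, so not an atom line
            yield frame, x, y, z


# Small sample taken from the question's data
sample = """Generated by trjconv :
1GLY     CA    4   1.204   1.158   1.219
2GLY     CA    7   1.324   1.520   1.234
3SOL     OW   10   1.351   1.298   2.103
3SOL    HW1   11   1.375   1.395   2.102
"""

for frame, x, y, z in extract_coords(io.StringIO(sample)):
    print(f"{frame}\t{x:.2f}\t{y:.2f}\t{z:.2f}")
```

Because the function accepts any iterable of lines, you can pass it an open file instead of the sample, e.g. open abc.dat and def.dat together in a with-statement and write each formatted row to the output file. Matching on the split fields also avoids accidental hits when "CA" or "OW" happens to appear elsewhere in a line.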

Upvotes: 1
