saya

Reputation: 29

Reading xyz coordinates, formatting, and creating a dictionary using Python

I have the following input.

   -1.93716260932213      4.07761284665160      0.00026114755225      o
    6.18617624849570      1.21557897823238      0.00149060336893      o
    2.08819081881417      2.59383844400838      0.00029402878682      n
    2.97257640904282     -1.65736881444699     -0.00056145022980      n
   -1.36088269076778     -0.37920984224593     -0.00050286871993      c
   -0.53339788798729      2.26822332595375     -0.00000341410519      c
    0.43736009141134     -2.19626465902310     -0.00100572484170      c
   -4.13480467711929     -0.88129495575000      0.00005233281548      c
    3.94803054683376      0.76762677032173      0.00037150755793      c
   -0.03940495969409     -4.20532755533682     -0.00126348348509      h
    2.71228553263687      4.40896687397411      0.00089118224220      h
    4.27812393785853     -3.05506184574341     -0.00070847092229      h
   -5.03899119562699     -0.01950727743747     -1.66429295994022      h
   -5.03815196825505     -0.01998122074952      1.66509909190865      h
   -4.53994759632051     -2.91783106840876     -0.00012152198798      h

I expect to get the following output.

['o', 'n', 'n', 'c', 'c', 'c', 'c', 'c', 'h', 'h', 'h', 'h']
[[6.1861762484957, 1.21557897823238, 0.00149060336893], [2.08819081881417, 2.59383844400838, 0.00029402878682], [2.97257640904282, -1.65736881444699, -0.0005614502298], [-1.36088269076778, -0.37920984224593, -0.00050286871993], [-0.53339788798729, 2.26822332595375, -3.41410519e-06], [0.43736009141134, -2.1962646590231, -0.0010057248417], [-4.13480467711929, -0.88129495575, 5.233281548e-05], [3.94803054683376, 0.76762677032173, 0.00037150755793], [-0.03940495969409, -4.20532755533682, -0.00126348348509], [2.71228553263687, 4.40896687397411, 0.0008911822422], [4.27812393785853, -3.05506184574341, -0.00070847092229], [-5.03899119562699, -0.01950727743747, -1.66429295994022]]

I can get this with the following code:

import re

with open("coord", "r") as input_file:
    lines = input_file.readlines()

atom_order = []
coords = []
for line in lines[1:-2]:
    line_split = re.split(r"\s+", line.strip())  # raw string avoids an invalid-escape warning
    atom_order.append(line_split[-1])
    coords.append([float(val) for val in line_split[0:3]])

print(atom_order)
print(coords)

first_space = 4 * " "
first_space_neg = 3 * " "

space = 6 * " "
space_neg = 5 * " "

with open("test.out", "w") as output_file:
    for coord in coords:
        if coord[0] < 0:
            s1 = first_space_neg
        else:
            s1 = first_space

        if coord[1] < 0:
            s2 = space_neg
        else:
            s2 = space

        if coord[2] < 0:
            s3 = space_neg
        else:
            s3 = space

        output_file.write(s1 + f"{coord[0]:1.14f}" + s2 + f"{coord[1]:1.14f}" + s3 + f"{coord[2]:1.14f}" + "\n")

But this code breaks if the file begins with a comment line starting with a character such as a hash, an exclamation mark, or a dollar sign. For example:

 ## - a commented line with hash Here is when it will break.
   -1.93716260932213      4.07761284665160      0.00026114755225      o
    6.18617624849570      1.21557897823238      0.00149060336893      o

Could anyone help me fix this so that such lines are skipped?

Upvotes: 0

Views: 138

Answers (1)

MattDMo

Reputation: 102852

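Check whether the first non-whitespace character of the line is one of !, $, or #, and continue if it is. The comment line in your example starts with a space, so strip the line before testing it:

for line in lines[1:-2]:
    stripped = line.strip()
    # Skip blank lines and lines whose first visible character marks a comment
    if not stripped or stripped[0] in "!$#":
        continue
    line_split = re.split(r"\s+", stripped)
    atom_order.append(line_split[-1])
    coords.append([float(val) for val in line_split[0:3]])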
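
For reference, here is a self-contained sketch of the same idea that can be tried without the coord file. The function name parse_coords, its comment_chars parameter, and the sample list are made up for illustration and are not part of the original code. It also assumes every data line has exactly four fields (x, y, z, symbol) and it does not apply the lines[1:-2] slice, so add that back if your file needs it.

```python
def parse_coords(lines, comment_chars="!$#"):
    """Return (atom_order, coords) from an iterable of text lines.

    Hypothetical helper: skips blank lines, comment lines, and any line
    that does not look like 'x y z symbol'.
    """
    atom_order = []
    coords = []
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and lines whose first visible character marks a comment
        if not stripped or stripped[0] in comment_chars:
            continue
        fields = stripped.split()  # splits on any run of whitespace
        if len(fields) != 4:
            continue  # assumption: a data line has exactly four fields
        try:
            xyz = [float(val) for val in fields[:3]]
        except ValueError:
            continue  # the first three fields are not all numbers
        atom_order.append(fields[3])
        coords.append(xyz)
    return atom_order, coords


# Sample lines taken from the question, including the commented line
sample = [
    " ## - a commented line with hash Here is when it will break.\n",
    "   -1.93716260932213      4.07761284665160      0.00026114755225      o\n",
    "    6.18617624849570      1.21557897823238      0.00149060336893      o\n",
]
atom_order, coords = parse_coords(sample)
print(atom_order)
print(coords)
```

With a real file, the lines can be passed in directly, for example parse_coords(open("coord")), since a file object iterates over its lines.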

Upvotes: 2
