Camerann

Reputation: 35

Reading file with variable columns

I am trying to read an xyz file into Python but keep getting the error message below. I'm fairly new to Python, so I would love some help interpreting it!

def main():
    atoms = []
    coordinates = []
    name = input("Enter filename: ")
    xyz = open(name, 'r')
    n_atoms = xyz.readline()
    title = xyz.readline()
    for line in xyz:
        atom, x, y, z = line.split()
        atoms.append(atom)
        coordinates.append([float(x), float(y), float(z)])
    xyz.close()

    return atoms, coordinates


if __name__ == '__main__':
    main()

Error:
Traceback (most recent call last):
  File "Project1.py", line 25, in <module>
    main()
  File "Project1.py", line 16, in main
    atom, x, y, z = line.split()
ValueError: not enough values to unpack (expected 4, got 3)

I believe the ValueError occurs because, after a couple of lines, there are only 3 values on a line. But I am not sure why I am getting return errors.

Upvotes: 2

Views: 5327

Answers (2)

mcocdawc

Reputation: 1867

One very important rule of thumb, especially in Python, is: don't reinvent the wheel; use existing libraries.

The xyz format is one of the few universally standardized file formats in chemistry. So IMHO you don't need any logic to determine the length of your line. The first line is an integer n_atoms giving the number of atoms, the second line is a comment line that is ignored, and the next n_atoms lines are [string, float, float, float], as you have already written in your code. A file that deviates from this is probably corrupted.
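To make that layout concrete, here is a minimal sketch in plain Python that reads exactly n_atoms records and fails loudly on a deviating line. The function name read_xyz and the sample data are hypothetical and only for illustration; this is not how pandas or chemcoord do it internally.

```python
import io


def read_xyz(handle):
    """Parse an xyz-formatted text stream (hypothetical helper).

    Returns (atoms, coordinates, comment).
    """
    n_atoms = int(handle.readline())          # line 1: number of atoms
    comment = handle.readline().rstrip("\n")  # line 2: free-form comment
    atoms, coordinates = [], []
    # Read exactly n_atoms records, so trailing blank lines are never touched.
    for _ in range(n_atoms):
        fields = handle.readline().split()
        if len(fields) != 4:
            raise ValueError(f"expected 'atom x y z', got {fields!r}")
        atom, x, y, z = fields
        atoms.append(atom)
        coordinates.append([float(x), float(y), float(z)])
    return atoms, coordinates, comment


# Made-up sample data; a real file would be opened with open(name) instead.
sample = io.StringIO("2\nhydrogen molecule\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74\n\n")
atoms, coordinates, comment = read_xyz(sample)
```

Looping over range(n_atoms) rather than over the rest of the file means that anything after the last atom record is simply not read.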

Using the pandas library you can simply write:

import pandas as pd
# sep=r'\s+' splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
molecule = pd.read_table(inputfile, skiprows=2, sep=r'\s+',
                         names=['atom', 'x', 'y', 'z'])

Or you can use the chemcoord package, which has its own Cartesian class representing molecules in Cartesian coordinates:

import chemcoord as cc
molecule = cc.Cartesian.read_xyz(inputfile)

Disclaimer: I am the author of chemcoord.

Upvotes: 4

phihag

Reputation: 287835

You are getting errors because you unpack a list in the line

atom, x, y, z = line.split()

This only works if the line splits into exactly 4 items.

You have to define what happens when there are only 3 items in a line, like this (within the for loop):

for line in xyz:
    line_data = line.split()
    if len(line_data) == 3:
        # Behavior when only 3 items in a line goes here!
        # Add your code here!
        continue

    atom, x, y, z = line_data
    atoms.append(atom)
    coordinates.append([float(x), float(y), float(z)])

What your program does when it encounters a line with only 3 items depends on what you want it to do.
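As one possible choice, and purely as a sketch, the loop could skip every line that does not split into exactly 4 items and remember where it occurred, so the problem can be reported afterwards. The function name parse_atom_lines and the example lines are hypothetical; whether skipping is the right behavior depends on your data.

```python
def parse_atom_lines(lines):
    """Parse 'atom x y z' lines, collecting malformed ones (hypothetical helper)."""
    atoms, coordinates, skipped = [], [], []
    for number, line in enumerate(lines, start=1):
        line_data = line.split()
        if len(line_data) != 4:
            # Assumed policy: record the line number and text, then move on.
            # Blank lines split into 0 items, so they are recorded here too.
            skipped.append((number, line.rstrip("\n")))
            continue
        atom, x, y, z = line_data
        atoms.append(atom)
        coordinates.append([float(x), float(y), float(z)])
    return atoms, coordinates, skipped


# Made-up example: the second line is missing its atom label.
example_lines = ["C 0.0 0.0 0.0\n", "1.0 2.0 3.0\n", "O 0.0 0.0 1.2\n"]
atoms, coordinates, skipped = parse_atom_lines(example_lines)
```

Inspecting skipped after the loop shows which lines of the file were not in the expected format.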

Upvotes: 2
