Cammen

Reputation: 31

Python: How to extract string from text file to use as data

This is my first time writing a Python script, and I'm having some trouble getting started. Let's say I have a text file named Test.txt that contains this information:

                                   x          y          z      Type of atom
ATOM   1     C1  GLN D  10      26.395      3.904      4.923    C
ATOM   2     O1  GLN D  10      26.431      2.638      5.002    O
ATOM   3     O2  GLN D  10      26.085      4.471      3.796    O 
ATOM   4     C2  GLN D  10      26.642      4.743      6.148    C  

What I eventually want to do is write a script that finds the center of mass of these atoms. So basically I want to sum all of the x values in that file, with each number multiplied by a weight that depends on the type of atom.

I know I need to define the positions for each x-value, but I'm having trouble figuring out how to make these x-values be represented as numbers instead of as text in a string. I also have to keep in mind that I'll need to multiply these numbers by a value for the type of atom, so I need a way to keep them associated with each atom type. Can anyone push me in the right direction?

Upvotes: 3

Views: 11249

Answers (3)

Chang She

Reputation: 16970

If you have pandas installed, check out the read_fwf function, which imports a fixed-width file and creates a DataFrame (a 2-D tabular data structure). It will save you lines of code on import and also give you a lot of data-munging functionality if you want to do any additional data manipulation.
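
A minimal sketch of that approach, assuming pandas is installed and the file has the layout shown in the question. The column names, the inline SAMPLE text, and the masses dictionary are illustrative assumptions, not part of the original file. With a real file you would pass the path "Test.txt" instead of the StringIO object.

```python
import io

import pandas as pd

# Sample data copied from the question; with a real file, pass its path instead.
SAMPLE = """\
                                   x          y          z      Type of atom
ATOM   1     C1  GLN D  10      26.395      3.904      4.923    C
ATOM   2     O1  GLN D  10      26.431      2.638      5.002    O
ATOM   3     O2  GLN D  10      26.085      4.471      3.796    O
ATOM   4     C2  GLN D  10      26.642      4.743      6.148    C
"""

# Hypothetical column names chosen for this example.
names = ["record", "serial", "atom", "residue", "chain",
         "resnum", "x", "y", "z", "element"]

# Approximate atomic masses; extend for other elements.
masses = {"C": 12.0107, "O": 15.999}

# Skip the header row and let read_fwf infer column boundaries from the data rows.
df = pd.read_fwf(io.StringIO(SAMPLE), skiprows=1, header=None, names=names)

# Look up a mass for each row, then take the mass-weighted mean of each coordinate.
df["mass"] = df["element"].map(masses)
center = df[["x", "y", "z"]].mul(df["mass"], axis=0).sum() / df["mass"].sum()
print(center)
```

The x, y and z columns come back as floats, so no manual string conversion is needed.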

Upvotes: 0

ely

Reputation: 77404

mass_dictionary = {'C':12.0107,
                   'O':15.999
                   #Others...?
                  }

# If your files are this structured, you can just
# hardcode some column assumptions.
coords_idxs = [6,7,8]
type_idx = 9

# Open file and get lines; the with block closes the file automatically.
# Probably prudent to add try-except here for bad file names.
with open("Test.txt", 'r') as f_open:
    lines = f_open.readlines()

# Initialize a list to hold needed intermediate data, plus a running total mass.
output_coms = []
total_mass = 0.0

# Loop through the lines of the file.
for line in lines:

    # Split the line on white space.
    line_stuff = line.split()

    # If the line is empty or fails to start with 'ATOM', skip it.
    if (not line_stuff) or (line_stuff[0] != 'ATOM'):
        continue

    # Otherwise, append the mass-weighted coordinates to a list and increment total mass.
    mass = mass_dictionary[line_stuff[type_idx]]
    output_coms.append([mass * float(line_stuff[i]) for i in coords_idxs])
    total_mass += mass

# After getting all the data, divide each summed mass-weighted coordinate by the total mass.
avg_x, avg_y, avg_z = [sum(elem[i] for elem in output_coms) / total_mass for i in range(3)]


# A lot of this will be better with NumPy arrays if you'll be using this often or on
# larger files. Python Pandas might be an even better option if you want to just
# store the file data and play with it in Python.
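
As a rough sketch of the NumPy suggestion, assuming NumPy is installed: the function name center_of_mass and the inline sample lines are illustrative, and the column positions are the same hardcoded assumptions as above. np.average with a weights argument computes the mass-weighted mean directly.

```python
import numpy as np

MASSES = {"C": 12.0107, "O": 15.999}  # approximate atomic masses; extend as needed


def center_of_mass(lines, coord_idxs=(6, 7, 8), type_idx=9):
    """Return the mass-weighted mean (x, y, z) of all ATOM records in lines."""
    coords, weights = [], []
    for line in lines:
        fields = line.split()
        # Skip blank lines, the header, and anything that is not an ATOM record.
        if not fields or fields[0] != "ATOM":
            continue
        coords.append([float(fields[i]) for i in coord_idxs])
        weights.append(MASSES[fields[type_idx]])
    # Average over rows (axis=0), weighting each row by its atom's mass.
    return np.average(np.array(coords), axis=0, weights=np.array(weights))


# Sample lines copied from the question.
sample_lines = [
    "                                   x          y          z      Type of atom",
    "ATOM   1     C1  GLN D  10      26.395      3.904      4.923    C",
    "ATOM   2     O1  GLN D  10      26.431      2.638      5.002    O",
    "ATOM   3     O2  GLN D  10      26.085      4.471      3.796    O",
    "ATOM   4     C2  GLN D  10      26.642      4.743      6.148    C",
]

com = center_of_mass(sample_lines)
print(com)
```

Because the function accepts any iterable of lines, an open file object can be passed in place of sample_lines.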

Upvotes: 1

Florin Stingaciu

Reputation: 8275

Using Python's built-in open function you can open any file, so you can do something like the following. The snippet is not a solution to the whole problem, just an approach.

def read_file():
    f = open("filename", 'r')
    for line in f:
        line_list = line.split()
        # ... process line_list here ...
    f.close()

From this point on you have a nice setup for working with these values. The second line just opens the file for reading. The third line defines a for loop that reads the file one line at a time, and each line goes into the line variable.

The fourth line, with line.split(), breaks the string at every run of whitespace into a list. So line_list[0] will be the value in your first column, and so forth. From this point, if you have any programming experience, you can just use if statements and such to get the logic that you want.

** Also keep in mind that every value stored in that list is a string, so before performing any arithmetic such as addition you have to convert it to a number first, for example with float().
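
To illustrate the string caveat, here is a small sketch that converts the coordinate fields with float() and groups them by atom type. The function name read_coordinates, the column positions (6 to 8 for x, y, z and 9 for the type), and the sample lines are assumptions based on the layout in the question.

```python
from collections import defaultdict


def read_coordinates(lines):
    """Group (x, y, z) float tuples by atom type, e.g. {'C': [...], 'O': [...]}."""
    by_type = defaultdict(list)
    for line in lines:
        line_list = line.split()
        # Skip blank lines and the header row.
        if not line_list or line_list[0] != "ATOM":
            continue
        # split() returns strings, so convert before doing arithmetic.
        x, y, z = (float(value) for value in line_list[6:9])
        by_type[line_list[9]].append((x, y, z))
    return dict(by_type)


# Sample lines copied from the question.
sample_lines = [
    "                                   x          y          z      Type of atom",
    "ATOM   1     C1  GLN D  10      26.395      3.904      4.923    C",
    "ATOM   2     O1  GLN D  10      26.431      2.638      5.002    O",
    "ATOM   3     O2  GLN D  10      26.085      4.471      3.796    O",
    "ATOM   4     C2  GLN D  10      26.642      4.743      6.148    C",
]

atoms = read_coordinates(sample_lines)
print("26.395" + "3.904")                # strings concatenate instead of adding
print(sum(x for x, _, _ in atoms["O"]))  # numeric sum of the oxygen x values
```

Keeping the coordinates keyed by atom type makes it straightforward to multiply each group by its own mass later.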

* Edited for syntax correction

Upvotes: 0
