user7269405

Reputation: 59

Split up a text file based on its contents

I have a text file that looks like this:

SYSTEM

DOF=UY,UZ,RX  LENGTH=FT  FORCE=Kip

JOINT
  1  X=0  Y=-132.644  Z=0
  2  X=0  Y=-80  Z=0
  3  X=0  Y=-40  Z=0
  4  X=0  Y=0  Z=0
  5  X=0  Y=40  Z=0
  6  X=0  Y=80  Z=0
  7  X=0  Y=132.644  Z=0

etc. for 10,000 joints.

I would like to write a script that reads this text file and outputs 4 single-column text files containing, respectively, the joint number, x coordinate, y coordinate, and z coordinate.

Is this possible? I am new to Python and tried something like the code below, but Python doesn't know what to do with SYSTEM in the text file, and I'm sure my method isn't correct:

os.chdir('/Users/DevEnv/')

with open('RawDataFile_445.txt') as a:
    for line in a.readlines():
        j=[]
        data=line.strip()
        data1=data.split(" ")
        for i in range(0,len(data1)):
            j.append(eval(data1[i]))
        joint=j

Upvotes: 2

Views: 102

Answers (2)

Jean-François Fabre

Reputation: 140287

Your solution is fragile:

  • you have to filter out the headers and other non-data lines
  • eval() is overkill and unsafe: it executes whatever text the file contains, and it fails with a NameError on a word like SYSTEM
  • readlines() can be very memory-hungry: it reads the whole file into memory, which you don't need. Just iterate over the file one line at a time
  • the code to write the output files is missing
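To illustrate the eval() point: each coordinate token has the form X=0, so the number can be pulled out by splitting on "=" and converting with float(). A minimal sketch (the helper name parse_token is made up for illustration, it is not part of the question's code):

```python
def parse_token(token):
    """Return the numeric value from a token such as 'Y=-132.644'."""
    # partition() splits on the first "=" only; float() raises ValueError
    # for anything that is not a number instead of executing it
    _, _, value = token.partition("=")
    return float(value)


print(parse_token("Y=-132.644"))
```

A header word such as SYSTEM then raises a ValueError that can be caught or filtered, rather than being evaluated as Python code.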

My solution uses a regular expression to extract all four values from a line in one go. See the comments in the code:

import re

# regex to extract data line    
r = re.compile(r"\s*(\d+)\s+X=(\S+)\s+Y=(\S+)\s+Z=(\S+)")

with open('RawDataFile_445.txt') as a:

    # open all 4 files with a meaningful name
    files=[open("file_{}.txt".format(x),"w") for x in ["J","X","Y","Z"]]
    for line in a:
        m = r.match(line)
        if m:
            # line matches: write in all 4 files (using zip to avoid doing
            # it one by one)
            for f,v in zip(files,m.groups()):
                f.write(v+"\n")

    # close all output files now that it's done
    for f in files:
        f.close()

You can test it by replacing the with open(...) as a: line (and dedenting its body) with:

a="""SYSTEM

DOF=UY,UZ,RX  LENGTH=FT  FORCE=Kip

JOINT
  1  X=0  Y=-132.644  Z=0
  2  X=0  Y=-80  Z=0
  3  X=0  Y=-40  Z=0
  4  X=0  Y=0  Z=0
  5  X=0  Y=40  Z=0
  6  X=0  Y=80  Z=0
  7  X=0  Y=132.644  Z=0""".splitlines()

This emulates the lines of the input file (that's how I answer questions that involve input files, to avoid creating them on my system). You'll see that the 4 files are created and filled (don't forget the close part at the end of the code).
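A variant of the same idea, as a sketch rather than a definitive version: wrapping the loop in a function that accepts any iterable of lines makes it easy to test with io.StringIO, and contextlib.ExitStack closes the four output files even if an error occurs midway. The names split_joints, DATA_LINE and out_dir are assumptions introduced here for illustration.

```python
import io
import os
import re
import tempfile
from contextlib import ExitStack

# same pattern as above: joint number, then the X, Y and Z values
DATA_LINE = re.compile(r"\s*(\d+)\s+X=(\S+)\s+Y=(\S+)\s+Z=(\S+)")


def split_joints(lines, out_dir="."):
    """Write joint number, X, Y, Z to file_J.txt ... file_Z.txt in out_dir."""
    count = 0
    with ExitStack() as stack:
        # every file registered here is closed when the with block exits
        files = [
            stack.enter_context(
                open(os.path.join(out_dir, "file_{}.txt".format(x)), "w")
            )
            for x in ["J", "X", "Y", "Z"]
        ]
        for line in lines:
            m = DATA_LINE.match(line)
            if m:  # header and blank lines do not match and are skipped
                for f, v in zip(files, m.groups()):
                    f.write(v + "\n")
                count += 1
    return count


# in-memory stand-in for the input file
sample = io.StringIO(
    "SYSTEM\n\nJOINT\n  1  X=0  Y=-132.644  Z=0\n  2  X=0  Y=-80  Z=0\n"
)
with tempfile.TemporaryDirectory() as tmp:
    n = split_joints(sample, tmp)
    with open(os.path.join(tmp, "file_Y.txt")) as f:
        y_values = f.read().split()
print(n, y_values)
```

With the real data, the call would be split_joints(a) inside the with open('RawDataFile_445.txt') as a: block.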

Upvotes: 2

rassar

Reputation: 5680

Try:

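xlist, ylist, zlist = [], [], []
with open('RawDataFile_445.txt') as file:
    for line in file:
        stripped = line.strip()
        # data lines start with the joint number; skip headers and blank lines
        if stripped and stripped[0].isdigit():
            s = stripped.split("=")
            # s looks like ['1  X', '0  Y', '-132.644  Z', '0']:
            # the first token of each piece after the first is the value
            x, y, z = (float(s[i].split()[0]) for i in range(1, 4))
            xlist.append(x)
            ylist.append(y)
            zlist.append(z)

This collects the coordinates from every line that starts with a digit. Checking that the stripped line is non-empty before indexing it skips blank lines, so no try statement is needed.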
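The snippet above only collects the values; to get one file per column, each list still has to be written out. A possible sketch, where the helper write_column, the file names and the temporary output directory are all hypothetical choices for illustration:

```python
import os
import tempfile


def write_column(path, values):
    """Write one value per line to path."""
    with open(path, "w") as out:
        for v in values:
            out.write("{}\n".format(v))


# stand-in data; in practice these are the lists filled by the loop
xlist, ylist, zlist = [0.0, 0.0], [-132.644, -80.0], [0.0, 0.0]

out_dir = tempfile.mkdtemp()  # hypothetical output location
for name, values in zip(["X", "Y", "Z"], [xlist, ylist, zlist]):
    write_column(os.path.join(out_dir, "file_{}.txt".format(name)), values)
```

Because the values were converted to float, they are written in float form (for example -80.0 rather than -80); keep them as strings if the original text must be preserved.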

Upvotes: 0
