wigging

Reputation: 9190

Python write() function is writing previous data to next file

I am reading text from a text file then reformatting that text to write to different text files.

The text that I am reading is the following, testFile.txt:

                  *******************************
                  *  Void Fractions in the Bed  *
                  *******************************

     Z(m)    MIN.FLUIDIZ.  EMULSION    TOTAL

0.0000E+00  0.4151E+00  0.8233E+00  0.8233E+00
0.1000E-09  0.4151E+00  0.8233E+00  0.8233E+00
0.1000E-05  0.4151E+00  0.8233E+00  0.8233E+00
0.2000E-05  0.4151E+00  0.8233E+00  0.8233E+00
0.1251E+01  0.4151E+00  0.9152E+00  0.9152E+00
0.1301E+01  0.4151E+00  0.9152E+00  0.9152E+00
0.1333E+01  0.4151E+00  0.9152E+00  0.9152E+00


               *************************************
               *  Void Fractions in the Freeboard  *
               *************************************

     Z(m)    VOID FRACTION

0.1333E+01  0.9992E+00
0.1333E+01  0.9992E+00
0.1333E+01  0.9992E+00
0.1333E+01  0.9992E+00
0.3533E+01  0.9992E+00
0.3633E+01  0.9992E+00
0.3733E+01  0.9992E+00
0.3833E+01  0.9992E+00
0.3933E+01  0.9992E+00
0.4000E+01  0.9992E+00


           *********************************************
           *  Superficial Velocities in the Bed (m/s)  *
           *********************************************

     Z(m)    MIN.FLUIDIZ.  ACTUAL

0.0000E+00  0.1235E+00  0.4911E+01
0.1000E-09  0.1235E+00  0.4911E+01
0.1000E-05  0.1235E+00  0.4911E+01
0.2000E-05  0.1235E+00  0.4911E+01
0.3000E-05  0.1235E+00  0.4911E+01
0.1151E+01  0.1235E+00  0.4915E+01
0.1201E+01  0.1235E+00  0.4915E+01
0.1251E+01  0.1235E+00  0.4915E+01
0.1301E+01  0.1235E+00  0.4915E+01
0.1333E+01  0.1235E+00  0.4915E+01

Below is my Python code to parse the text file:

openFile = open('testFile.txt','r')

groupOneFile = open('groupOneFile.csv','w')
groupTwoFile = open('groupTwoFile.csv','w')
groupThreeFile = open('groupThreeFile.csv','w')

idx = 0;
firstIdx = 0;
secondIdx = 0;
thirdIdx = 0;

for line in openFile:

    # first group
    if '*  Void Fractions in the Bed  *' in line:
        print line
        firstIdx = idx

    if idx in range(firstIdx+5,firstIdx+43):
        line = line.lstrip()
        line = line.replace('  ',',')
        groupOneFile.write(line)

    # second group
    if '*  Void Fractions in the Freeboard  *' in line:
        print line
        secondIdx = idx

    if idx in range(secondIdx+5,secondIdx+43):
        line = line.lstrip()
        line = line.replace('  ',',')
        groupTwoFile.write(line)        

    # third group
    if '*  Superficial Velocities in the Bed (m/s)  *' in line:
        print line
        thirdIdx = idx

    if idx in range(thirdIdx+5,thirdIdx+43):
        line = line.lstrip()
        line = line.replace('  ',',')
        groupThreeFile.write(line)

    idx += 1

openFile.close()

groupOneFile.close()
groupTwoFile.close()
groupThreeFile.close()

The groupOneFile should have the following data in it:

0.0000E+00,0.4151E+00,0.8233E+00,0.8233E+00
0.1000E-09,0.4151E+00,0.8233E+00,0.8233E+00
0.1000E-05,0.4151E+00,0.8233E+00,0.8233E+00
0.2000E-05,0.4151E+00,0.8233E+00,0.8233E+00
0.1251E+01,0.4151E+00,0.9152E+00,0.9152E+00
0.1301E+01,0.4151E+00,0.9152E+00,0.9152E+00
0.1333E+01,0.4151E+00,0.9152E+00,0.9152E+00

The groupTwoFile should have the following:

0.1333E+01,0.9992E+00
0.1333E+01,0.9992E+00
0.1333E+01,0.9992E+00
0.1333E+01,0.9992E+00
0.3533E+01,0.9992E+00
0.3633E+01,0.9992E+00
0.3733E+01,0.9992E+00
0.3833E+01,0.9992E+00
0.3933E+01,0.9992E+00
0.4000E+01,0.9992E+00

And so on for the groupThreeFile.

Reading the main text file and writing the data to the other files works. The problem is that the data written to groupOneFile is also written to the beginning of groupTwoFile and groupThreeFile. How can I prevent this from happening?

Upvotes: 1

Views: 173

Answers (3)

6502

Reputation: 114579

To get that working you could just initialize

firstIdx = 1000000
secondIdx = 1000000
thirdIdx = 1000000

because with them set to 0, every line with idx from 5 to 42 is in range for any group whose header has not been found yet.
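To see the effect in isolation, here is a minimal sketch (written for Python 3; the helper name in_window is hypothetical and not part of the original code) that evaluates the question's range test for a start index still at its initial value:

```python
def in_window(idx, start):
    """Mirror the question's test: idx in range(start+5, start+43)."""
    return idx in range(start + 5, start + 43)

# With the start index left at 0, lines 5..42 pass the test even though
# that group's header has not been seen yet.
leaked = [idx for idx in range(60) if in_window(idx, 0)]
print(leaked[0], leaked[-1])   # 5 42

# With a large sentinel, nothing passes until a header sets the real index.
print(any(in_window(idx, 1000000) for idx in range(60)))   # False
```

Those 38 leaked line numbers are the rows that end up at the top of the second and third files.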

Note, however, that the question's code is inefficient. A better approach could be:

idx = 0            # lines seen since the last header
outputFile = None  # no header seen yet, so nothing is written

for line in openFile:
    # A header line resets the counter and selects the output file
    if '*  Void Fractions in the Bed  *' in line:
        idx = 0
        outputFile = groupOneFile
    elif '*  Void Fractions in the Freeboard  *' in line:
        idx = 0
        outputFile = groupTwoFile
    elif '*  Superficial Velocities in the Bed (m/s)  *' in line:
        idx = 0
        outputFile = groupThreeFile

    # Data rows start 5 lines after the header
    if outputFile and 5 <= idx < 43:
        line = line.lstrip()
        line = line.replace('  ',',')
        outputFile.write(line)

    idx = idx + 1

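In Python 2, range(a, b) builds an actual list of all integers from a to b-1, so if x in range(a, b): creates that list and scans it element by element every time the test runs. (In Python 3 a range object can answer the membership test for an integer without scanning, but the comparison is still the more direct way to say it.) It is much better to write the test as if a <= x < b:.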

Note also that 2.5 in range(0, 10) is False (while of course 0 <= 2.5 < 10 is True).
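A short check of both points, as a sketch written for Python 3 (the bounds a and b are just the 5 and 43 used above):

```python
a, b = 5, 43

# For integers the two tests agree, including at both boundaries.
for x in (4, 5, 42, 43):
    assert (x in range(a, b)) == (a <= x < b)

# For a non-integer they differ: a range only contains integers.
print(2.5 in range(0, 10))   # False
print(0 <= 2.5 < 10)         # True
```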

In Python there is no switch statement, but you can build a dispatching table instead:

filemap = [('*  Void Fractions in the Bed  *', groupOneFile),
           ('*  Void Fractions in the Freeboard  *', groupTwoFile),
           ('*  Superficial Velocities in the Bed (m/s)  *', groupThreeFile)]

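idx = 0
outputFile = None
for line in openFile:
    for tag, out in filemap:
        if tag in line:
            # Found a new group header, switch output file
            idx = 0
            outputFile = out
            break
    if outputFile and 5 <= idx < 43:
        # Same reformatting as before: drop the indent, commas between columns
        outputFile.write(line.lstrip().replace('  ', ','))
    idx += 1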

If an exact match is possible (instead of testing with in), this can be made even simpler using a dictionary:

filemap = {'*  Void Fractions in the Bed  *': groupOneFile,
           '*  Void Fractions in the Freeboard  *': groupTwoFile,
           '*  Superficial Velocities in the Bed (m/s)  *': groupThreeFile}

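idx = 0
outputFile = None
for line in openFile:
    f = filemap.get(line.strip())
    if f:
        # Found a new group header, switch output file
        idx = 0
        outputFile = f
    if outputFile and 5 <= idx < 43:
        # Same reformatting as before: drop the indent, commas between columns
        outputFile.write(line.lstrip().replace('  ', ','))
    idx += 1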

Upvotes: 1

John La Rooy

Reputation: 304375

You asked for my suggestion, so here it is:

from itertools import groupby, product

groups = {'*  Void Fractions in the Bed  *': 'groupOneFile.csv',
          '*  Void Fractions in the Freeboard  *': 'groupTwoFile.csv',
          '*  Superficial Velocities in the Bed (m/s)  *': 'groupThreeFile.csv'}

fname = None

with open('testFile.txt','r') as fin:
    for k, group in groupby(fin, lambda x:x[0].isspace()):
        if k:
            for i, g in product(group, groups):
                if g in i:
                    fname = groups[g]
                    break
        else:
            with open(fname, 'w') as fout:
                fout.writelines(','.join(s.split())+'\n' for s in group)
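To illustrate how the groupby call above partitions the input, here is a small sketch written for Python 3. The in-memory list of lines is an assumption standing in for the file, and the names runs and rows are hypothetical.

```python
from itertools import groupby

lines = [
    "   *  Void Fractions in the Freeboard  *\n",
    "\n",
    "     Z(m)    VOID FRACTION\n",
    "\n",
    "0.1333E+01  0.9992E+00\n",
    "0.3533E+01  0.9992E+00\n",
    "\n",
]

# Same key as above: True for lines that start with whitespace (headings,
# column titles, blank lines), False for data rows, which start in column 0.
runs = [(k, list(g)) for k, g in groupby(lines, lambda x: x[0].isspace())]

for is_header, run in runs:
    print(is_header, len(run))   # True 4, then False 2, then True 1

# Each False run is one block of data rows, ready to be joined with commas.
rows = [','.join(s.split()) for s in runs[1][1]]
print(rows)
```

Because consecutive lines with the same key are grouped together, each block of data rows arrives as one group, with the heading that names it in the group just before.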

Upvotes: 1

Brandon Humpert

Reputation: 332

secondIdx and thirdIdx begin at 0, which means that the test if idx in range(secondIdx+5,secondIdx+43): is true for lines close to the top of the file, before the matching heading has been read.

To fix this, you could either rewrite to a more stateful setup (when you read Void Fractions in the Bed, you write to the first file until you find a new heading, etc.) or simply initialize your index variables to -100 or so. Any value of -43 or lower works, because the range then ends before idx 0.
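The stateful setup could be sketched as follows (Python 3). This is an illustration under assumptions rather than a drop-in replacement: the function split_sections, the HEADINGS mapping and the in-memory SAMPLE are hypothetical names, and data rows are assumed to start with a digit, which holds for the sample file but would miss negative values.

```python
import io

SAMPLE = """\
      *******************************
      *  Void Fractions in the Bed  *
      *******************************

     Z(m)    MIN.FLUIDIZ.  EMULSION    TOTAL

0.0000E+00  0.4151E+00  0.8233E+00  0.8233E+00
0.1000E-09  0.4151E+00  0.8233E+00  0.8233E+00


      *************************************
      *  Void Fractions in the Freeboard  *
      *************************************

     Z(m)    VOID FRACTION

0.1333E+01  0.9992E+00
0.1333E+01  0.9992E+00
"""

# Heading title -> group name (hypothetical names for this sketch)
HEADINGS = {
    'Void Fractions in the Bed': 'groupOne',
    'Void Fractions in the Freeboard': 'groupTwo',
}

def split_sections(lines, headings):
    """Collect each section's data rows as CSV strings, keyed by group name."""
    sections = {}
    current = None  # rows of the section being read; None before any heading
    for line in lines:
        text = line.strip()
        # A heading line looks like "*  Title  *"; strip the stars to get the title
        title = text.strip('*').strip()
        if text.startswith('*') and title in headings:
            current = sections.setdefault(headings[title], [])
        elif current is not None and text[:1].isdigit():
            # Assumption: data rows start with a digit; titles and blanks do not
            current.append(','.join(text.split()))
    return sections

sections = split_sections(io.StringIO(SAMPLE), HEADINGS)
print(sections['groupTwo'])
```

Since rows only ever go to the section whose heading was read last, no line offsets are needed and nothing can leak into a later file.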

Upvotes: 0
