Using Continue Statement in nested for loop in Python

Question

Hi, I have a function that reads in data from two files. What I want to happen is that the outer loop reads the first file, and if lum from the first file is greater than the defined value (LC), the loop skips to the next iteration. If not, the code moves to the inner loop, which reads the hlist file and builds the x, y, z lists if idc and id are the same. If idc and id are not the same, it skips to the next iteration of the inner loop. When I print test statements, it seems as if the inner loop is not iterating, and it's not clear to me why. I would appreciate any help.

Code

def read_file(F):      # Function that reads data from a file and extracts specific data columns
    X_pos = []
    Y_pos = []                # Creates data lists
    Z_pos = []
    Idh = []
    Id = []
    LC = float(sys.argv[1]) 
    N = 11#912639            # number of lines to be read
    Nl = 11#896030
    fl = open(Fl)            #opens catalog file
    fl.readline()
    nlines_catalog = islice(fl, Nl)
    f = open(F)           #Opens hlist file
    f.readline()          # Strips Header
    nlines_hlist = islice(f, N) #slices file to only read N lines

    for linel in nlines_catalog:
            if linel != '':
              linel = linel.strip()              
              linel = linel.replace('	', '')
              columnsl = linel.split()
              lum = float(columnsl[1])
              id_catalog = int(columnsl[0])
              if lum >= LC:
                 continue
              print("lum1 =", lum)
              #Id.append(idc)
              print("id_catalog=", id_catalog)
              for line in nlines_hlist:
                 if line != '':
                    line = line.strip()
                    line = line.replace('	', ' ')
                    columns = line.split()
                    id_hlist = int(columns[1])
                    #Idh.append(id)
                    if id_hlist != id_catalog:
                        continue
                    print('idc =', id_catalog, 'id =', id_hlist)
                    x = columns[17]           
                    y = columns[18]
                    z = columns[19]
                    X_pos.append(x)
                    Y_pos.append(y)                #appends data in list
                    Z_pos.append(z)

    print(X_pos)
    X = [float(p) for p in X_pos]        
    Y = [float(p) for p in Y_pos]
    Z = [float(p) for p in Z_pos]
    Xa = numpy.array(X, dtype=float)
    Ya = numpy.array(Y, dtype=float)
    Za = numpy.array(Z, dtype=float)


    return(Xa, Ya, Za)

EDIT

Changes to the inner loop that allow it to reset and work:

if id_catalog == id_halo:
    print('id_catalog =', id_catalog, 'id_halo =', id_halo)
    x = columns[17]             # assigns variable to columns
    y = columns[18]
    z = columns[19]
    #vx = columns[]
    #vy = columns[]
    #vz = columns[]
    X_pos.append(x)
    Y_pos.append(y)                # appends data to list
    Z_pos.append(z)
    break

Edit: I could not reproduce my original output from the print statements, and have edited the question to reflect that.

VirtualBox:~$ python /home/Astrophysics/Count_FixedLoop.py -21.5 125
('lum1 =', -21.78545)
('idc=', 2701276876L)
('idc =', 2701276876L, 'id =', 2701276876L)
('lum1 =', -21.69835)
('idc=', 2699751347L)
('lum1 =', -21.69942)
('idc=', 2699724518L)
('lum1 =', -21.74543)
('idc=', 2699724331L)
('lum1 =', -21.60912)
('idc=', 2699724726L)
('lum1 =', -21.53862)
('idc=', 2699725014L)
('lum1 =', -21.53155)
('idc=', 2701277269L)
['34.57223']

This is what I expect as output

 ('lum1 =', -21.78545)
 ('idc=', 2701276876L)
 ('idc =', 2701276876L, 'id =', 2701276876L)
 ('lum1 =', -21.69835)
 ('idc=', 2699751347L)
 ('idc =', 2699751347L, 'id =', 2699751347L)
 ('lum1 =', -21.69942)
 ('idc=', 2699724518L)
 ('idc =', 2699724518L, 'id =', 2699724518L)
 ('lum1 =', -21.74543)
 ('idc=', 2699724331L)
 ('idc =', 2699724331L, 'id =', 2699724331L)
 ('lum1 =', -21.60912)
 ('idc=', 2699724726L)
 ('idc =', 2699724726L, 'id =', 2699724726L)
 ('lum1 =', -21.53862)
 ('idc=', 2699725014L)
 ('idc =', 2699725014L, 'id =', 2699725014L)
 ('lum1 =', -21.53155)
 ('idc=', 2701277269L)

Edit

A few sample lines from the Fl file, where the idc values are in column 0 and the lum values are in column 1. The idc values that meet the condition in the first loop are in bold.

Format: ID, scatter = 0 0.05 0.1 0.13 0.15 0.16 0.18 0.2 0.25 0.3
**2701276876 -21.78545** -21.73791 -21.68872 -21.11125 -20.88102 -22.04709 -21.41715       -20.56944 -20.36757 -19.69895
**2699751347 -21.69835** -21.67935 -21.92425 -21.03465 -21.56561 -21.42124 -21.72893 -20.78131 -20.76342 -20.34830
**2699724518 -21.69942** -21.58352 -21.71149 -21.16240 -21.18507 -22.00277 -21.81500 -20.36141 -20.78227 -20.65697

Edit

Sample lines from the F file, with the ids and corresponding positions that match the first file:

#Scale(0) Id(1) Desc_scale(2) Descid(3) Num_prog(4) Pid(5) Upid(6) Desc_pid(7)   Phantom(8) Mvir(9) Orig_Mvir(10) Rvir(11) Rs(12) Vrms(13) Mmp(14) Last_mm(15) Vmax(16) X(17) Y(18) Z(19) 
 0.9523 **2701276876** 0.9583 2714557311      1       -1       -1       -1  0     3.56533e+13 3.56100e+13 695.459000 80.562000 548.820000  1 0.3603 561.490000 **34.57223 140.20813 130.81985** -110.000 323.430 -123.520 3.56533e+13 3.56533e+13 561.490000 599.410000 7.539e+14 -3.799e+12 -1.992e+14 0.10259
0.9523 **2699751347** 0.9583 2713034575      4       -1       -1       -1  0 3.36604e+13 3.31300e+13 678.981000 111.199000 500.400000  1 0.8083 514.010000 **28.70439 138.70247 138.52176** -215.310 252.520 -120.970 3.36604e+13 3.36604e+13 514.010000 599.250000 5.516e+14 1.044e+14 6.133e+14 0.10973
0.9523 **2699724518** 0.9583 2713007786      1       -1       -1       -1  0 2.98000e+13 2.97500e+13 654.997000 87.324000 457.460000  1 0.4863 514.660000 **8.01627 135.31783 123.13322** -178.990 558.900 1.250 2.98000e+13 2.98000e+13 514.660000 514.660000 8.529e+14 2.711e+14 -3.624e+14 0.15137

What I am expecting is that when the ids from both files match, the second loop will append to the X, Y, and Z position lists. So printing the lists in this example will give:

X = [34.57223,28.70439,8.01627]
Y = [140.20813,138.70247,135.31783]
Z = [130.81985,138.52176,123.13322]

Keith Flower · Accepted Answer

You seem to be expecting nlines_hlist to behave like a list. It is instead an iterator and, as Ignacio pointed out above, it can be consumed only once. In other words, the inner loop doesn't get "reset" to the first line/index on subsequent outer loop executions.
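The one-shot behaviour can be seen without any files. This is a minimal sketch in Python 3; the in-memory `lines` list is a hypothetical stand-in for an open file, not part of the original code:

```python
from itertools import islice

# Hypothetical stand-in for the lines of an open file.
lines = ["one\n", "two\n", "three\n", "four\n"]

# islice returns a one-shot iterator, not a list.
first_two = islice(lines, 2)

first_pass = [line.strip() for line in first_two]
second_pass = [line.strip() for line in first_two]  # iterator is already exhausted

print(first_pass)   # ['one', 'two']
print(second_pass)  # []

# A list, by contrast, can be looped over any number of times.
as_list = list(islice(lines, 2))
list_pass_1 = [line.strip() for line in as_list]
list_pass_2 = [line.strip() for line in as_list]
print(list_pass_1 == list_pass_2)  # True
```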

Consider the following analog (I think) to what you're doing. Here are two data files:

Data1:

file 1: one
file 1: two
file 1: three
file 1: four
file 1: five

Data2:

file 2: one
file 2: two
file 2: three
file 2: four
file 2: five
file 2: six

Running this:

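from itertools import islice

f1 = open("Data1")
f2 = open("Data2")

# Each islice yields at most 3 lines and is a one-shot iterator
iterator1 = islice(f1, 3)
iterator2 = islice(f2, 3)

for line1 in iterator1:
    print(line1.rstrip())

    # iterator2 is exhausted after the first outer iteration
    for line2 in iterator2:
        print(line2.rstrip())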

results in:

file 1: one
file 2: one
file 2: two
file 2: three
file 1: two
file 1: three

whereas one might erroneously expect that 3 lines of the contents of data2 would be printed for each of the first 3 lines of data1.

So, the first execution of the inner loop fully consumes iterator2. In your own code there is no inner-loop break when id == idc; in other words, you consume the iterator nlines_hlist completely the first time the inner loop executes.

See also "Python: itertools.islice not working in a loop" for another example.

One solution may be to break in the inner loop when id == idc, but this will assume (I think) an ordering of indices in your second file. You could consider actually using a list for the inner loop, although that seems memory-intensive given the size of your actual (non-test) data. You could obviously reread that second file, although performance will take a hit.
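The last two options can be sketched as follows. This is a rough Python 3 illustration, not a drop-in replacement for your function: the `io.StringIO` objects, the made-up ids and positions, and the helper `data_lines` are all hypothetical, and the column layout is simplified to id and x only. It also assumes each id appears at most once in the second file, which is why breaking after a match is safe.

```python
import io
from itertools import islice

# Hypothetical stand-ins for the catalog and hlist files (simplified columns).
catalog = io.StringIO("id lum\n10 -21.7\n20 -21.6\n30 -21.9\n")
hlist = io.StringIO("scale id x\n0.95 30 8.0\n0.95 10 34.5\n0.95 20 28.7\n")

N = 3  # number of data lines to read from each file


def data_lines(handle, n):
    """Rewind the file, skip the header, and return a fresh iterator over n lines."""
    handle.seek(0)
    handle.readline()
    return islice(handle, n)


# Option 1: reread the second file on every outer iteration (slower, low memory).
x_reread = []
for cat_line in data_lines(catalog, N):
    cat_id = int(cat_line.split()[0])
    for h_line in data_lines(hlist, N):  # fresh iterator each time
        cols = h_line.split()
        if int(cols[1]) == cat_id:
            x_reread.append(float(cols[2]))
            break  # the next outer pass starts from the top of the file again

# Option 2: read the second file once into a list and loop over the list.
hlist_rows = [line.split() for line in data_lines(hlist, N)]
x_list = []
for cat_line in data_lines(catalog, N):
    cat_id = int(cat_line.split()[0])
    for cols in hlist_rows:
        if int(cols[1]) == cat_id:
            x_list.append(float(cols[2]))
            break

print(x_reread)  # [34.5, 28.7, 8.0]
print(x_list)    # [34.5, 28.7, 8.0]
```

Note that the rows in the stand-in hlist are deliberately in a different order from the catalog, so neither option depends on the two files being sorted the same way.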
