Hi, I have a function that reads in data from two files. What I want to happen is that the outer loop starts reading the first file, and if lum from the first file is greater than or equal to the defined value (LC), the loop skips to the next iteration. If not, the code moves to the inner loop, which reads in the hlist and builds the x, y, z lists if idc and id (id_catalog and id_hlist in the code) are the same. If idc and id are not the same, it skips to the next iteration of the inner loop. When I print test statements, it seems as if the inner loop is not iterating, and it's not clear to me why. I would appreciate any help.
Code
import sys
from itertools import islice

import numpy

# Function that reads data from a file and extracts specific data columns.
# Fl (the catalog file path) is assumed to be defined at module level.
def read_file(F):
    X_pos = []
    Y_pos = []  # Creates data lists
    Z_pos = []
    Idh = []
    Id = []
    LC = float(sys.argv[1])
    N = 11  #912639  # number of lines to be read
    Nl = 11  #896030
    fl = open(Fl)  # opens catalog file
    fl.readline()
    nlines_catalog = islice(fl, Nl)
    f = open(F)  # opens hlist file
    f.readline()  # strips header
    nlines_hlist = islice(f, N)  # slices file to only read N lines
    for linel in nlines_catalog:
        if linel != '':
            linel = linel.strip()
            linel = linel.replace('\t', ' ')
            columnsl = linel.split()
            lum = float(columnsl[1])
            id_catalog = int(columnsl[0])
            if lum >= LC:
                continue
            print("lum1 =", lum)
            #Id.append(idc)
            print("id_catalog=", id_catalog)
            for line in nlines_hlist:
                if line != '':
                    line = line.strip()
                    line = line.replace('\t', ' ')
                    columns = line.split()
                    id_hlist = int(columns[1])
                    #Idh.append(id)
                    if id_hlist != id_catalog:
                        continue
                    print('idc =', id_catalog, 'id =', id_hlist)
                    x = columns[17]
                    y = columns[18]
                    z = columns[19]
                    X_pos.append(x)
                    Y_pos.append(y)  # appends data to the lists
                    Z_pos.append(z)
    print(X_pos)
    X = [float(p) for p in X_pos]
    Y = [float(p) for p in Y_pos]
    Z = [float(p) for p in Z_pos]
    Xa = numpy.array(X, dtype=float)
    Ya = numpy.array(Y, dtype=float)
    Za = numpy.array(Z, dtype=float)
    return Xa, Ya, Za
EDIT
Changes to the inner loop that allow it to reset; it now works.
if id_catalog == id_halo:
    print('id_catalog =', id_catalog, 'id_halo =', id_halo)
    x = columns[17]  # assigns variable to columns
    y = columns[18]
    z = columns[19]
    #vx = columns[]
    #vy = columns[]
    #vz = columns[]
    X_pos.append(x)
    Y_pos.append(y)  # appends data to the lists
    Z_pos.append(z)
    break
Edit: I could not reproduce my original output from the print statements, and have edited the question to reflect that.
VirtualBox:~$ python /home/Astrophysics/Count_FixedLoop.py -21.5 125
('lum1 =', -21.78545)
('idc=', 2701276876L)
('idc =', 2701276876L, 'id =', 2701276876L)
('lum1 =', -21.69835)
('idc=', 2699751347L)
('lum1 =', -21.69942)
('idc=', 2699724518L)
('lum1 =', -21.74543)
('idc=', 2699724331L)
('lum1 =', -21.60912)
('idc=', 2699724726L)
('lum1 =', -21.53862)
('idc=', 2699725014L)
('lum1 =', -21.53155)
('idc=', 2701277269L)
['34.57223']
This is what I expect as output
('lum1 =', -21.78545)
('idc=', 2701276876L)
('idc =', 2701276876L, 'id =', 2701276876L)
('lum1 =', -21.69835)
('idc=', 2699751347L)
('idc =', 2699751347L, 'id =', 2699751347L)
('lum1 =', -21.69942)
('idc=', 2699724518L)
('idc =', 2699724518L, 'id =', 2699724518L)
('lum1 =', -21.74543)
('idc=', 2699724331L)
('idc =', 2699724331L, 'id =', 2699724331L)
('lum1 =', -21.60912)
('idc=', 2699724726L)
('idc =', 2699724726L, 'id =', 2699724726L)
('lum1 =', -21.53862)
('idc=', 2699725014L)
('idc =', 2699725014L, 'id =', 2699725014L)
('lum1 =', -21.53155)
('idc=', 2701277269L)
Edit
Sample of a few lines of the Fl file where the idc numbers are column[0] and lum values are column[1]. I have put in bold print the idc values that meet the condition in the first loop.
Format: ID, scatter = 0 0.05 0.1 0.13 0.15 0.16 0.18 0.2 0.25 0.3
**2701276876 -21.78545** -21.73791 -21.68872 -21.11125 -20.88102 -22.04709 -21.41715 -20.56944 -20.36757 -19.69895
**2699751347 -21.69835** -21.67935 -21.92425 -21.03465 -21.56561 -21.42124 -21.72893 -20.78131 -20.76342 -20.34830
**2699724518 -21.69942** -21.58352 -21.71149 -21.16240 -21.18507 -22.00277 -21.81500 -20.36141 -20.78227 -20.65697
Edit
Sample lines of the F file with the ids and corresponding positions that match the first file:
#Scale(0) Id(1) Desc_scale(2) Descid(3) Num_prog(4) Pid(5) Upid(6) Desc_pid(7) Phantom(8) Mvir(9) Orig_Mvir(10) Rvir(11) Rs(12) Vrms(13) Mmp(14) Last_mm(15) Vmax(16) X(17) Y(18) Z(19)
0.9523 **2701276876** 0.9583 2714557311 1 -1 -1 -1 0 3.56533e+13 3.56100e+13 695.459000 80.562000 548.820000 1 0.3603 561.490000 **34.57223 140.20813 130.81985** -110.000 323.430 -123.520 3.56533e+13 3.56533e+13 561.490000 599.410000 7.539e+14 -3.799e+12 -1.992e+14 0.10259
0.9523 **2699751347** 0.9583 2713034575 4 -1 -1 -1 0 3.36604e+13 3.31300e+13 678.981000 111.199000 500.400000 1 0.8083 514.010000 **28.70439 138.70247 138.52176** -215.310 252.520 -120.970 3.36604e+13 3.36604e+13 514.010000 599.250000 5.516e+14 1.044e+14 6.133e+14 0.10973
0.9523 **2699724518** 0.9583 2713007786 1 -1 -1 -1 0 2.98000e+13 2.97500e+13 654.997000 87.324000 457.460000 1 0.4863 514.660000 **8.01627 135.31783 123.13322** -178.990 558.900 1.250 2.98000e+13 2.98000e+13 514.660000 514.660000 8.529e+14 2.711e+14 -3.624e+14 0.15137
What I am expecting is that when the ids from both files match, the second loop will append to the X, Y, and Z position lists. So printing the lists in this example will give
X = [34.57223,28.70439,8.01627]
Y = [140.20813,138.70247,135.31783]
Z = [130.81985,138.52176,123.13322]
Upvotes: 2
Views: 3950
Reputation: 4152
You seem to be expecting nlines to behave like a list. However, it is instead an iterator, and as Ignacio pointed out above, it will be consumed once. In other words, the inner loop doesn't get "reset" to the first line/index on subsequent outer loop executions.
Consider the following analog (I think) to what you're doing. Here are two data files:
Data1:
file 1: one
file 1: two
file 1: three
file 1: four
file 1: five
Data2:
file 2: one
file 2: two
file 2: three
file 2: four
file 2: five
file 2: six
Running this:
from itertools import islice

f1 = open("Data1")
f2 = open("Data2")
iterator1 = islice(f1, 3)
iterator2 = islice(f2, 3)
for line1 in iterator1:
    print(line1)
    for line2 in iterator2:
        print(line2)
results in:
file 1: one
file 2: one
file 2: two
file 2: three
file 1: two
file 1: three
whereas one might erroneously expect that 3 lines of the contents of data2 would be printed for each of the first 3 lines of data1.
So, the first execution of the inner loop fully consumes iterator2. In your own code there is no inner loop break when id == idc; in other words, you consume the iterator nlines completely the first time that inner loop executes, so on every later pass of the outer loop the inner loop has nothing left to read.
See also, Python: itertools.islice not working in a loop for another example.
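The one-shot behaviour can also be seen without any files. Here is a minimal sketch, assuming an in-memory list of strings stands in for the file; the names `lines`, `it` and `as_list` are made up for illustration:

```python
from itertools import islice

lines = ["one", "two", "three", "four", "five"]

it = islice(lines, 3)
first_pass = list(it)   # consumes the iterator
second_pass = list(it)  # nothing is left to yield

print(first_pass)   # ['one', 'two', 'three']
print(second_pass)  # []

# A list, by contrast, can be looped over any number of times.
as_list = list(islice(lines, 3))
count = 0
for _ in range(2):
    for item in as_list:
        count += 1
print(count)  # 6
```

The second `list(it)` is empty for the same reason the inner loop in the question does nothing after the first outer iteration.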
One solution may be to break in the inner loop when id == idc, but this will assume (I think) an ordering of indices in your second file. You could consider actually using a list for the inner loop, although that seems memory-intensive given the size of your actual (non-test) data. You could obviously reread that second file, although performance will take a hit.
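A further option, not one of the three above, might be to drop the nested loop altogether: collect the catalog ids that pass the luminosity cut into a set, then stream the hlist once and look each id up. This is only a sketch under assumptions. The helper names `match_positions` and `hlist_row` are hypothetical, the filler columns and the last row of each sample are made up, and the column indices are taken from the question (id and lum in columns 0 and 1 of the catalog; id in column 1 and X, Y, Z in columns 17 to 19 of the hlist).

```python
def match_positions(catalog_lines, hlist_lines, lum_cut):
    """Return X, Y, Z lists for hlist rows whose id passed the catalog cut."""
    # Pass 1: keep catalog ids with lum < lum_cut
    # (mirrors `if lum >= LC: continue` in the question).
    wanted = set()
    for line in catalog_lines:
        cols = line.split()
        if cols and float(cols[1]) < lum_cut:
            wanted.add(int(cols[0]))

    # Pass 2: read the hlist once; set membership replaces the inner loop.
    xs, ys, zs = [], [], []
    for line in hlist_lines:
        cols = line.split()
        if not cols or cols[0].startswith("#"):
            continue  # skip blank lines and the header
        if int(cols[1]) in wanted:
            xs.append(float(cols[17]))
            ys.append(float(cols[18]))
            zs.append(float(cols[19]))
    return xs, ys, zs


def hlist_row(halo_id, x, y, z):
    # Hypothetical helper: builds a 20-column row with zeros as filler.
    return " ".join(["0.9523", str(halo_id)] + ["0"] * 15 + [x, y, z])


catalog = [
    "2701276876 -21.78545",
    "2699751347 -21.69835",
    "2699724518 -21.69942",
    "1 -20.0",  # made-up row that fails the cut
]
hlist = [
    hlist_row(2701276876, "34.57223", "140.20813", "130.81985"),
    hlist_row(2699751347, "28.70439", "138.70247", "138.52176"),
    hlist_row(2699724518, "8.01627", "135.31783", "123.13322"),
    hlist_row(1, "0.0", "0.0", "0.0"),  # made-up row, id fails the cut
]

X, Y, Z = match_positions(catalog, hlist, -21.5)
print(X)
print(Y)
print(Z)
```

Both arguments can be any iterables of lines, including open files or islice objects, since each is read only once. Only the ids that pass the cut are held in memory, but the results come out in hlist order rather than catalog order.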
Upvotes: 2