Reputation: 171
This is my first post, even though I've been reading SO for a while. I'm a Python beginner and I need your help. I'm processing a very big file (more than 2 million lines), but I'll show you a much smaller example (with blocks of 24 lines rather than 74513). So let's say I've got 24 lines, each with one floating-point number, followed by one line with 3 numbers, then another 24 lines, another line with 3 numbers, and so on, 29 times.
56.71739
56.67950
56.65762
56.63320
56.61648
56.60323
56.63215
56.74365
56.98378
57.34681
57.78903
58.27959
58.81514
59.38853
59.98271
60.58515
-1.00000
56.09566
56.05496
56.02777
56.00158
55.98341
55.96830
55.99615
1 1 1
56.34692
56.70977
57.15187
57.64234
58.17782
58.75118
59.34534
59.94779
-1.00000
55.47366
55.42963
55.39739
55.36958
55.35020
55.33404
55.36098
55.47148
55.71110
56.07384
56.51588
57.00632
57.54180
58.11517
58.70937
2 1 1
It's quite easy to create an array with the first 24 lines:
import numpy as np

def ttarray_tms(traveltimes):
    '''It defines the 3-D array, organized as I want.'''
    with open(traveltimes, 'r') as file_in:
        newarray = file_in.readlines()
    ttarray = np.array(newarray)
    ttarray.shape = (2, 3, 4)
    ttarray = np.swapaxes(ttarray, 1, 2)
    ttarray = np.swapaxes(ttarray, 0, 2)
    return ttarray
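To see what the reshape and the two swaps do to the axes, here is a small check on dummy data. `np.arange(24)` is only a stand-in for one block of 24 values, not the real file. On this data the two `swapaxes` calls give the same result as a single `transpose(1, 2, 0)`:

```python
import numpy as np

# Dummy data standing in for one block of 24 values (an assumption for illustration).
values = np.arange(24, dtype=float)

block = values.reshape(2, 3, 4)
swapped = np.swapaxes(np.swapaxes(block, 1, 2), 0, 2)

# The same axis permutation written as one transpose.
transposed = block.transpose(1, 2, 0)

print(swapped.shape)                        # (3, 4, 2)
print(np.array_equal(swapped, transposed))  # True
```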
PLEASE NOTE: There are no blank lines between the numbers. It's a simple column-vector file; for some reason I had to post it like that. What I want is basically to get 29 arrays: loop over the first 24 lines and get an array, then loop over the next 24 lines (skipping the line with 3 numbers, which I don't really need) and get another array, and so on. I think my main problem is how to skip the line with the 3 numbers and start a new loop for a new array.
Do you have any good ideas?
Thanks very much!
Upvotes: 0
Views: 118
Reputation:
You can use readline() to read a single line 24 times, then use another readline() to skip a line, and so on.
With your code:
import numpy as np

def mk_array(elems):
    '''Makes the nparray from a list of 24 numbers'''
    # readline() returns text, so convert each line to a float first
    ttarray = np.array([float(a) for a in elems])
    ttarray.shape = (2, 3, 4)
    ttarray = np.swapaxes(ttarray, 1, 2)
    ttarray = np.swapaxes(ttarray, 0, 2)
    return ttarray

def ttarray_tms(traveltimes):
    '''It defines the 3-D arrays, organized as I want.'''
    arrays = list()
    with open(traveltimes, 'r') as file_in:
        ret = "."  # force the loop
        while ret != "":
            newarray = [file_in.readline() for i in range(24)]
            ret = file_in.readline()  # skip the line with 3 numbers
            if ret != "":  # avoid an empty array
                ttarray = mk_array(newarray)
                arrays.append(ttarray)
    return arrays
Not tested.
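A variant of the same idea, as a minimal sketch: `itertools.islice` takes 24 lines at a time from the file iterator, and `next()` discards the separator line. The names `read_blocks` and `BLOCK_SIZE` are made up for this example, and it assumes every value line holds exactly one number, with no blank lines. The `transpose(1, 2, 0)` call is meant to give the same axis order as the two `swapaxes` calls.

```python
from itertools import islice

import numpy as np

BLOCK_SIZE = 24  # value lines per block, as in the question's small example


def read_blocks(path, block_size=BLOCK_SIZE):
    """Return a list of (3, 4, 2) arrays, one per block of `block_size` values."""
    arrays = []
    with open(path) as file_in:
        while True:
            lines = list(islice(file_in, block_size))
            if len(lines) < block_size:
                break  # end of file, or a truncated last block
            block = np.array([float(line) for line in lines])
            arrays.append(block.reshape(2, 3, 4).transpose(1, 2, 0))
            next(file_in, None)  # skip the separator line, if there is one
    return arrays
```

For the real file, the block size and the target shape would have to be changed together, since the reshape only works when their element counts match.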
Upvotes: 1
Reputation: 1659
The numbers on the three-number lines follow an incrementing pattern. So why not keep track of that pattern by storing the last two numbers in two variables, and if a line's three numbers match the pattern, drop it and continue? It is a kind of sliding-window approach.
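A related but simpler sketch, which is not quite the pattern tracking described above: instead of following the counter values, it treats any line with three whitespace-separated fields as a separator and drops it. The function name `split_on_separators` and the short sample list are made up for illustration.

```python
import numpy as np


def split_on_separators(lines):
    """Group single-value lines into blocks, dropping three-field separator lines."""
    blocks, current = [], []
    for line in lines:
        fields = line.split()
        if len(fields) == 1:
            current.append(float(fields[0]))
        elif len(fields) == 3:
            # Separator such as "1 1 1": close the current block and drop the line.
            if current:
                blocks.append(np.array(current))
            current = []
    if current:  # a trailing block without a closing separator
        blocks.append(np.array(current))
    return blocks


sample = ["56.7", "56.6", "1 1 1", "55.4", "55.3", "2 1 1"]
blocks = split_on_separators(sample)
print(len(blocks))  # 2
```

This does not rely on a fixed block length, so it would also cope with blocks of different sizes, at the cost of inspecting every line.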
Upvotes: 0