Jzl5325

Reputation: 3974

Pythonic way to calculate offsets of an array

I am trying to calculate the origin and offset of variable-size arrays and store them in a dictionary. Below is my current, probably non-Pythonic, approach. I am not sure whether I should use map, a lambda function, or a list comprehension to make the code more Pythonic.

Essentially, I need to cut an array into chunks based on its total size and store xstart, ystart, x_number_of_rows_to_read, and y_number_of_columns_to_read for each chunk in a dictionary. The total size is variable. I cannot load the entire array into memory and use numpy indexing, or I definitely would. The origin and offset are used to read each chunk into numpy.

intervalx = xsize / xsegment #Get the size of the chunks
intervaly = ysize / ysegment #Get the size of the chunks

#Setup to segment the image storing the start values and key into a dictionary.
xstart = 0
ystart = 0
key = 0

d = defaultdict(list)

for y in xrange(0, ysize, intervaly):
    if y + (intervaly * 2) < ysize:
        numberofrows = intervaly
    else:
        numberofrows = ysize - y

    for x in xrange(0, xsize, intervalx):
        if x + (intervalx * 2) < xsize:
            numberofcolumns = intervalx

        else:
            numberofcolumns = xsize - x
        l = [x,y,numberofcolumns, numberofrows]
        d[key].append(l)
        key += 1
return d

I realize that xrange is not ideal for a port to Python 3.

Upvotes: 6

Views: 2859

Answers (4)

mgilson

Reputation: 309959

This code looks fine except for your use of defaultdict. A list seems like a much better data structure because:

  • Your keys are sequential
  • You are storing a list whose only element is another list in your dict.

One thing you could do:

  • use the ternary operator (I'm not sure if this would be an improvement, but it would be fewer lines of code)

Here's a modified version of your code with my few suggestions.

intervalx = xsize / xsegment #Get the size of the chunks
intervaly = ysize / ysegment #Get the size of the chunks

#Set up to segment the image, storing the start values in a list.
xstart = 0
ystart = 0

output = []

for y in xrange(0, ysize, intervaly):
    numberofrows = intervaly if y + (intervaly * 2) < ysize else ysize - y
    for x in xrange(0, xsize, intervalx):
        numberofcolumns = intervalx if x + (intervalx * 2) < xsize else xsize - x
        lst = [x, y, numberofcolumns, numberofrows]
        output.append(lst)

        #If it doesn't make any difference to your program, the above 2 lines could read:
        #tple = (x, y, numberofcolumns, numberofrows)
        #output.append(tple)

        #This will be slightly more efficient 
        #(tuple creation is faster than list creation)
        #and less memory hungry.  In other words, if it doesn't need to be a list due
        #to other constraints (e.g. you append to it later), you should make it a tuple.

Now, to get your data, you can write offset_list = output[5] instead of offset_list = d[5][0].
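
To make the difference in access concrete, here is a small sketch with a single made-up window. The values are illustrative only and are not taken from the question.

```python
from collections import defaultdict

# One hypothetical window: [x, y, numberofcolumns, numberofrows]
window = [0, 0, 3, 3]

# Original approach: each key maps to a list that wraps the window
d = defaultdict(list)
d[0].append(window)

# Suggested approach: a plain list indexed by position
output = []
output.append(window)

# The dict needs an extra [0] to unwrap the inner list
same = d[0][0] == output[0]
```

With the plain list, the position in the list plays the role that the sequential key played in the dict.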

Upvotes: 7

Marco de Wit

Reputation: 2804

This is a long one-liner:

d = [(x, y, min(x + intervalx, xsize) - x, min(y + intervaly, ysize) - y)
     for x in xrange(0, xsize, intervalx)
     for y in xrange(0, ysize, intervaly)]
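
As a rough sketch, the same idea can be wrapped in a function that also runs on Python 3 (range instead of xrange). The function name chunk_windows and the sample sizes are made up for illustration. Because min() clamps each chunk at the array edge, the windows should tile the array without overlapping, which the check at the end confirms for one small case.

```python
def chunk_windows(xsize, ysize, xinterval, yinterval):
    """Return (x, y, ncols, nrows) tuples covering an xsize-by-ysize array."""
    return [
        (x, y, min(x + xinterval, xsize) - x, min(y + yinterval, ysize) - y)
        for x in range(0, xsize, xinterval)
        for y in range(0, ysize, yinterval)
    ]

# Hypothetical 10 x 7 array cut into chunks of at most 4 x 3
windows = chunk_windows(10, 7, 4, 3)

# Count how many windows touch each cell; every cell should be hit once
hits = {}
for x, y, ncols, nrows in windows:
    for cx in range(x, x + ncols):
        for cy in range(y, y + nrows):
            hits[(cx, cy)] = hits.get((cx, cy), 0) + 1

tiles_exactly = len(hits) == 10 * 7 and set(hits.values()) == {1}
```

Note that this iterates x in the outer loop, so the windows come out column by column and not row by row as in the question.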

Upvotes: 0

JoshAdel

Reputation: 68682

Have you considered using np.memmap to load the pieces dynamically instead? You would then only need to determine the offsets on the fly, rather than chunking the array and storing the offsets up front.

http://docs.scipy.org/doc/numpy/reference/generated/numpy.memmap.html
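
A minimal sketch of that idea, assuming the data is a raw binary file with a known dtype and shape (the question does not say how the array is stored). The file name, dtype, and sizes below are invented for the example.

```python
import os
import tempfile

import numpy as np

# Hypothetical on-disk array: 6 rows x 8 columns of float32, row-major
ysize, xsize = 6, 8
path = os.path.join(tempfile.mkdtemp(), "example.dat")
np.arange(ysize * xsize, dtype=np.float32).reshape(ysize, xsize).tofile(path)

# Map the file without reading it all into memory
arr = np.memmap(path, dtype=np.float32, mode="r", shape=(ysize, xsize))

# Pull one chunk on demand: start at (x, y), read nrows x ncols
x, y, ncols, nrows = 4, 2, 4, 3
chunk = np.array(arr[y:y + nrows, x:x + ncols])  # copy just this window
```

Only the sliced window is copied into memory. If the data sits in a format with headers or compression, this approach would not apply directly.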

Upvotes: 0

kamek

Reputation: 2440

Although it doesn't change your algorithm, a more pythonic way to write your if/else statements is:

numberofrows = intervaly if y + intervaly * 2 < ysize else ysize - y

instead of this:

if y + (intervaly * 2) < ysize:
    numberofrows = intervaly
else:
    numberofrows = ysize - y

(and similarly for the other if/else statement).

Upvotes: 0
