BenP

Reputation: 845

Python: slow for loop performance on reading, extracting and writing from a list of thousands of files

I am extracting 150 different cell values from 350,000 ASCII raster files (about 20 KB each). My current code is fine for processing the 150 cell values from a few hundred of the ASCII files, but it is very slow when run on the full data set.

I am still learning Python, so are there any obvious inefficiencies, or any suggestions for improving the code below?

I have tried closing the 'dat' file in the 2nd function; no improvement.

   dat = None

First: I have a function which returns the row and column locations from a cartesian grid.

def world2Pixel(gt, x, y):
  ulX = gt[0]
  ulY = gt[3]
  xDist = gt[1]
  yDist = gt[5]
  rtnX = gt[2]
  rtnY = gt[4]
  pixel = int((x - ulX) / xDist)
  line = int((ulY - y) / xDist)
  return (pixel, line)

Second: a function to which I pass lists of 150 'id', 'x' and 'y' values in a for loop. The first function is called inside it and used to extract the cell value, which is appended to a new list. I also have a list of files, 'asc_list', and the corresponding times in 'date_list'. Please ignore count / enumerate, as I use this later, unless it is impeding efficiency.

def asc2series(id, x, y):
    #count = 1
    ls_id = []
    ls_p = []
    ls_d = []
    for n, (asc,date) in enumerate(zip(asc, date_list)):
        dat = gdal.Open(asc_list)
        gt = dat.GetGeoTransform()
        pixel, line = world2Pixel(gt, east, nort)
        band = dat.GetRasterBand(1)
        #dat = None
        value = band.ReadAsArray(pixel, line, 1, 1)[0, 0]
        ls_id.append(id)
        ls_p.append(value)
        ls_d.append(date)

Many thanks

Upvotes: 0

Views: 540

Answers (1)

hruske

Reputation: 2253

  1. In world2Pixel you set rtnX and rtnY, which you never use.
  2. You probably meant gdal.Open(asc) -- not asc_list.
  3. You could move gt = dat.GetGeoTransform() out of the loop. (Rereading made me realize you can't really.)
  4. You could cache calls to world2Pixel.
  5. You're opening the dat file once for each pixel -- you should probably turn the logic around so that each file is opened only once and all the pixels mapped to that file are looked up together.
  6. Benchmark; check the links in this podcast to see how: http://talkpython.fm/episodes/show/28/making-python-fast-profiling-python-code
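
A rough sketch of points 4 and 5 together, in case it helps. This is not tested against your data: `extract_series`, the `points` list of (id, x, y) tuples and the `open_raster` hook are names I made up for illustration, and I am assuming north-up rasters (so `gt[5]` is negative) that are small enough to read whole. Each file is opened once, its grid is read in a single call, and all 150 points are then sampled from memory.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def world2pixel(gt, x, y):
    """Map world coordinates to (column, row); cached per (gt, x, y)."""
    pixel = int((x - gt[0]) / gt[1])
    line = int((y - gt[3]) / gt[5])  # gt[5] is negative for north-up rasters
    return pixel, line


def extract_series(asc_list, date_list, points, open_raster=None):
    """Open each raster once and sample every point from it.

    points is a list of (id, x, y) tuples. open_raster is a hypothetical
    hook (defaulting to gdal.Open) so the loop can be tried without GDAL.
    """
    if open_raster is None:
        from osgeo import gdal  # imported lazily; only needed for real files
        open_raster = gdal.Open

    rows = []
    for asc, date in zip(asc_list, date_list):
        dat = open_raster(asc)
        gt = dat.GetGeoTransform()
        grid = dat.GetRasterBand(1).ReadAsArray()  # one read per file
        for point_id, x, y in points:
            pixel, line = world2pixel(gt, x, y)
            rows.append((point_id, date, grid[line, pixel]))
        dat = None  # drop the reference so GDAL can close the file
    return rows
```

With 350,000 files and 150 points this does 350,000 opens instead of 150 x 350,000. The cache only helps if the files share the same geotransform, which I am guessing they do. To see where the time actually goes, something like `python -m cProfile -s cumtime your_script.py` (script name is a placeholder) is a quick first check.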

Upvotes: 1
