boris

Reputation: 398

Python - large GIS dataset plotting

I repeatedly process large rasters (300 to 2500 MB) in order to extract CSV data (more than 25,000 lines). Using a regular GIS (ArcGIS, QGIS, GRASS, ...) to plot these is really painful and time-consuming given the size of the data, and I am looking for an efficient method to simply plot these points on the raster.

The code will use:

I wrote the first stages, but the performance is not very good.

I am looking for suggestions or remarks to improve the performance of this kind of script :) (Cython, multiprocessing, even another language if it is worth it). I am really open-minded about the method; I am just not a computer scientist ^^.

Many thanks,

Boris

PS: Here is the code, it is not finished yet but the overview is there:

import numpy as np
from osgeo import gdal
import pyqtgraph as pg

NODATA = -9999.

print("import the bil file")
BilFile = gdal.Open("SRTM90_final_PP.bil")
print("extract data")
band = BilFile.GetRasterBand(1)
data = band.ReadAsArray()
print("Remove no data")

# Bounding box of the valid pixels (assumes at least one pixel is not no-data).
valid = data != NODATA
rows = np.flatnonzero(valid.any(axis=1))  # rows containing valid data
cols = np.flatnonzero(valid.any(axis=0))  # columns containing valid data
ndRowMin, ndRowMax = rows[0], rows[-1] + 1
ndColMin, ndColMax = cols[0], cols[-1] + 1

# Crop both axes in one index; data[a:b][c:d] would slice the rows twice.
dataP = data[ndRowMin:ndRowMax, ndColMin:ndColMax]
print(dataP)

pg.image(dataP)
pg.QtGui.QApplication.exec_()

Upvotes: 0

Views: 645

Answers (1)

Vincent J

Reputation: 5788

Depending on the input file format, GDAL may need to load the whole uncompressed image into memory, which makes it run very slowly.

After several tests on a large map from which I needed to extract small pieces, I ended up converting my images to the uncompressed TIFF format. TIFF appears to be read efficiently by the GDAL libraries (actually libtiff): the library only loads the needed parts of the file from disk into memory.

Try converting your file to TIFF, or tell us more about the input file format you use.

Edit: you answered that the "readarray()" line is the slowest. That confirms that it is the loading of your file that is slow, not the cropping. So your options are to convert the file, or to write your own ENVI BIL DEM file parser that does not load everything into RAM.
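
As a rough illustration of the "own parser" idea, here is a minimal sketch that reads only a rectangular window from disk using the standard library. It assumes a single-band, uncompressed file with no embedded header, stored row by row; the function name `read_bil_window` and its parameters are hypothetical, and `ncols`, `fmt` and `byteorder` have to match what your .hdr file declares.

```python
import struct


def read_bil_window(path, ncols, row0, row1, col0, col1, fmt="h", byteorder="<"):
    """Read rows [row0, row1) and columns [col0, col1) of a single-band raster.

    Assumption: the file is a flat, uncompressed array of ncols values per
    row, with no header bytes. fmt is a struct type code ("h" = 16-bit signed
    integer, "f" = 32-bit float) and byteorder is "<" or ">".
    """
    itemsize = struct.calcsize(byteorder + fmt)
    width = col1 - col0
    row_fmt = byteorder + str(width) + fmt
    window = []
    with open(path, "rb") as f:
        for row in range(row0, row1):
            # Jump straight to the first requested pixel of this row,
            # so pixels outside the window are never read.
            f.seek((row * ncols + col0) * itemsize)
            window.append(list(struct.unpack(row_fmt, f.read(width * itemsize))))
    return window
```

Called with the row and column bounds of the area you want to plot, this returns a list of rows that can be passed to `np.array(...)` before display. Only `(row1 - row0) * (col1 - col0)` values are read, regardless of the size of the file.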

Upvotes: 1
