Reputation: 16483
I have a large 9600x7000 pixel JPG file (25 MB) and I am trying to see if I can run edge detection on it. I tried loading it using:
from PIL import Image
image = Image.open("C:\\pathtofile\\test-tac.jpg")
image.show()
However, the Python interpreter crashes. I am using PyCharm running Python 2.7.
So I used GDAL (normally used for large georeferenced files) to load the file instead. It loads the file into memory without any problem.
#reference http://www.gdal.org/gdal_tutorial.html
import gdal
from gdalconst import *
dataset = gdal.Open("C:\\pathtofile\\test-tac.jpg", GA_ReadOnly )
if dataset is None:
    print "error loading file in gdal"
This loads the file. However, I am trying to run the following edge detection on it:
from matplotlib import pyplot as plt
from skimage import data
from skimage.feature import corner_harris, corner_subpix, corner_peaks
from skimage.transform import warp, AffineTransform
from skimage.draw import ellipse
# running corner Harris on the image object to detect image corners.
#(reference http://scikit-image.org/docs/dev/auto_examples/plot_corner.html)
coords = corner_peaks(corner_harris(image), min_distance=3) #5
coords_subpix = corner_subpix(image, coords, window_size=13)
plt.gray()
plt.imshow(image, interpolation='nearest')
plt.plot(coords[:, 1], coords[:, 0], '.b', markersize=9) # dots
plt.plot(coords_subpix[:, 1], coords_subpix[:, 0], '+r', markersize=15) # +
plt.plot(coords_subpix[:, 1][1], coords_subpix[:, 0][1], '*r', markersize=20) #X_Point1=Subpix[:,1][1], Y_Point1=Subpix[:,0][1]
N=len(coords_subpix[:,0])
labels = ['point{0}'.format(i) for i in range(N)]
#Label corners in image
for label, x, y in zip(labels, coords_subpix[:,1], coords_subpix[:,0]):
    plt.annotate(label,
                 xy=(x,y), xytext = (-10,10),
                 textcoords = 'offset points', ha = 'right', va = 'bottom',
                 bbox = dict(boxstyle = 'round,pad=0.5', fc = 'yellow', alpha = 0.5),
                 arrowprops = dict(arrowstyle = '->', connectionstyle = 'arc3,rad=0'))
plt.axis((0, 9672, 7272, 0)) # (x_min, x_max, y_min, y_max); y runs downward so row 0 is at the top
plt.show()
This works if I generate the image using the following code:
tform = AffineTransform(scale=(1.3, 1.1), rotation=1, shear=0.8,
                        translation=(210, 50))
image = warp(data.checkerboard(), tform.inverse, output_shape=(350, 350))
rr, cc = ellipse(310, 175, 10, 100)
image[rr, cc] = 1
image[180:230, 10:60] = 1
image[230:280, 60:110] = 1
My problem is that I do not understand how the data format of the 'image' variable differs from that of the dataset variable generated by GDAL. My end goal is to run edge detection on a large (10000x7000 pixel) JPG image using the Python scikit-image library. If there is a better way than GDAL to read large JPG images, I am open to it.
If I set:
image=dataset
and run it, I get the following error:
coords = corner_peaks(corner_harris(image), min_distance=3) #5
File "C:\Python27\lib\site-packages\skimage\feature\corner.py", line 171, in corner_harris
Axx, Axy, Ayy = _compute_auto_correlation(image, sigma)
File "C:\Python27\lib\site-packages\skimage\feature\corner.py", line 54, in _compute_auto_correlation
if image.ndim == 3:
AttributeError: 'Dataset' object has no attribute 'ndim'
This error message suggests that I do not understand the difference in data type between the dataset and image variables.
type(dataset)
Gives:
<class 'osgeo.gdal.Dataset'>
and
type(image)
Gives:
(350,350) float64.
For a large source file, you can use http://www.lib.utexas.edu/maps/tpc/txu-pclmaps-oclc-22834566_a-2c.jpg to give it a try.
Upvotes: 0
Views: 2478
Reputation: 7253
All scikit-image algorithms require NumPy arrays as input. You therefore need to convert your dataset variable to an ndarray. The easiest way to do this is to use the gdal plugin to read the file (or look at the plugin source, which shows how to do the conversion).
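As a minimal sketch of that conversion, the helper below turns a GDAL dataset into a 2-D float array of the kind corner_harris accepts. The name dataset_to_gray is hypothetical, and the code assumes 8-bit data and that dataset.ReadAsArray() returns an array shaped (rows, cols) for one band or (bands, rows, cols) for several. Averaging the bands is just one simple way to get a grayscale image, not necessarily what the plugin does.

```python
import numpy as np


def dataset_to_gray(dataset):
    """Return a 2-D float64 array scaled to [0, 1] from a GDAL-style dataset.

    Assumes dataset.ReadAsArray() yields 8-bit data shaped (rows, cols)
    for a single band or (bands, rows, cols) for several bands.
    """
    arr = np.asarray(dataset.ReadAsArray(), dtype=np.float64)
    if arr.ndim == 3:
        # Collapse the band axis with a plain average (an assumption;
        # a weighted luminance formula would work as well).
        arr = arr.mean(axis=0)
    # Scale 8-bit values into the [0, 1] range used by float images.
    return arr / 255.0
```

With GDAL installed, image = dataset_to_gray(dataset) would replace the image=dataset assignment from the question, giving an object that has the ndim attribute the traceback complains about. Note that a float64 copy of an image this size is itself several hundred megabytes.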
Upvotes: 2
Reputation: 13539
import cv2
image = cv2.imread('txu-pclmaps-oclc-22834566_a-2c.jpg')
OpenCV can load the image without a problem, although I suspect loading the image was not the problem. The first step of your algorithm peaked at around 6 GB of memory usage, so if you are not on a 64-bit version of Python it will crash. There also appears to be something wrong with your code: when I tried to run it, it failed at the second function.
ValueError: operands could not be broadcast together with shapes (0,13) (13,13)
This was with the (8003, 10859, 3) image. I also tried with just one channel and got the same error message.
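As a rough sanity check on the memory figure above, the sketch below computes how much space a single dense array of this image's shape occupies. It assumes the working arrays are float64. How many temporaries the corner detection actually allocates is not something I measured, so only the per-array size is computed. The helper name array_gib is hypothetical.

```python
import numpy as np


def array_gib(shape, dtype=np.float64):
    """Size in GiB of one dense array with the given shape and dtype."""
    n_items = 1
    for dim in shape:
        n_items *= dim
    return n_items * np.dtype(dtype).itemsize / float(2 ** 30)


one_channel = array_gib((8003, 10859))      # single float64 plane
all_channels = array_gib((8003, 10859, 3))  # full float64 colour image
print("one float64 channel: %.2f GiB" % one_channel)
print("three float64 channels: %.2f GiB" % all_channels)
```

One float64 plane comes to roughly 0.65 GiB, so only a handful of intermediate arrays of that size are needed to exceed the few gigabytes a 32-bit process can address.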
Upvotes: 0