Reputation: 21914
I'm having the following issue with some Python code that I'm running. It should just be iterating through a list, but it seems to be doing something strange and subtle that I honestly can't figure out.
from skimage.io import imread

def addImageData(self):
    for image in self.images:
        print image.signatureId
    for image in self.images:
        print image.signatureId
        imageNumber = str(image.signatureId).zfill(4)
        filePath = self.imageDirectory + imageNumber + ".jpg"
        image.construct(filePath)

def construct(self, filePath):
    self.imageData = imread(filePath, as_grey=True)
where imread is from skimage.io. The first for loop in addImageData works perfectly, printing a series of numbers ranging from 1 to ~600. The second loop, however, which adds the call to construct, simply prints the number 1 over and over until it hits a memory error. I'm at a loss as to what's causing this. Thoughts?
When using a keyboard interrupt this is the traceback:
  File "rbm.py", line 22, in buildImages
    self.addImageData()
  File "rbm.py", line 41, in addImageData
    image.construct(filePath)
  File "rbm.py", line 61, in construct
    self.imageData = imread(filePath, as_grey=True)
  File "/usr/local/lib/python2.7/dist-packages/scikit_image-0.8.2-py2.7-linux-i686.egg/skimage/io/_io.py", line 142, in imread
    img = rgb2grey(img)
  File "/usr/local/lib/python2.7/dist-packages/scikit_image-0.8.2-py2.7-linux-i686.egg/skimage/color/colorconv.py", line 540, in rgb2gray
    return _convert(gray_from_rgb, rgb[:, :, :3])[..., 0]
  File "/usr/local/lib/python2.7/dist-packages/scikit_image-0.8.2-py2.7-linux-i686.egg/skimage/color/colorconv.py", line 339, in _convert
    out = np.dot(matrix, arr)
Adding all code relevant to self.images below:
class TrainingImages:
    def __init__(self, csvFile = "../train.csv", imageDirectory = "../images/"):
        self.csvFile = csvFile
        self.imageDirectory = imageDirectory
        self.images = []

    def appendCsvLine(self, line):
        '''Assumes the line is from a csv.reader object'''
        signatureId = line[1]
        if len(self.images) <= signatureId:
            newImage = Image(signatureId)
            self.images.append(newImage)
            newImage.append(line)
        else:
            self.images[(signatureId-1)].append(line)

    def buildImages(self):
        with open(self.csvFile, 'rb') as strokeData:
            reader = csv.reader(strokeData, delimiter=",")
            for line in reader:
                self.appendCsvLine(line)
        self.addImageData()
Upvotes: 0
Views: 311
Reputation: 21914
Thanks for all the comments, guys; they were very helpful in figuring this out. It turned out to be a pretty strange error, but I found the source and would like to share it.
In the function appendCsvLine I was comparing a string to an integer. csv.reader always returns each field as a string, regardless of what the entry looks like. My implicit assumption was that if I did something as silly as comparing a string with an integer, Python would raise an exception such as ValueError. That is not the case in Python 2: the comparison is allowed, and in CPython a string always compares greater than an integer, so len(self.images) <= signatureId was always True and a new Image was appended for every CSV line. The fix is to convert the ID to an integer:
def appendCsvLine(self, line):
    '''Assumes the line is from a csv.reader object'''
    signatureId = int(line[1])
    if len(self.images) <= signatureId:
        newImage = Image(signatureId)
        self.images.append(newImage)
        newImage.append(line)
    else:
        self.images[(signatureId-1)].append(line)
This tiny change fixed my code, although the bug was very difficult to track down. The issue is probably best illustrated by the following snippet:
>>> "100" > 99999999999999999999999
True
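For comparison, here is a minimal sketch of the same two facts under Python 3 (an assumption; the code in this post targets Python 2.7). The sample rows and the helper name read_signature_ids are hypothetical and only stand in for the real train.csv. It shows that csv.reader hands back strings and that Python 3 rejects the mixed comparison instead of silently answering it.

```python
import csv
import io

# Hypothetical sample data standing in for train.csv; column 1 holds the signature ID.
SAMPLE_CSV = "0,1,10,20\n1,1,11,21\n2,2,12,22\n"


def read_signature_ids(text):
    """Return the raw (string) and converted (int) values of column 1."""
    reader = csv.reader(io.StringIO(text), delimiter=",")
    raw = [row[1] for row in reader]
    return raw, [int(value) for value in raw]


raw_ids, int_ids = read_signature_ids(SAMPLE_CSV)
print(raw_ids)  # every field comes back as a str
print(int_ids)  # explicit int() conversion gives real integers

# Python 3 refuses the mixed comparison that Python 2 silently allowed.
try:
    "100" > 99999999999999999999999
except TypeError as error:
    print("TypeError:", error)
```

Under Python 3 the original bug would therefore have surfaced immediately as an exception on the first CSV line rather than as a memory error much later.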
As for how I noticed the issue: I first followed EOL's suggestion and added the line
    print [img.signatureId for img in self.images]
to my code. It printed a long list with a huge number of 1s, followed by a huge number of 2s, then a huge number of 3s, and so on.
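When the list of IDs is too long to scan by eye, the same check can be condensed into a count of repeated IDs. This is only a sketch in Python 3; the helper name duplicate_ids and the sample ID lists are hypothetical, not taken from the real data.

```python
from collections import Counter


def duplicate_ids(ids):
    """Return a dict mapping each ID that occurs more than once to its count."""
    counts = Counter(ids)
    return {key: count for key, count in counts.items() if count > 1}


# Hypothetical IDs resembling the buggy state: one Image object per CSV row.
buggy_ids = ["1", "1", "1", "2", "2", "3"]
print(duplicate_ids(buggy_ids))

# After the fix each signature ID should belong to exactly one Image.
fixed_ids = [1, 2, 3]
print(duplicate_ids(fixed_ids))
```

An empty result means every signature ID is held by a single object, which is what addImageData implicitly relies on.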
I then looked at the code where the images were actually constructed and put simple print lines under both the if and the else of the appendCsvLine function. I realized that the program never reached the else branch. From there I tested the condition of the if statement and found that explicitly casting signatureId to an integer resolved the problem. After running some tests in the shell with csv.reader and comparing strings and integers in Python, I understood my mistake.
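One way to make this kind of bug harder to write is to group rows in a dictionary keyed by the integer ID, so no length comparison is needed at all. The sketch below (Python 3) is an alternative design, not the code I actually used; the function name group_rows_by_signature and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def group_rows_by_signature(text):
    """Group CSV rows by the integer value of column 1."""
    groups = defaultdict(list)
    for row in csv.reader(io.StringIO(text), delimiter=","):
        # Explicit cast: csv fields are always strings.
        groups[int(row[1])].append(row)
    return dict(groups)


# Hypothetical rows standing in for train.csv.
sample = "0,1,10,20\n1,1,11,21\n2,2,12,22\n"
groups = group_rows_by_signature(sample)
print(sorted(groups))   # the distinct signature IDs
print(len(groups[1]))   # number of rows collected for ID 1
```

A dictionary also copes with IDs that are missing or arrive out of order, which the len(self.images) check assumes never happens.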
Upvotes: 1