Reputation: 57
Hi, I have a kNN implementation in Python and I am getting the error shown below. The code is given later in the post.
Traceback (most recent call last):
  File "C:\Users\user\Desktop\knn test\knn.py", line 76, in <module>
    main()
  File "C:\Users\user\Desktop\knn test\knn.py", line 63, in main
    print ("Train set: ") + repr(len(trainingSet))
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
I am running Python 3. Can anyone tell me what to edit in the code so that I get the correct output?
import csv
import random
import math
import operator

def loadDataset(filename, split, trainingSet=[] , testSet=[]):
    with open(filename, 'r') as csvfile:
        lines = csv.reader(csvfile)
        dataset = list(lines)
        for x in range(len(dataset)-1):
            for y in range(4):
                dataset[x][y] = float(dataset[x][y])
            if random.random() < split:
                trainingSet.append(dataset[x])
            else:
                testSet.append(dataset[x])

def euclideanDistance(instance1, instance2, length):
    distance = 0
    for x in range(length):
        distance += pow((instance1[x] - instance2[x]), 2)
    return math.sqrt(distance)

def getNeighbors(trainingSet, testInstance, k):
    distances = []
    length = len(testInstance)-1
    for x in range(len(trainingSet)):
        dist = euclideanDistance(testInstance, trainingSet[x], length)
        distances.append((trainingSet[x], dist))
    distances.sort(key=operator.itemgetter(1))
    neighbors = []
    for x in range(k):
        neighbors.append(distances[x][0])
    return neighbors

def getResponse(neighbors):
    classVotes = {}
    for x in range(len(neighbors)):
        response = neighbors[x][-1]
        if response in classVotes:
            classVotes[response] += 1
        else:
            classVotes[response] = 1
    sortedVotes = sorted(classVotes.iteritems(), key=operator.itemgetter(1), reverse=True)
    return sortedVotes[0][0]

def getAccuracy(testSet, predictions):
    correct = 0
    for x in range(len(testSet)):
        if testSet[x][-1] == predictions[x]:
            correct += 1
    return (correct/float(len(testSet))) * 100.0

def main():
    # prepare data
    trainingSet=[]
    testSet=[]
    split = 0.67
    loadDataset('C:/Users/user/Desktop/knn test/text.txt', split, trainingSet, testSet)
    print ("Train set: ") + repr(len(trainingSet))
    print ("Test set: ") + repr(len(testSet))
    # generate predictions
    predictions=[]
    k = 3
    for x in range(len(testSet)):
        neighbors = getNeighbors(trainingSet, testSet[x], k)
        result = getResponse(neighbors)
        predictions.append(result)
        print('> predicted=' + repr(result) + ', actual=' + repr(testSet[x][-1]))
    accuracy = getAccuracy(testSet, predictions)
    print('Accuracy: ' + repr(accuracy) + '%')

main()
Upvotes: 1
Views: 6521
Reputation: 26578
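Your print call is incorrect. In Python 3, print is a function that returns None, so the + after the closing parenthesis tries to add None to a string, which raises the TypeError you are seeing. If you want to concatenate strings for printing, the concatenation has to happen inside the parentheses.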
To take one of your print statements as an example:
print ("Train set: ") + repr(len(trainingSet))
Firstly, you do not need to take the repr of the length of your trainingSet. repr gives the string representation of an object, and len(trainingSet) returns an integer. Technically, you can call repr on it, but there is no need when all you want is to show the length of your structure.
Secondly, the value is not being passed to print properly. What you should do is put len(trainingSet) inside the print call and use string formatting. So, you want this:
print ("Train set: {}".format(len(trainingSet)))
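As a minimal sketch of the formatting approach, assuming a small made-up list in place of the real trainingSet (the sample rows and the names msg_format and msg_fstring are only for illustration), the following variants all print the same line:

```python
# Hypothetical stand-in for the real training data
trainingSet = [[5.1, 3.5, 1.4, 0.2, 'Iris-setosa'],
               [6.2, 2.9, 4.3, 1.3, 'Iris-versicolor']]

# str.format, as suggested above
msg_format = "Train set: {}".format(len(trainingSet))

# f-string, available from Python 3.6 onwards
msg_fstring = f"Train set: {len(trainingSet)}"

print(msg_format)
print(msg_fstring)

# print also accepts several arguments and joins them with a space
print("Train set:", len(trainingSet))
```

In each case the whole message is built inside the parentheses, so nothing is left to be added to the return value of print.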
Upvotes: 1
Reputation: 11615
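Check your print statements: you're attempting to concatenate the result of a print call with a string. In Python 3, print is a function, so the whole expression has to go inside the parentheses.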
Your print statements should be:
print("Train set: " + repr(len(trainingSet)))
print("Test set: " + repr(len(testSet)))
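To see why the original line fails, here is a small sketch that reproduces the error in isolation. The stand-in list and the names result, error, and message are assumptions made for illustration; they are not part of the original script.

```python
trainingSet = [1, 2, 3]  # hypothetical stand-in data

# In Python 3, print() writes its argument and returns None
result = print("Train set: ")

try:
    # Equivalent to the original line: None + str
    result + repr(len(trainingSet))
    error = None
except TypeError as exc:
    error = str(exc)

# Concatenating inside the parentheses builds the full string first
message = "Train set: " + repr(len(trainingSet))
print(message)
```

Here error ends up holding the same TypeError message as in the traceback, while the last print call receives one complete string.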
Upvotes: 1