Reputation: 209
I get the following error:
TypeError Traceback (most recent call last)
C:\Users\levanim\Desktop\Levani Predictive\cosinesimilarity1.py in <module>()
39
40 for i in meowmix_nearest_neighbors.index:
---> 41 top_ten = pd.DataFrame(similarity_matrix.ix[i,]).sort([i],
ascending=False[1:6]).index.values
42 meowmix_nearest_neighbors.ix[i,:] = top_ten
43
TypeError: 'bool' object is not subscriptable
for the following code. I'm new to Python and can't quite put my finger on how I have to change the syntax (if it's a Python 3 syntax problem). Has anybody encountered this? I think it has to do with the ascending=False[1:6] portion, and I have spent some time banging my head against the wall. I'm hoping it's a simple fix, but I don't know enough.
import numpy as np
import pandas as pd
from scipy.spatial.distance import cosine
enrollments = pd.read_csv(r'C:\Users\levanim\Desktop\Levani Predictive\smallsample.csv')
meowmix = enrollments.fillna(0)
meowmix.ix[0:5,0:5]
def getCosine(x,y) :
    cosine = np.sum(x*y) / (np.sqrt(np.sum(x*x)) * np.sqrt(np.sum(y*y)))
    return cosine
print("done creating cosine function")
similarity_matrix = pd.DataFrame(index=meowmix.columns,
columns=meowmix.columns)
similarity_matrix = similarity_matrix.fillna(np.nan)
similarity_matrix.ix[0:5,0:5]
print("done creating a matrix placeholder")
for i in similarity_matrix.columns:
    for j in similarity_matrix.columns:
        similarity_matrix.ix[i,j] = getCosine(meowmix[i].values,
                                              meowmix[j].values)
print("done looping through each column and filling in placeholder with cosine similarities")
meowmix_nearest_neighbors = pd.DataFrame(index=meowmix.columns,
columns=['top_'+str(i+1) for i in
range(5)])
meowmix_nearest_neighbors = meowmix_nearest_neighbors.fillna(np.nan)
print("done creating a nearest neighbor placeholder for each item")
for i in meowmix_nearest_neighbors.index:
    top_ten = pd.DataFrame(similarity_matrix.ix[i,]).sort([i],
                           ascending=False[1:6]).index.values
    meowmix_nearest_neighbors.ix[i,:] = top_ten
print("done creating the top 5 neighbors for each item")
meowmix_nearest_neighbors.head()
Upvotes: 7
Views: 85646
Reputation: 3158
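Yeah, you can't do False[1:6]. False is a boolean, meaning it can only be one of two things (False or True), and a boolean can't be subscripted.
Just change it to ascending=False and the TypeError will go away. If you meant to slice the sorted result, the [1:6] belongs after the closing parenthesis of sort(...).
The [1:6] construct is for working with sequences such as lists. So if you had, for example: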
theList = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l"]
print(theList)       # prints the whole list
print(theList[1])    # "b"
print(theList[1:6])  # ['b', 'c', 'd', 'e', 'f']
In Python, this is called "slicing", and it can be quite useful.
You can also do things like:
print(theList[6:])  # everything in the list after "f"
print(theList[:6])  # everything in the list up to and including "f"
I encourage you to play with this in a Jupyter Notebook and, of course, to read the documentation.
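To see the error itself in isolation, here is a minimal sketch that uses only plain Python. The helper name try_slice is made up for illustration; it just applies [1:6] to whatever it is given and reports a TypeError instead of crashing.

```python
def try_slice(value):
    """Return value[1:6], or the error text if value cannot be sliced."""
    try:
        return value[1:6]
    except TypeError as exc:
        return "TypeError: {}".format(exc)


letters = ["a", "b", "c", "d", "e", "f", "g", "h"]

print(try_slice(letters))     # slicing a list works
print(try_slice("abcdefgh"))  # so does slicing a string
print(try_slice(False))       # a bool cannot be subscripted, so this reports a TypeError
```

The last call fails for the same reason as ascending=False[1:6]: Python tries to slice the value False before it is ever passed to sort().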
Upvotes: 1
Reputation: 14141
Instead of
top_ten = pd.DataFrame(similarity_matrix.ix[i,]).sort([i],
ascending=False[1:6]).index.values
use
top_ten = pd.DataFrame(similarity_matrix.ix[i,]).sort([i],
ascending=False)[1:6].index.values
(i.e. move the ) from after [1:6] to just after the False.)
False is the value of the sort() method's ascending parameter, meaning "not in ascending order", i.e. requesting descending order. So you need to terminate the sort() parameter list with ) right after it.
[1:6] is then a slice of the sorted DataFrame that sort() returns: it keeps the rows at positions 1 through 5 and skips position 0, which is normally the item itself.
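Note that .ix and DataFrame.sort() no longer exist in current pandas releases. A rough sketch of the same nearest-neighbors loop using .loc, sort_values() and .iloc follows; the small 4x4 matrix, the item labels and the name n_neighbors are made up for illustration and are not taken from the question's data.

```python
import pandas as pd

# Hypothetical similarity matrix standing in for similarity_matrix in the question.
items = ["a", "b", "c", "d"]
similarity_matrix = pd.DataFrame(
    [[1.0, 0.9, 0.2, 0.5],
     [0.9, 1.0, 0.4, 0.1],
     [0.2, 0.4, 1.0, 0.7],
     [0.5, 0.1, 0.7, 1.0]],
    index=items, columns=items,
)

n_neighbors = 3  # assumed value; the question keeps 5 neighbors per item
neighbors = pd.DataFrame(
    index=items,
    columns=["top_" + str(k + 1) for k in range(n_neighbors)],
)

for i in neighbors.index:
    # Sort row i from most to least similar.
    ranked = similarity_matrix.loc[i].sort_values(ascending=False)
    # Skip position 0 (the item itself) and keep the next n_neighbors labels.
    neighbors.loc[i, :] = ranked.iloc[1:n_neighbors + 1].index.values

print(neighbors)
```

The slice sits after sort_values(...), so it applies to the sorted result and not to the False argument.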
Upvotes: 5