Reputation: 3460
I have a function:
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def get_similar_row(rows, target):
    """Return the index of the most similar row"""
    return np.argmax(cosine_similarity(rows, [target]))

get_similar_row([[1191, 3, 0, 1, 1],
                 [3251, 2, 1, 0, 0],
                 [1641, 1, 1, 1, 0]], [2133, 3, 0, 0, 1])
Instead of typing the numbers in by hand when calling the function, I want to pass all rows of my data frame df
as the rows parameter, skipping the id column and passing every other column for every row.
id  size  numberOfPlants  balcony  available  publicTransport
 0  1191               3        0          1                1
 1  3251               2        1          0                0
 2  1641               1        1          1                0
 3  2133               3        0          0                1
Upvotes: 2
Views: 453
Reputation: 862501
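Move id into the index with DataFrame.set_index so it is left out of the values, use DataFrame.drop
to remove the target row, then convert to NumPy arrays and pass them to the function: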
# id of the target row
id1 = 3
# make id the index (if it is not already) so it is not treated as a feature
df1 = df.set_index('id')
# select the target row by id
target = df1.loc[id1]
# drop the target row so it is not compared with itself
get_similar_row(df1.drop(id1).to_numpy(), target.to_numpy())
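One thing to watch: np.argmax returns a position within the array that was passed in, that is, within the rows left after the target was dropped. It does not return an id. Here is a minimal end-to-end sketch, assuming cosine_similarity comes from scikit-learn and that the data frame looks like the one in the question. The helper name most_similar_id is made up for illustration; it maps the position back to an id through the remaining index.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity


def get_similar_row(rows, target):
    """Return the position of the row most similar to target."""
    return np.argmax(cosine_similarity(rows, [target]))


def most_similar_id(df, target_id):
    """Hypothetical helper: id of the row most similar to the row with target_id."""
    features = df.set_index('id')        # id becomes the index, not a feature
    target = features.loc[target_id]     # feature values of the target row
    others = features.drop(target_id)    # every row except the target
    position = get_similar_row(others.to_numpy(), target.to_numpy())
    return others.index[position]        # translate the position back to an id


# Sample data copied from the question
df = pd.DataFrame({
    'id': [0, 1, 2, 3],
    'size': [1191, 3251, 1641, 2133],
    'numberOfPlants': [3, 2, 1, 3],
    'balcony': [0, 1, 1, 0],
    'available': [1, 0, 1, 0],
    'publicTransport': [1, 0, 0, 1],
})

print(most_similar_id(df, 3))
```

Because size is so much larger than the other columns, it dominates the cosine similarity. If that is not what you want, scaling the columns first may be worth considering.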
Upvotes: 2