Reputation: 578
I am trying to use scikit-learn's k-nearest-neighbours implementation on a fairly large dataset. The problem is that predictions take a very long time, almost as long as training, which doesn't make sense to me. Is this an issue with the algorithm, or is it that scikit-learn isn't made for large datasets (no GPU support)?
For context, I am trying to predict lidar intensity from x, y, z and an object label. Each lidar scan has ~100,000 points, and I need a predicted intensity for every point.
Upvotes: 3
Views: 6193
Reputation: 11201
Things to try to make scikit-learn's KNeighborsClassifier run faster:
- The algorithm parameter: kd_tree or ball_tree for low dimensional data, brute for high dimensional data.
- The n_jobs parameter. Using a larger n_jobs doesn't necessarily make things faster, sometimes the opposite.
- metric="precomputed"
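A rough way to compare the first two options is to time predict for each combination on your own data. The sketch below is only an illustration: it uses random synthetic data as a stand-in for a lidar scan (the array sizes and n_neighbors=5 are assumptions, not values from the question), and it uses KNeighborsRegressor because intensity is a continuous target; KNeighborsClassifier accepts the same algorithm and n_jobs parameters.

```python
import time

import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for lidar points: columns play the role of x, y, z, label.
# Sizes are kept small so the comparison runs quickly; scale up for real tests.
X_train = rng.random((5_000, 4))
y_train = rng.random(5_000)  # stand-in for intensity
X_test = rng.random((1_000, 4))

timings = {}
for algorithm in ("kd_tree", "ball_tree", "brute"):
    for n_jobs in (1, -1):  # -1 means "use all available cores"
        model = KNeighborsRegressor(
            n_neighbors=5, algorithm=algorithm, n_jobs=n_jobs
        )
        model.fit(X_train, y_train)

        # Time only the prediction step, since that is the slow part here.
        start = time.perf_counter()
        pred = model.predict(X_test)
        timings[(algorithm, n_jobs)] = time.perf_counter() - start

# Print the combinations from fastest to slowest.
for (algorithm, n_jobs), seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"algorithm={algorithm:<9} n_jobs={n_jobs:>2}  predict: {seconds:.4f}s")
```

The ranking depends on the number of points, the number of features and the machine, so it is worth rerunning this on a real scan before settling on a configuration.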
Upvotes: 2