Reputation: 21
I am trying to use the KNN package to impute the missing values in my dataframe. The columns have different ranges, i.e. some of them hold much larger values than others.
My understanding is that the KNN algorithm uses the Euclidean distance to determine the nearest neighbors. My question is whether I should normalize the data before feeding it to the algorithm, or whether it does so by default.
Upvotes: -1
Views: 1492
Reputation: 682
You can see in the code of the fancyimpute.knn.KNN class that it takes an attribute normalizer, which can be set to any object with fit() and transform() methods. By default it is set to None, so you'll have to explicitly create a normalizer and feed it to the KNN class object.
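To see why this matters, here is a small pure-Python sketch (no fancyimpute involved) of how an unscaled column can dominate the Euclidean distance and change which row counts as the nearest neighbor. The ColumnStandardizer class, the helper functions and the age/income data are all made up for illustration. The class only mimics the fit()/transform() shape mentioned above and does not handle missing values, so treat it as a demonstration rather than something to pass to the KNN class as-is.

```python
import math


class ColumnStandardizer:
    """Hypothetical normalizer with fit() and transform() (illustration only)."""

    def fit(self, rows):
        # Learn the mean and standard deviation of every column.
        self.means, self.stds = [], []
        for col in zip(*rows):
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col)
            self.means.append(mean)
            self.stds.append(math.sqrt(var) or 1.0)  # avoid dividing by zero
        return self

    def transform(self, rows):
        # Rescale each column to zero mean and unit variance.
        return [
            [(v - m) / s for v, m, s in zip(row, self.means, self.stds)]
            for row in rows
        ]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(rows, i):
    """Index of the row closest to rows[i], excluding row i itself."""
    others = (j for j in range(len(rows)) if j != i)
    return min(others, key=lambda j: euclidean(rows[i], rows[j]))


# Made-up data: column 0 is age in years, column 1 is income in dollars.
raw = [
    [25, 50000],
    [60, 50500],
    [26, 58000],
    [45, 90000],
    [33, 20000],
]

scaled = ColumnStandardizer().fit(raw).transform(raw)

print(nearest(raw, 0))     # 1: income dominates, the 35-year age gap is ignored
print(nearest(scaled, 0))  # 2: after scaling, the similar age counts as well
```

On the raw data the income column decides the neighbor almost by itself. After standardizing, both columns contribute on a comparable scale, which is the behavior you probably want when the neighbors are used for imputation.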
Upvotes: 0