Reputation: 189
I know k-NN is a classification scheme that, for each point you want to classify, finds the k nearest neighbors under some distance metric and uses a majority vote among them to classify the point.
Is there a similar regression algorithm for the case where there's only one class, class A? You have a dataset of some (not all) of the points in the feature space that are in class A. To compute the probability that a new point in the feature space is in class A, you look at the density of class A points within some distance of that point.
Upvotes: 0
Views: 61
Reputation: 326
I think what you're asking for is probability density estimation. You want to use your observations of the points in class A to empirically construct a probability density function and then use that PDF to predict the probability that a new point is in class A.
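As a rough illustration, the neighborhood-counting idea from the question can be written down directly as a density estimate: count the class A points inside a ball around the query and divide by the number of points times the ball's volume. This is only a sketch; the function name `ball_density`, the `radius` parameter, and the synthetic data are assumptions made for illustration and are not part of the question.

```python
import math
import numpy as np

def ball_density(points, query, radius):
    """Estimate the density at `query` as the fraction of points that fall
    within `radius` of it, divided by the volume of that ball."""
    points = np.asarray(points, dtype=float)
    query = np.asarray(query, dtype=float)
    n, d = points.shape
    # Euclidean distance from the query to every observed point.
    dists = np.linalg.norm(points - query, axis=1)
    count = np.count_nonzero(dists <= radius)
    # Volume of a d-dimensional ball of the given radius.
    volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1) * radius ** d
    return count / (n * volume)

# Hypothetical class A sample: 500 points drawn around the origin.
rng = np.random.default_rng(0)
class_a = rng.normal(size=(500, 2))
near = ball_density(class_a, [0.0, 0.0], radius=0.5)
far = ball_density(class_a, [4.0, 4.0], radius=0.5)
```

The result is a density, not a probability of class membership, so it can exceed 1. It also depends heavily on the chosen radius.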
There are many ways to implement this, and the simplest might be to model your data as a Gaussian or a mixture of Gaussians. For a single Gaussian, you just compute the mean and variance (the covariance matrix, in more than one dimension) of your dataset to parameterize the model. This assumes that a Gaussian is a good fit for your data. If it is not, there are more flexible approaches, such as building a histogram or fitting a different family of distributions.
However, if you're only interested in deciding whether or not a point belongs to a class, then estimating the density might be overkill. You could use any of the numerous classification algorithms out there (random forests, SVMs, logistic regression, neural networks, etc.). These won't tell you anything about the underlying distribution of the data; they will just give you a classifier.
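A minimal sketch of the single-Gaussian case, assuming NumPy and a made-up dataset: estimate the mean and covariance from the class A points, then evaluate the fitted density at a new point. The helper names `fit_gaussian` and `gaussian_pdf` are hypothetical and chosen only for this example.

```python
import numpy as np

def fit_gaussian(points):
    """Return the sample mean and covariance of the rows of `points`."""
    points = np.asarray(points, dtype=float)
    mean = points.mean(axis=0)
    # atleast_2d keeps the covariance a matrix even for 1-D features.
    cov = np.atleast_2d(np.cov(points, rowvar=False))
    return mean, cov

def gaussian_pdf(x, mean, cov):
    """Evaluate the multivariate normal density with the given parameters at x."""
    diff = np.asarray(x, dtype=float) - mean
    d = mean.shape[0]
    # Squared Mahalanobis distance, computed without inverting cov explicitly.
    maha = diff @ np.linalg.solve(cov, diff)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * maha) / norm)

# Hypothetical class A observations centered at (2, -1).
rng = np.random.default_rng(1)
observed = rng.normal(loc=[2.0, -1.0], scale=1.0, size=(1000, 2))
mean, cov = fit_gaussian(observed)
typical = gaussian_pdf([2.0, -1.0], mean, cov)
unusual = gaussian_pdf([8.0, 5.0], mean, cov)
```

Points near the bulk of the data get a much higher density than distant ones. Turning that density into a yes/no decision still requires choosing a threshold, which this sketch leaves open.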
Upvotes: 1