Reputation: 23
I'm running a linear discriminant analysis with 2 variables and 2 groups in R, i.e.,
library(MASS)
ldares <- lda(dat[,2:3], grouping=dat[,1])
Next, I would like to obtain the formula for the decision boundary that separates the groups. I know that I can output the coefficients of the linear discriminant with:
coef(ldares)
However, given that the decision boundary is described by:
a*v1 + b*v2 + c = 0,
how do I get the bias or threshold weight c?
Upvotes: 0
Views: 2610
Reputation: 4166
You should realize that LDA is a linear combination of centered variables. So, the discriminant function is really:
\sum_i w_i * (x_i - mean(x_i)) > 0
and therefore:
\sum_i w_i * x_i > \sum_i w_i * mean(x_i)
The threshold is therefore \sum_i w_i * mean(x_i). Unfortunately, LDA doesn't report mean(x) over the entire dataset, only the two group means. But this allows us to compute the threshold in a rather intuitive way.
Assuming that result is your fitted LDA object, the threshold is midway between the discriminant's responses to the centroids of the two classes:
sum( result$scaling * result$means[2,] + result$scaling * result$means[1,] )/2
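To see why the midpoint works, here is the same step written out as a sketch, assuming equal priors so that the centering point is the average of the two group means. The symbols $\mu_1$, $\mu_2$ (group means) and $w$ (coefficient vector) are notation introduced here, not names from the lda output:

```latex
\begin{aligned}
w^\top\!\left(x - \tfrac{1}{2}(\mu_1 + \mu_2)\right) &> 0 \\
w^\top x &> \tfrac{1}{2}\left(w^\top \mu_1 + w^\top \mu_2\right)
\end{aligned}
```

The right-hand side is the average of the two projected centroids, which is what the R expression above evaluates.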
P.S. Note that in the notation of the original question, a*v1 + b*v2 + c = 0, the threshold is -c.
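As a rough check of the above, the following sketch computes the threshold on a two-species subset of iris and compares a sign-based rule against predict(). The iris subset, the equal priors, and the variable names (two, X, w, threshold, c0, manual, agree) are assumptions made for illustration. Because the sign of the LD1 coefficients is arbitrary, the sketch looks up which group lies on the positive side rather than assuming it.

```r
library(MASS)

# Illustrative data: two iris species, two predictors (an assumption, not the asker's dat)
two <- droplevels(iris[iris$Species %in% c("setosa", "versicolor"), ])
X <- as.matrix(two[, c("Sepal.Width", "Petal.Length")])

# Equal priors are assumed; the midpoint rule is only exact in that case
result <- lda(X, grouping = two$Species, prior = c(0.5, 0.5))

w <- result$scaling[, 1]                       # a and b in a*v1 + b*v2 + c = 0
threshold <- sum(w * colMeans(result$means))   # midpoint of the projected centroids
c0 <- -threshold                               # the bias term c

# The sign of LD1 is arbitrary, so find which group projects higher
proj <- drop(result$means %*% w)
hi <- names(proj)[which.max(proj)]
lo <- names(proj)[which.min(proj)]

# Classify by the sign of w.x + c and compare with predict()
score <- drop(X %*% w) + c0
manual <- ifelse(score > 0, hi, lo)
agree <- all(manual == as.character(predict(result, newdata = X)$class))
```

If agree is TRUE, the line w[1]*v1 + w[2]*v2 + c0 = 0 reproduces the classifier's boundary on this data. With unequal priors the boundary shifts away from the midpoint by a term depending on the log ratio of the priors, so this sketch would need adjusting.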
Upvotes: 0
Reputation: 263301
When no prior weights are given, I believe you will discover that c=0 and that the discriminant scores are based on priors set by the distribution of the cases. You can see that scores constructed with an implicit c=0 assumption produce the expected split in prediction with the iris dataset:
library(MASS)
# Keep only the two species being compared
two <- droplevels(iris[iris$Species %in% c("setosa", "versicolor"), ])
ldares <- lda(two[, 2:3], grouping = two$Species)
# Raw (uncentered) scores: predictors times the LD1 coefficients
scores <- as.matrix(two[, 2:3]) %*% coef(ldares)
# Colour each point by the sign of its score
plot(two$Sepal.Width, two$Petal.Length, col = c("black", "red")[1 + (scores > 0)])
Upvotes: 2