Reputation: 33
I am trying to analyze data that shows whether or not people catch a disease; that is, the response is binary. I applied logistic regression. Assume the logistic regression output looks like this:
ID = c(1,2,3,4)
Test_Data = c(0,1,1,0)
Log.Reg_Output = c(0.01,0.4,0.8,0.49)
result = data.frame(ID, Test_Data, Log.Reg_Output)
result
# 1 | 0 | 0.01
# 2 | 1 | 0.4
# 3 | 1 | 0.8
# 4 | 0 | 0.49
Can I say that person who has ID=3 will catch the disease at 80 percent? Is this the right approach? If not, why? I am very confused, so any help would be great!
Second question is how can I calculate accuracy rate except rounding the model result 0 or 1. Because rounding 0.49 to 0 is not so meaningful I think. In my example, the model output becomes 0, 0, 1, 0 instead of 0.01, 0.4, 0.8, 0.49, depending on whether each value is greater or less than 0.5, and the accuracy rate is 75%. Is there any other way to calculate it?
Thanks!
Upvotes: 3
Views: 133
Reputation: 60338
Can I say that person who has ID=3 will catch the disease at 80 percent?
It is unclear what you mean by "at"; the conventional interpretation of the logistic regression output here is that the model estimates an 80% probability that person #3 will catch the disease. It is also unclear what you mean by "alternative" in your title (you don't elaborate in the question body).
how can I calculate accuracy rate except rounding the model result 0 or 1.
Accuracy by definition requires rounding the model results to 0/1. But, at least in principle, the decision threshold need not necessarily be 0.5...
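To make the threshold point concrete, here is a minimal sketch in R using the four values from the question. The helper name accuracy_at and the thresholds tried are illustrative assumptions, not part of the original post.

```r
# Values copied from the question's example
truth <- c(0, 1, 1, 0)              # observed outcomes (Test_Data)
prob  <- c(0.01, 0.4, 0.8, 0.49)    # logistic regression outputs

# Hypothetical helper: accuracy after thresholding the probabilities
accuracy_at <- function(threshold, truth, prob) {
  pred <- as.integer(prob > threshold)  # 1 if above the threshold, else 0
  mean(pred == truth)                   # fraction of correct predictions
}

# Accuracy for a few candidate thresholds (chosen arbitrarily here)
thresholds <- c(0.3, 0.45, 0.5)
acc <- sapply(thresholds, accuracy_at, truth = truth, prob = prob)
print(data.frame(threshold = thresholds, accuracy = acc))
```

With these four points, moving the threshold changes which cases are misclassified, but some hard 0/1 decision is always made before accuracy is computed.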
Because rounding 0.49 to 0 is not so meaningful I think.
Do you think rounding 0.49 to 1 is more meaningful? Because this is the only alternative choice in a binary classification setting (a person either will catch the disease, or not).
Regarding the log loss metric, mentioned in the comments: its role is completely different from that of accuracy. You may find these relevant answers of mine helpful:
Loss & accuracy - Are these reasonable learning curves?
How does Keras evaluate the accuracy? (despite the misleading title, it has nothing particular to do with Keras).
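As a rough illustration of how log loss differs from accuracy, the sketch below computes binary log loss (the mean negative log-likelihood) on the question's four values. It uses the probabilities directly, with no rounding. The function name log_loss is an assumption made for this example, and the code assumes no probability is exactly 0 or 1.

```r
# Values copied from the question's example
truth <- c(0, 1, 1, 0)              # observed outcomes (Test_Data)
prob  <- c(0.01, 0.4, 0.8, 0.49)    # logistic regression outputs

# Hypothetical helper: binary log loss (mean negative log-likelihood).
# Assumes every probability lies strictly between 0 and 1.
log_loss <- function(truth, prob) {
  -mean(truth * log(prob) + (1 - truth) * log(1 - prob))
}

ll <- log_loss(truth, prob)
print(ll)
```

Unlike accuracy, this quantity penalizes the 0.49 assigned to a true 0 much more than the 0.01 assigned to the other true 0, even though both are rounded to the same class at a 0.5 threshold.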
I seriously suggest you have a look at some logistic regression tutorials (there are literally hundreds out there); a highly recommended source is the textbook An Introduction to Statistical Learning (with Applications in R), made freely available by the authors...
Upvotes: 1