Mel

Reputation: 33

logistic regression alternative interpretation

I am trying to analyze data that shows whether or not people catch a disease; that is, the response is binary. I applied logistic regression. Assume the output of the logistic regression looks like this:

ID = c(1,2,3,4)
Test_Data = c(0,1,1,0)                  # observed outcome (1 = caught the disease)
Log.Reg_Output = c(0.01,0.4,0.8,0.49)   # logistic regression output
result = data.frame(ID, Test_Data, Log.Reg_Output)

result

# 1   | 0 |  0.01  
# 2   | 1 |  0.4    
# 3   | 1 |  0.8    
# 4   | 0 |  0.49   

Can I say that person who has ID=3 will catch the disease at 80 percent? Is this the right approach? If not, why? I am so confused, any help would be great!

My second question is how can I calculate accuracy rate except rounding the model result 0 or 1. Because rounding 0.49 to 0 is not so meaningful I think. In my example, the model output becomes 0, 0, 1, 0 instead of 0.01, 0.4, 0.8, 0.49, depending on whether each value is greater or less than 0.5, and the accuracy rate is 75%. Is there any other calculation method?

Thanks!

Upvotes: 3

Views: 133

Answers (1)

desertnaut

Reputation: 60338

Can I say that person who has ID=3 will catch the disease at 80 percent?

It is unclear what you mean by "at"; the conventional interpretation of the logistic regression output here is that the model estimates that person #3 will catch the disease, with 80% confidence (that is, an estimated probability of 0.8). It is also unclear what you mean by "alternative" in your title (you don't elaborate in the question body).

how can I calculate accuracy rate except rounding the model result 0 or 1.

Accuracy by definition requires rounding the model results to 0/1. But, at least in principle, the decision threshold need not be 0.5...
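As a rough illustration of the threshold point, here is a minimal sketch using the four values from the question. The helper name `accuracy_at` and the alternative threshold of 0.45 are assumptions made for this example only; they are not part of any package or of the original post.

```r
# Data from the question: observed outcomes and logistic regression outputs
test_data      <- c(0, 1, 1, 0)
log_reg_output <- c(0.01, 0.4, 0.8, 0.49)

# Hypothetical helper: accuracy for a given decision threshold.
# Outputs strictly above the threshold are classified as 1, the rest as 0.
accuracy_at <- function(probs, labels, threshold = 0.5) {
  predicted <- as.integer(probs > threshold)
  mean(predicted == labels)
}

acc_default <- accuracy_at(log_reg_output, test_data)        # 0.75
acc_lower   <- accuracy_at(log_reg_output, test_data, 0.45)  # 0.5

print(acc_default)
print(acc_lower)
```

With the default threshold the predictions are 0, 0, 1, 0, which reproduces the 75% from the question; lowering the threshold to 0.45 flips person #4 to 1 and the accuracy drops to 50%. Either way, a hard 0/1 decision is still being made.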

Because rounding 0.49 to 0 is not so meaningful I think.

Do you think rounding 0.49 to 1 is more meaningful? Because this is the only alternative choice in a binary classification setting (a person either will catch the disease, or not).

Regarding the log loss metric, mentioned in the comments: its role is completely different from that of accuracy. You may find these relevant answers of mine helpful:

Loss & accuracy - Are these reasonable learning curves?

How does Keras evaluate the accuracy? (despite the misleading title, it has nothing particular to do with Keras).
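To make the contrast with accuracy a bit more concrete, here is a minimal sketch of the usual binary log loss (mean negative log-likelihood) computed on the four values from the question. The function name `log_loss` is an assumption for this example, not a function from a particular package.

```r
# Data from the question: observed outcomes and logistic regression outputs
test_data      <- c(0, 1, 1, 0)
log_reg_output <- c(0.01, 0.4, 0.8, 0.49)

# Hypothetical helper: binary log loss, i.e. the mean negative log-likelihood
# of the observed labels under the predicted probabilities.
# Assumes every probability lies strictly between 0 and 1.
log_loss <- function(probs, labels) {
  -mean(labels * log(probs) + (1 - labels) * log(1 - probs))
}

ll <- log_loss(log_reg_output, test_data)
print(ll)  # approximately 0.456
```

Unlike accuracy, this uses the outputs directly, with no thresholding: person #4 (output 0.49, true label 0) is penalized far more than person #1 (output 0.01, true label 0), even though both are counted as correct at a threshold of 0.5.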

I seriously suggest you have a look at some logistic regression tutorials (there are literally hundreds out there); a highly recommended source is the textbook An Introduction to Statistical Learning (with Applications in R), made freely available by the authors...

Upvotes: 1
