Reputation: 445
The code for the loss function in scikit-learn logistic regression is:
# Logistic loss is the negative of the log of the logistic function.
out = -np.sum(sample_weight * log_logistic(yz)) + .5 * alpha * np.dot(w, w)
However, it seems to be different from the common form of the logarithmic loss function, which reads:
-[y log(p) + (1-y) log(1-p)]
(please see http://wiki.fast.ai/index.php/Log_Loss)
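For reference, here is the common form as I understand it, written as a small sketch (the variable names y_true and p_pred are just mine, not from the scikit-learn source):

import numpy as np

# Textbook binary log loss with labels in {0, 1} and predicted
# probabilities p_pred for the positive class.
def log_loss(y_true, p_pred):
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    return -np.sum(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))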
Could anyone tell me how to understand the code for the loss function in scikit-learn logistic regression, and how it relates to the general form of the logarithmic loss function?
Thank you in advance.
Upvotes: 1
Views: 1729
Reputation: 1613
First you should note that 0.5 * alpha * np.dot(w, w)
is just the L2 regularization term. So, the scikit-learn logistic regression loss reduces to the following
-np.sum(sample_weight * log_logistic(yz))
Also, the np.sum
is there because the loss is summed over multiple samples, so for a single sample it again reduces to
sample_weight * log_logistic(yz)
Finally, if you read HERE, you will note that sample_weight is an optional array of weights assigned to individual samples; if not provided, each sample is given unit weight. So it can be taken as one (the usual definition of the cross-entropy loss does not weight samples differently), and the per-sample loss reduces to:
- log_logistic(yz)
which is equivalent to
- log_logistic(y * np.dot(X, w))
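To make that concrete, here is a minimal sketch of the per-sample loss for labels y in {-1, +1}. The log_logistic below is my own stand-in, not scikit-learn's exact helper: it computes log(sigmoid(t)) stably as -log(1 + exp(-t)).

import numpy as np

# Stand-in for the log_logistic helper (an assumption, not the
# library's exact implementation): log(sigmoid(t)) = -log(1 + exp(-t)).
def log_logistic(t):
    return -np.logaddexp(0, -t)

# Per-sample loss for labels y in {-1, +1} and linear scores z = X @ w.
def per_sample_loss(y, X, w):
    yz = y * np.dot(X, w)
    return -log_logistic(yz)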
Now, why does it look different (while in essence being the same) from the cross-entropy loss function, i.e.:
-[y log(p) + (1-y) log(1-p)]?
The reason is that binary classification can use either of two labeling conventions, {0, 1}
or {-1, 1}
, which results in the two different-looking representations. But they are the same loss!
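A quick numerical check (a sketch with made-up data; the variable names are mine, not scikit-learn's) shows both conventions produce identical per-sample losses:

import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=5)               # linear scores, i.e. np.dot(X, w)
y_pm = rng.choice([-1, 1], size=5)   # labels in {-1, +1}
y_01 = (y_pm + 1) // 2               # the same labels mapped to {0, 1}
p = 1.0 / (1.0 + np.exp(-z))         # predicted probability of the positive class

loss_pm1 = np.logaddexp(0, -y_pm * z)                       # -log_logistic(y * z)
loss_01 = -(y_01 * np.log(p) + (1 - y_01) * np.log(1 - p))  # -[y log p + (1-y) log(1-p)]

print(np.allclose(loss_pm1, loss_01))  # True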
More details (on why they are the same) can be found HERE. Note that you should read the response by Manuel Morales.
Upvotes: 2