Chhaganlaal

Reputation: 113

Obtaining a wrong error curve for logistic regression (bug in code)

I recently started machine learning and wrote this code, but for some reason I am getting a zig-zag error curve instead of a smoothly decreasing, roughly logarithmic one. For now, form_binary_classes does nothing more than take the start and end indices of two similar datasets with different labels. The error function returns the error in every iteration (most probably this is where the bug is), get_acc returns the accuracy, and gradient_descent returns the trained weights and bias term. I am looking only for the bug, not for a more efficient method.

import numpy as np
import matplotlib.pyplot as plt

def hypothesis(x, theta, b):
    h = np.dot(x, theta) + b
    return sigmoid(h)

def sigmoid(z):
    return 1.0/(1.0+np.exp(-1.0*z))

def error(y_true, x, w, b):
    m = x.shape[0]
    err = 0.0
    for i in range(m):
        hx = hypothesis(x[i], w, b)
        if(hx==0):
            err += (1-y_true[i])*np.log2(1-hx)
        elif(hx==1):
            err += y_true[i]*np.log2(hx)
        else:
            err += y_true[i]*np.log2(hx) + (1-y_true[i])*np.log2(1-hx)
    return -err/m

def get_gradient(y_true, x, w, b):
    grad_w = np.zeros(w.shape)
    grad_b = 0.0
    m = x.shape[0]
    for i in range(m):
        hx = hypothesis(x[i], w, b)
        grad_w += (y_true[i] - hx)*x[i]
        grad_b += (y_true[i] - hx)

    grad_w /= m
    grad_b /= m
    return [grad_w, grad_b]

def gradient_descent(y_true, x, w, b, learning_rate=0.1):
    err = error(y_true, x, w, b)
    grad_w, grad_b = get_gradient(y_true, x, w, b)
    w = w + learning_rate*grad_w
    b = b + learning_rate*grad_b
    return err, w, b

def predict(x,w,b):   
    confidence = hypothesis(x,w,b)
    if confidence<0.5:
        return 0
    else:
        return 1

def get_acc(x_tst,y_tst,w,b):

    y_pred = []

    for i in range(y_tst.shape[0]):
        p = predict(x_tst[i],w,b)
        y_pred.append(p)

    y_pred = np.array(y_pred)

    return  float((y_pred==y_tst).sum())/y_tst.shape[0]

def form_binary_classes(a_start, a_end, b_start, b_end):
    x = np.vstack((X[a_start:a_end], X[b_start:b_end]))
    y = np.hstack((Y[a_start:a_end], Y[b_start:b_end]))
    print("{} {}".format(x.shape,y.shape[0]))
    loss = []
    acc = []
    w = 2*np.random.random((x.shape[1],))
    b = 5*np.random.random()
    for i in range(100):
        l, w, b = gradient_descent(y, x, w, b, learning_rate=0.5)       
        acc.append(get_acc(X_test,Y_test,w))
        loss.append(l)
    plt.plot(loss)
    plt.ylabel("Negative of Log Likelihood")
    plt.xlabel("Time")
    plt.show()

What the error plot looks like: [plot of the loss oscillating in a zig-zag pattern]

What it SHOULD look like: [plot of the loss decreasing smoothly]

Upvotes: 0

Views: 134

Answers (1)

alift

Reputation: 1940

You have an issue in how you compute the error, and that can very well be why your model is not converging.

In your code, in the corner cases hx==0 or hx==1, the error you compute is always zero, even when the prediction is wrong, e.g. hx==0 while y_true==1.

In that case we enter the first if, and the error term is (1-1)*log2(1) = 0, which is not correct.
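
As a quick check (a hypothetical snippet, just evaluating the branch the original error() takes in this case):

import numpy as np

# Saturated but wrong prediction: hx == 0 while the true label is 1.
hx, y_i = 0.0, 1
err_term = (1 - y_i) * np.log2(1 - hx)  # the branch taken when hx == 0
print(err_term)  # 0.0 -- a maximally wrong prediction adds no penalty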

You can solve this issue by modifying your first two ifs in this way:

def error(y_true, x, w, b):
    m = x.shape[0]
    err = 0.0
    for i in range(m):
        hx = hypothesis(x[i], w, b)
        if(hx==y_true[i]): # Corner cases where the error is exactly zero
            err += 0
        elif((hx==1 and y_true[i]==0) or (hx==0 and y_true[i]==1)): # Corner cases where we would take log2 of zero
            err += np.iinfo(np.int32).min # an approximation of log2(0): penalize the model with the largest error possible
        else:
            err += y_true[i]*np.log2(hx) + (1-y_true[i])*np.log2(1-hx)
    return -err/m

In this part of the code, I assumed your labels are binary (0 or 1).
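
If you would rather avoid the exact-0/1 corner cases entirely, a common alternative (just a sketch, not part of the fix above, assuming the hypothesis() defined in the question and binary 0/1 labels) is to clip the predictions away from 0 and 1 before taking the log:

def error_clipped(y_true, x, w, b, eps=1e-12):
    # Vectorized cross-entropy; clipping keeps the argument of log2 strictly inside (0, 1).
    hx = np.clip(hypothesis(x, w, b), eps, 1 - eps)
    err = y_true * np.log2(hx) + (1 - y_true) * np.log2(1 - hx)
    return -np.mean(err)

This also removes the need for the explicit corner-case branches, since hx can never be exactly 0 or 1.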

Upvotes: 1
