I have been studying the book "Understanding Machine Learning" by Shai Shalev-Shwartz and Shai Ben-David and got stuck at the section about halfspaces.
The image is a page taken directly from the book (p. 91).
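For reference, in case the image does not load, eq. (9.1) on that page states the condition: for all i in [m], y_i * <w, x_i> >= 1.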
I have trouble understanding how eq. (9.1) was derived, specifically why the expression is greater than or equal to 1. I understand that y_i * <w, x_i> > 0, because y_i and the dot product must have the same sign, but why is it greater than or equal to 1?
Also, I tried to implement this linear programming problem on a toy example, and it failed.
My code:
import numpy as np
from scipy.optimize import linprog

# Create a training set: two Gaussian clusters in R^2
x1 = np.random.randn(2, 40) * 0.3 + 3
x2 = np.random.randn(2, 40) * 0.3 + 1
y1 = np.ones(40)
y2 = -1 * np.ones(40)
X = np.concatenate((x1, x2), axis=1)
# Append a row of ones so the bias is absorbed into w
ones_row = np.ones((1, X.shape[1]))
X = np.vstack((X, ones_row))
Y = np.concatenate((y1, y2))

d = 3   # dimension after adding the bias coordinate
m = 80  # number of samples in the training set
v = np.ones(m)        # v is an m-vector of ones
A = np.zeros((m, d))  # A is m x d, with rows y_i * x_i
for i in range(m):
    A[i] = Y[i] * X[:, i]

c = np.zeros(d)  # dummy objective: we only need a feasible point
opt = linprog(c=c, A_ub=-1 * A, b_ub=-1 * v)
opt.x
array([36.60696741, 21.42292254, 53.65797818])
The result is not a line that separates the two clusters. I multiplied the matrix A and the vector v by -1 because scipy's linprog expects constraints of the form A_ub @ x <= b_ub, while the book's program requires A w >= v. Could you please help me locate the errors I made?
I also noticed a slightly different initialization of the matrix A here. On that site, the matrix A is almost identical, but the last entry of each row has the opposite sign from the matrix A in the book. What is the right way to initialize A?
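(My guess is that the sign difference comes from how the bias is encoded. With a predictor of the form sign(<w, x> + b), each sample is augmented with +1, so the last entry of row i of A is y_i * 1 = y_i; with sign(<w, x> - b), the augmented coordinate is -1 and the last entry becomes -y_i. Both conventions should describe the same class of halfspaces, but I would like to confirm this.)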
Update: the problem was solved after setting the bounds for the LP problem.
opt = linprog(c=c, A_ub=-1*A, b_ub=-1*v, bounds=(-np.inf, np.inf))
The scipy.optimize.linprog documentation explains the default bounds: "By default, bounds are (0, None) (all decision variables are non-negative)."
The problem arose because some of the variables in w had to be negative, which the default bounds forbid.
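For completeness, here is a minimal end-to-end sketch: a rewrite of the code from the question with the bounds fix applied, a fixed seed added for reproducibility (my addition, not in the original), and a final check that the learned w satisfies y_i * <w, x_i> >= 1 on every sample.

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)  # fixed seed for reproducibility

# Two Gaussian clusters in R^2, labelled +1 and -1
x1 = rng.standard_normal((2, 40)) * 0.3 + 3
x2 = rng.standard_normal((2, 40)) * 0.3 + 1
X = np.concatenate((x1, x2), axis=1)
X = np.vstack((X, np.ones((1, X.shape[1]))))  # append bias coordinate
Y = np.concatenate((np.ones(40), -np.ones(40)))

d, m = X.shape
A = (Y * X).T    # row i is y_i * x_i, shape (m, d)
v = np.ones(m)
c = np.zeros(d)  # feasibility problem, so a constant objective suffices

# Book formulation: find w with A @ w >= v. linprog wants
# A_ub @ w <= b_ub, hence the sign flip, and bounds=(None, None)
# lets the coordinates of w go negative.
opt = linprog(c=c, A_ub=-A, b_ub=-v, bounds=(None, None))
w = opt.x

# Verify the constraint y_i * <w, x_i> >= 1 for every sample
print((A @ w >= 1 - 1e-9).all())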