Reputation: 181
I have run the following code for matrices of size 100 million rows and 100 columns.
I am looking to optimize the following code. Any help would be really appreciated.
def matrix_factorization(R, P, Q, K, steps=5000, alpha=0.0002, beta=0.02):
Q = Q.T
for step in range(steps):
e = 0
for i in range(len(R)):
for j in range(len(R[i])):
if R[i][j] > 0:
temp_dot = numpy.dot(P[i,:],Q[:,j])
eij = R[i][j] - temp_dot
e = e + eij*eij
for k in range(K):
P[i][k] = P[i][k] + alpha * (2 * eij * Q[k][j] - beta * P[i][k])
Q[k][j] = Q[k][j] + alpha * (2 * eij * P[i][k] - beta * Q[k][j])
e = e + (beta/2) * ((P[i][k]*P[i][k]) + (Q[k][j]*Q[k][j]))
if e < 0.001:
break
return P, Q.T
Upvotes: 0
Views: 1118
Reputation: 97681
Your inner loop:
for k in range(K):
P[i][k] = P[i][k] + alpha * (2 * eij * Q[k][j] - beta * P[i][k])
Q[k][j] = Q[k][j] + alpha * (2 * eij * P[i][k] - beta * Q[k][j])
e = e + (beta/2) * ((P[i][k]*P[i][k]) + (Q[k][j]*Q[k][j]))
Can be vectorized as:
P[i,:] += alpha * np.sum(2 * eij * Q[:,j] - beta * P[i,:], axis=-1)
Q[:,j] += alpha * np.sum(2 * eij * P[i,:] - beta * Q[:,j], axis=-1)
e += (beta/2) * np.sum(P[i,:]*P[i,:] + Q[:,j]*Q[:,j])
Upvotes: 3