How can I reduce the memory footprint of this numpy gradient descent code?

Question

I'm new to numpy and still can't understand how it handles memory. I know my code is not optimal in terms of memory but couldn't really think of other ways without understanding how the memory is handled. Here's the code:

# X.shape = (700, 23000)
# Y.shape = (700, 23000)
learning_rate=0.7
iters=100

W = np.random.random((Y.shape[0], X.shape[0])) - 0.5
num_examples = Y.shape[1]

for i in range(int(iters)):
    error = W.dot(X) - Y
    xerror = X.dot(error.T)
    d = (1.0 / num_training_examples * xerror).T
    W = W - (learning_rate * d)

The code works when the number of training examples is small (~1000), but when it's as big as 20K the memory explodes. Any help to optimise the memory footprint of this code is very much appreciated.

How can I reduce the memory footprint of this numpy gradient descent code?

Answers (1)

Related Questions