Reputation: 5475
I'm new to NumPy and still don't understand how it handles memory. I know my code isn't optimal in terms of memory, but I couldn't think of another approach without understanding how the memory is handled. Here's the code:
import numpy as np

# X.shape = (700, 23000)
# Y.shape = (700, 23000)
learning_rate = 0.7
iters = 100
W = np.random.random((Y.shape[0], X.shape[0])) - 0.5   # (700, 700)
num_examples = Y.shape[1]
for i in range(int(iters)):
    error = W.dot(X) - Y          # (700, 23000)
    xerror = X.dot(error.T)       # (700, 700)
    d = (1.0 / num_examples * xerror).T
    W = W - (learning_rate * d)
The code works when the number of training examples is small (~1000), but when it's around 20K the memory explodes. Any help optimising the memory footprint of this code is much appreciated.
Upvotes: 1
Views: 206
Reputation: 28342
It depends on what you mean by the memory "explodes". NumPy uses float64 (double) by default for matrices and vectors, which means 8 bytes per number. A 20k x 20k problem is therefore 400M numbers, or 3.2 GB. If that's what you mean, then it's just a problem of scale (your problem is too big), and you need to find a different way to represent it if that much memory is more than you can afford.
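For reference, a quick back-of-the-envelope check of that figure, assuming the default float64 dtype (the 20k x 20k size is the hypothetical one above, not the poster's actual shapes):

rows, cols = 20_000, 20_000
bytes_per_float64 = 8
print(rows * cols * bytes_per_float64 / 1e9, "GB")   # -> 3.2 GB for a single dense matrix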
Based on your comment that X and Y are matrices: if your problem size is only around 20k, you can get some savings by handling X and Y one column at a time (see the sketch below). At 23k columns that reduces your memory footprint to about 1/3. If you scale up to 46k columns it will be up to 2/3 of the original (only a reduction of 1/3), and by 92k columns you'll only reduce to 16/18 (about 0.89).
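A minimal sketch of that idea against the shapes in the question; the chunk size and the chunked accumulation of X.dot(error.T) are my own illustration, not code from either post:

import numpy as np

def train_chunked(X, Y, learning_rate=0.7, iters=100, chunk=1000):
    # Same update as the original loop, but the full (700, 23000) error
    # matrix is never materialized: X.dot(error.T) is accumulated over
    # column chunks of X and Y. Tune chunk to your memory budget.
    num_examples = Y.shape[1]
    W = np.random.random((Y.shape[0], X.shape[0])) - 0.5
    for _ in range(iters):
        xerror = np.zeros((X.shape[0], Y.shape[0]))
        for start in range(0, num_examples, chunk):
            stop = start + chunk
            err = W.dot(X[:, start:stop]) - Y[:, start:stop]   # only a chunk of error
            xerror += X[:, start:stop].dot(err.T)
        W = W - learning_rate * (xerror / num_examples).T
    return W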
Generally, when you move to large problems, you start dealing with things like iterative algorithms and sparse matrices to reduce the memory load. See, for instance, conjugate gradient.
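As a rough sketch of that direction (this is a made-up sparse, symmetric positive-definite system, not the poster's regression problem), SciPy's conjugate gradient solver only ever needs matrix-vector products, so the matrix never has to be stored densely:

import numpy as np
from scipy.sparse import identity, random as sparse_random
from scipy.sparse.linalg import cg

n = 20_000
M = sparse_random(n, n, density=1e-4, format="csr", random_state=0)
A = (M @ M.T + identity(n)).tocsr()   # symmetric positive definite by construction
b = np.ones(n)
x, info = cg(A, b)                    # info == 0 means CG converged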
Upvotes: 1