Reputation: 1561
I made a CNN in Keras with the TensorFlow backend. My training set has 144 examples, but each example has a size of 3200*101. My CNN is very basic, just for learning, with a batch_size of 2 (I tried reducing it from 32, but nothing improves): one convolutional layer, one flatten layer and one dense layer for the output (11 classes). When I fit the model, my laptop shows "Allocation of (a big number) exceeds 10% of system memory" and then freezes, without even finishing 1 epoch. I can't "compress" the examples; each of them must have exactly that size. I am running the model on my CPU (I don't have a GPU), with 8 GB of RAM and a 1 TB disk. What can I do?
P.S.: Sorry for any bad English, I am still learning. And thanks for any answer!
Update: just adding more information.
My training set has shape (144, 3400, 101, 1) for the examples and (144, 11) for the labels. My model looks like this:
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(8, kernel_size=6, activation='linear', input_shape=(3400, 101, 1), batch_size=2))
model.add(Flatten())
model.add(Dense(11, activation='softmax'))
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=100)
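For context, here is a rough back-of-the-envelope estimate (my own numbers, assuming float32 values at 4 bytes each and the default 'valid' padding) of where the memory goes:

# Rough estimate of the model's memory footprint; approximate, not taken from the TF log.
conv_out = (3400 - 6 + 1) * (101 - 6 + 1) * 8   # 3395 * 96 * 8 = 2,607,360 values per example after Conv2D
dense_params = conv_out * 11 + 11               # roughly 28.7 million weights in the Dense layer
print(dense_params * 4 / 1024**2, "MiB just for the Dense layer's weights")
x_train_values = 144 * 3400 * 101 * 1
print(x_train_values * 4 / 1024**2, "MiB to hold X_train as float32")

So the Flatten/Dense pair alone accounts for over a hundred megabytes of weights, plus roughly the same again for their gradients during training.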
Upvotes: 2
Views: 14452
Reputation: 11
sudo swapoff /swapfile            # disable the current swap file
sudo rm /swapfile                 # remove it
sudo fallocate -l 32G /swapfile   # allocate a new 32 GB swap file
sudo chmod 600 /swapfile          # restrict its permissions
sudo mkswap /swapfile             # format it as swap
sudo swapon /swapfile             # enable it
swapon -s                         # verify the swap space is active
The steps above create 32 GB of swap space. Then run your deep learning code; mine works well.
My laptop spec is an HP 430 G2 with 4 GB RAM + 500 GB SSD.
Upvotes: -1
Reputation: 782
This looks like the same error I am getting as well, when using a very long input encoding in a plain feed-forward network in Keras. I had been using word embeddings without any issue, but now that I am adding extra features to the input I get the same error as you. You need to allow your script to use more memory. What worked for me on Kubernetes was to increase the memory limit in the YAML file of my pod:
spec:
  containers:
    - name: yourname
      image: yourimage
      command: yourcommand
      args: yourargs
      resources:
        limits:
          nvidia.com/gpu: 1 # requesting 1 GPU
          memory: 100Gi
It was originally 8G and it worked before I introduced the additional features.
If you don't use Docker and Kubernetes, you can do this in your TensorFlow session instead with:
config.gpu_options.allow_growth = True
In Keras, I think that would be:
import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto()
config.gpu_options.allow_growth = True   # allocate GPU memory on demand instead of all at once
config.log_device_placement = True       # log which device each op runs on
session = tf.Session(config=config)
K.set_session(session)                   # make Keras use this session
# do your ML task
K.get_session().close()
It might be that reducing the batch_size to 1 would solve the issue.
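In the model from the question that would mean dropping the batch_size argument from the Conv2D layer and passing it to fit() instead; just a sketch:

# Smaller batches mean smaller per-step allocations (at the cost of slower training).
model.fit(X_train, y_train, epochs=100, batch_size=1)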
Normally this error is just a warning, and even if the job freezes, it might finish if you leave it running. If the job is killed though (like mine was), then you definitely need to give it more memory, and a GPU server might be a better idea than your laptop. You can also make sure that you are using float32 and not float64, because the latter uses twice the memory. Also, as far as I know this error normally appears with the Adam optimizer, so the fact that you are using SGD means that the problem is probably not in your optimization process.
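A quick sketch of that cast (assuming X_train is a NumPy array):

import numpy as np

# float64 uses 8 bytes per value, float32 only 4, so this halves the memory needed for the inputs
X_train = X_train.astype(np.float32)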
Upvotes: 0