Daniel Möller

Reputation: 86600

Can I train a model in steps in Keras?

I've got a model in Keras that I need to train, but this model invariably blows up my little 8GB memory and freezes my computer.

I've gone all the way down to training on one single sample (batch size = 1) and it still blows up.

Please assume my model has no mistakes or bugs and this question is not about "what is wrong with my model". (Yes, smaller models work ok with the same data, but aren't good enough for the task).

How can I split my model in two and train each part separately, while still propagating the gradients between them?

Is this possible? (Either Theano or TensorFlow is fine.)

Using CPU only, no GPU.

Upvotes: 2

Views: 1082

Answers (1)

Him

Reputation: 5551

You can do this, but it will stretch your training time so far that the results will only be useful to future generations.

Let's consider what we have in memory when we train with a batch size of 1 (assuming you've read only that one sample into memory):

1) that sample

2) the weights of your model

3) the activations of each layer (your model stores these for backpropagation)
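To get a feel for which of these three items dominates, a back-of-the-envelope tally can help. This is only a rough sketch: the function name and its parameters are made up for illustration, it assumes 4-byte (float32) values, and it ignores gradients and optimizer state, which a real framework also keeps around.

```python
def training_memory_bytes(sample_values, weight_count, activation_counts,
                          bytes_per_value=4):
    """Rough estimate of the three items above, in bytes.

    sample_values     -- number of values in the one input sample
    weight_count      -- total number of weights in the model
    activation_counts -- number of output values per layer, one entry per layer
    bytes_per_value   -- 4 for float32 (an assumption, adjust as needed)
    """
    sample = sample_values * bytes_per_value
    weights = weight_count * bytes_per_value
    activations = sum(activation_counts) * bytes_per_value
    return {
        "sample": sample,
        "weights": weights,
        "activations": activations,
        "total": sample + weights + activations,
    }
```

Plugging in your own layer sizes should tell you whether the weights or the stored activations are what is eating the 8GB.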

All of this is needed for training. However, you could, theoretically, do a forward pass on the first half of the model, dump its weights and activations to disk, load the second half of the model, do a forward pass and then the backward pass on that, dump those weights (plus the gradient with respect to the second half's input) to disk, load back the weights and activations of the first half, then complete the backward pass on that. This process could be split up even further, down to one layer at a time.
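To make the staging above concrete, here is a minimal sketch in plain Python rather than Keras, so nothing about the Keras API is assumed. The "model" is a toy: the first half is `h = tanh(W1 x)`, the second half is a single linear output with squared-error loss. All names (`train_step`, `w1.pkl`, and so on) are hypothetical, and pickle files stand in for whatever on-disk format you would really use. The point is only the order of operations: at no time are both halves in memory together, and the gradient with respect to `h` is what connects them.

```python
import math
import os
import pickle
import tempfile


def save(path, obj):
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load(path):
    with open(path, "rb") as f:
        return pickle.load(f)


def first_half_forward(W1, x):
    # h = tanh(W1 @ x), written with plain lists
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]


def second_half_step(W2, h, target, lr):
    # y = W2 . h, loss = 0.5 * (y - target)^2
    y = sum(w * a for w, a in zip(W2, h))
    err = y - target
    grad_W2 = [err * a for a in h]
    grad_h = [err * w for w in W2]  # gradient handed back to the first half
    new_W2 = [w - lr * g for w, g in zip(W2, grad_W2)]
    return new_W2, grad_h, 0.5 * err * err


def first_half_backward(W1, x, h, grad_h, lr):
    # d tanh(z)/dz = 1 - tanh(z)^2 = 1 - h^2
    grad_z = [g * (1.0 - a * a) for g, a in zip(grad_h, h)]
    return [[w - lr * gz * xi for w, xi in zip(row, x)]
            for row, gz in zip(W1, grad_z)]


def train_step(workdir, x, target, lr=0.1):
    def p(name):
        return os.path.join(workdir, name)

    # Stage 1: only the first half is loaded; forward pass, dump activations.
    W1 = load(p("w1.pkl"))
    save(p("h.pkl"), first_half_forward(W1, x))
    del W1

    # Stage 2: only the second half is loaded; forward + backward pass.
    W2, h = load(p("w2.pkl")), load(p("h.pkl"))
    W2, grad_h, loss = second_half_step(W2, h, target, lr)
    save(p("w2.pkl"), W2)
    save(p("grad_h.pkl"), grad_h)
    del W2, h, grad_h

    # Stage 3: reload the first half and finish the backward pass.
    W1, h, grad_h = load(p("w1.pkl")), load(p("h.pkl")), load(p("grad_h.pkl"))
    save(p("w1.pkl"), first_half_backward(W1, x, h, grad_h, lr))
    return loss


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        save(os.path.join(workdir, "w1.pkl"), [[0.1, 0.2], [-0.3, 0.4]])
        save(os.path.join(workdir, "w2.pkl"), [0.5, -0.5])
        for step in range(5):
            print(step, train_step(workdir, x=[1.0, -0.5], target=0.7))
```

Every training step pays for several disk round trips, which is exactly where the training time goes.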

OTOH, this is akin to what swap space already does, without you having to think about it. If you want a slightly less optimized version of this (and optimization is clearly moot at this point), you can just increase your swap space to 500GB and call it a day.

Upvotes: 2
