Reputation: 651
The question is simple, but I don't know how to implement it in practice. I would like to train a TensorFlow LSTM model on a dataset that is incredibly large (50 million records). I am able to load the data file on a local machine, but the machine crashes during the pre-processing stage due to limited memory. I have tried deleting unused objects with del and running garbage collection to free memory, but it does not help.
Is there any way to train a TensorFlow model in separate passes? For example, the model would be trained 5 times, each pass using only 10 million records, and after each pass those 10 million records would be deleted to free RAM, until all 5 chunks have been used. Something like the sketch below is what I have in mind.
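This is a rough, untested sketch of the chunked training loop I mean; the file name big_dataset.csv, the label column, and the sequence shape are placeholders for my real data.

```python
import gc

import pandas as pd
import tensorflow as tf

CHUNK_SIZE = 10_000_000  # 10 million records per pass -> 5 passes over 50M rows
TIMESTEPS = 10           # placeholder: 10 feature columns treated as a sequence

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(TIMESTEPS, 1)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Read the file 10 million rows at a time, fit on each chunk, then free it,
# so only one chunk is ever held in RAM.
for chunk in pd.read_csv("big_dataset.csv", chunksize=CHUNK_SIZE):
    x = chunk.drop(columns=["label"]).to_numpy(dtype="float32")
    x = x.reshape((len(x), TIMESTEPS, 1))   # assumes TIMESTEPS feature columns
    y = chunk["label"].to_numpy(dtype="float32")
    model.fit(x, y, epochs=1, batch_size=1024)
    del chunk, x, y
    gc.collect()                             # free the chunk before loading the next
```

My understanding is that repeated calls to model.fit continue from the current weights, so over the 5 chunks the model would see all 50 million records, but I don't know whether this is the right way to do it.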
Thanks
Upvotes: 0
Views: 521
Reputation: 547
There are a few ways to avoid this problem:
1- You can use Google Colab with a high-RAM runtime, or rent a VM in the cloud.
2- Use the three basic software techniques for handling data that does not fit in memory: compression, chunking, and indexing. Chunking is the most direct fit for your case; see the sketch below.
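As an illustration of chunking, a tf.data input pipeline can stream the file from disk in batches so the full 50 million rows never sit in RAM at once. This is only a sketch, assuming a CSV file with numeric feature columns and a numeric column named label; the file name, batch size, and model layers are placeholders, not your actual setup.

```python
import tensorflow as tf

# Stream the CSV from disk in batches instead of loading it all into memory.
# "big_dataset.csv" and the "label" column are placeholder names.
dataset = tf.data.experimental.make_csv_dataset(
    "big_dataset.csv",
    batch_size=1024,
    label_name="label",
    num_epochs=1,
    shuffle_buffer_size=10_000,  # shuffle within a bounded buffer, not the whole file
)

def to_sequence(features, label):
    # Stack the per-column tensors into (batch, timesteps, 1) for the LSTM.
    x = tf.stack(list(features.values()), axis=1)
    return tf.expand_dims(x, -1), label

dataset = dataset.map(to_sequence).prefetch(tf.data.AUTOTUNE)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Keras pulls batches from the dataset as it trains, one batch at a time.
model.fit(dataset, epochs=1)
```

With this approach the memory footprint is set by the batch size and the shuffle buffer, not by the size of the file.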
Upvotes: 2