Reputation: 19957
I'm building a text classification model and have a large sparse TF-IDF matrix of shape (81062, 100000).
The input_fn function is defined as:
# Define the input function for training
input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'tfidf': X_train_tfidf.todense()}, y=y_train.values,
    batch_size=batch_size, num_epochs=None, shuffle=True)
When I tried to execute it, it gave me the following error:
MemoryError Traceback (most recent call last)
I then tried to build an input_fn using the tf.data.Dataset module:
def input_fn():
    dataset = tf.contrib.data.Dataset.from_sparse_tensor_slices((X_train_tfidf, y_train.values))
    dataset = dataset.repeat().shuffle(buff).batch(batch_size)
    x, y = dataset.make_one_shot_iterator().get_next()
    return x, y
However, this gives me the following error:
TypeError: `sparse_tensor` must be a `tf.SparseTensor` object.
Basically, what I want to do is feed the training data to a neural network in small batches (for SGD) directly from a scipy sparse matrix, without densifying it, but I can't find the correct way to do it.
Can someone please help?
Upvotes: 0
Views: 172
Reputation: 922
The TypeError indicates that from_sparse_tensor_slices requires its input to be a tf.SparseTensor object, not a scipy sparse matrix. See: https://www.tensorflow.org/api_docs/python/tf/contrib/data/Dataset#from_sparse_tensor_slices.
Packing your training matrix together with your labels into a single tf.SparseTensor should solve the problem.
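As a starting point, here is a minimal sketch of how you might convert a scipy sparse matrix into the (indices, values, dense_shape) triple that the tf.SparseTensor constructor expects. Only the conversion is shown with numpy/scipy; the small X matrix and the helper name to_sparse_tensor_args are illustrative stand-ins for your X_train_tfidf:

```python
import numpy as np
import scipy.sparse as sp

def to_sparse_tensor_args(mat):
    """Convert a scipy sparse matrix into the three arguments
    (indices, values, dense_shape) accepted by tf.SparseTensor."""
    coo = mat.tocoo()  # COO format exposes explicit row/col index arrays
    indices = np.column_stack([coo.row, coo.col]).astype(np.int64)
    values = coo.data
    dense_shape = np.array(coo.shape, dtype=np.int64)
    return indices, values, dense_shape

# Tiny stand-in for X_train_tfidf:
X = sp.csr_matrix(np.array([[0.0, 1.5],
                            [2.0, 0.0]]))
indices, values, dense_shape = to_sparse_tensor_args(X)
```

With TensorFlow available you would then build the tensor as `tf.SparseTensor(indices=indices, values=values, dense_shape=dense_shape)` and pass that (rather than the scipy matrix) to from_sparse_tensor_slices.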
Upvotes: 1