nick

Reputation: 77

Using toarray() method shows memory error

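from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# df_train is a DataFrame with 'clean_comments' and 'label' columns
xtrain, xtest, ytrain, ytest = train_test_split(df_train['clean_comments'], df_train['label'].values, test_size=0.3, shuffle=True)
vectorizer = TfidfVectorizer(strip_accents='unicode', analyzer='word', ngram_range=(1, 3), norm='l2')
vectorizer.fit(xtrain)
x_train = vectorizer.transform(xtrain)  # SciPy sparse matrix
x_train = x_train.toarray()  # this line raises the memory error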

I am trying to convert a sparse matrix to a dense array using the toarray() method, but it raises a memory error. I have already tried the todense() method, but that did not work either.

Upvotes: 1

Views: 702

Answers (1)

Stanislas Morbieu

Reputation: 1827

Sparse matrices store only the nonzero values in memory, which makes them well suited to bag-of-words matrices. Converting a sparse matrix to a dense format consumes far more memory, because the dense format also stores every zero. If you don't have enough memory, the conversion raises an out-of-memory error.
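
To get a feel for the difference, here is a minimal sketch that compares the memory a sparse matrix actually holds with what its dense form would need. The matrix is a randomly generated stand-in, not the asker's data: the shape (1,000 documents by 50,000 features), the 0.1% density, and the helper names `sparse_nbytes` and `dense_nbytes` are all assumptions made for illustration.

```python
import numpy as np
from scipy import sparse


def sparse_nbytes(m):
    """Bytes held by a CSR matrix: the nonzero values plus its two index arrays."""
    return m.data.nbytes + m.indices.nbytes + m.indptr.nbytes


def dense_nbytes(m):
    """Bytes that m.toarray() would need: one entry per cell, zeros included."""
    return m.shape[0] * m.shape[1] * m.dtype.itemsize


# Hypothetical stand-in for a TF-IDF matrix (assumed shape and density).
x = sparse.random(1000, 50000, density=0.001, format="csr",
                  dtype=np.float64, random_state=0)

print(f"sparse: {sparse_nbytes(x) / 1e6:.1f} MB")
print(f"dense:  {dense_nbytes(x) / 1e6:.1f} MB")
```

With ngram_range=(1,3) the number of features can be much larger than in this sketch, so the gap grows accordingly. Many scikit-learn estimators accept sparse input directly, so it may be possible to skip the toarray() call altogether, depending on what the dense array was needed for.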

Upvotes: 1
