Miles Erickson

Reputation: 2595

AttributeError: 'numpy.ndarray' object has no attribute 'getA1'

When using pyLDAvis.sklearn.prepare to visualize an LDA topic model, I encountered the following error message:

>>> pyLDAvis.sklearn.prepare(lda_model, dtm, vectorizer)
...
---> 12     return dtm.sum(axis=1).getA1()
...
AttributeError: 'numpy.ndarray' object has no attribute 'getA1'

Passing dtm into pyLDAvis.sklearn.prepare as a pd.DataFrame raises a similar error:

AttributeError: 'Series' object has no attribute 'getA1'

Why is this error message occurring?

Upvotes: 3

Views: 1472

Answers (2)

Wenyue Ma

Reputation: 1

Yes, X = np.matrix(X) works in most situations, including this one. I ran into the same issue with graph data, but I did not have enough memory to hold my data as a dense matrix. I tried the SciPy sparse array generated by networkx: Y = nx.to_scipy_sparse_array(G). This Y has type scipy.sparse._arrays.csr_array. However, fit(X) in sklearn did not accept this format for me; it only accepted scipy.sparse._csr.csr_matrix data.

If you want to avoid building a full dense matrix, the following code may be helpful.

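import networkx as nx
from scipy.sparse import csr_matrix
from sklearn.cluster import SpectralClustering

# G is an existing networkx graph
Y = nx.to_scipy_sparse_array(G)  # csr_array
X = csr_matrix(Y)                # convert to csr_matrix without densifying
sc = SpectralClustering(4, affinity='precomputed', n_init=100)
sc.fit(X)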

Here sc.fit(X) is only an example; the same conversion applies to other sklearn operations that take data as input. In this case: pyLDAvis.sklearn.prepare(lda_model, X, vectorizer)
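
As a rough illustration of why the csr_array to csr_matrix conversion matters for getA1 in particular, the sketch below compares what sum(axis=1) returns for each type. It assumes SciPy 1.8 or later (where csr_array exists), and the small matrix is made-up toy data, not anything from the question.

```python
import numpy as np
from scipy.sparse import csr_array, csr_matrix

# Toy document-term counts (hypothetical data, for illustration only)
dense = np.array([[1, 0, 2],
                  [0, 3, 0]])

Y = csr_array(dense)   # newer sparse *array* type
X = csr_matrix(Y)      # sparse *matrix* type, converted without densifying

sums_from_array = Y.sum(axis=1)    # plain numpy.ndarray, no getA1
sums_from_matrix = X.sum(axis=1)   # numpy.matrix, has getA1

print(type(sums_from_array).__name__, hasattr(sums_from_array, "getA1"))
print(type(sums_from_matrix).__name__, hasattr(sums_from_matrix, "getA1"))

# getA1 flattens the (n_docs, 1) matrix into a 1-D array of row totals
doc_lengths = sums_from_matrix.getA1()
print(doc_lengths)
```

So a call like dtm.sum(axis=1).getA1() only succeeds when dtm is a numpy.matrix or a scipy sparse matrix, which is consistent with the traceback in the question.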

Upvotes: 0

Miles Erickson

Reputation: 2595

The missing getA1 method exists only for numpy.matrix objects. There is no numpy.ndarray.getA1 method, nor is there a pandas.Series.getA1 method.
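
A minimal check of that claim, using a small made-up array (the names arr and mat are only for illustration):

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])   # plain ndarray
mat = np.matrix(arr)               # same data as a numpy.matrix

arr_sums = arr.sum(axis=1)   # ndarray of shape (2,)
mat_sums = mat.sum(axis=1)   # matrix of shape (2, 1)

print(type(arr_sums).__name__, hasattr(arr_sums, "getA1"))
print(type(mat_sums).__name__, hasattr(mat_sums, "getA1"))

# getA1 returns the matrix contents as a flattened 1-D ndarray
row_totals = mat_sums.getA1()
print(row_totals)
```

On an ndarray, arr.sum(axis=1) is already one-dimensional, so there is nothing for getA1 to flatten; the method only exists on the matrix type.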

Casting the document vectors to a numpy.matrix resolves the error:

import numpy as np
import pyLDAvis
import pyLDAvis.sklearn
pyLDAvis.enable_notebook()

# Wrap the dense document-term array so that .sum(axis=1).getA1() works
dtm = np.matrix(document_vectors_arr)
pyLDAvis.sklearn.prepare(lda_model, dtm, vectorizer)

Upvotes: 5
