Reputation: 2595
When using pyLDAvis.sklearn.prepare to visualize an LDA topic model, I encountered the following error message:
>>> pyLDAvis.sklearn.prepare(lda_model, dtm, vectorizer)
...
---> 12 return dtm.sum(axis=1).getA1()
...
AttributeError: 'numpy.ndarray' object has no attribute 'getA1'
Passing dtm into pyLDAvis.sklearn.prepare as a pd.DataFrame raises a similar error:
AttributeError: 'Series' object has no attribute 'getA1'
Why is this error message occurring?
Upvotes: 3
Views: 1472
Reputation: 1
Yes, X = np.matrix(X) is useful in most situations, especially here.
I ran into the same issue, but on graph data, and I did not have enough memory to hold my data as a dense matrix.
I tried the SciPy sparse array that networkx generates with Y = nx.to_scipy_sparse_array(G). This Y has type scipy.sparse._arrays.csr_array. However, fit(X) in sklearn did not support this format for me; it only accepted scipy.sparse._csr.csr_matrix data.
If you want to avoid generating a full dense matrix here, the following code may be helpful.
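import networkx as nx
from scipy.sparse import csr_matrix
from sklearn.cluster import SpectralClustering
Y = nx.to_scipy_sparse_array(G)  # G is your networkx graph
X = csr_matrix(Y)  # convert csr_array to csr_matrix without densifying
sc = SpectralClustering(4, affinity='precomputed', n_init=100)
sc.fit(X)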
Here sc.fit(X) is only an example; the same conversion applies to other sklearn operations that take data as input. In this case that would be pyLDAvis.sklearn.prepare(lda_model, X, vectorizer).
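A small sketch of why the conversion matters for the getA1 error in particular, assuming a SciPy version recent enough to provide csr_array. The toy counts and variable names are made up for illustration. Row sums of a sparse array come back as a plain ndarray, while row sums of a sparse matrix come back as a numpy.matrix, which is the type that has getA1:

```python
import numpy as np
from scipy.sparse import csr_array, csr_matrix

# Toy data standing in for a document-term matrix (hypothetical values).
dense = np.array([[1, 0, 2], [0, 3, 1]])

as_array = csr_array(dense)        # newer sparse "array" container
as_matrix = csr_matrix(as_array)   # sparse "matrix" container, still sparse

row_sums_array = as_array.sum(axis=1)    # numpy.ndarray, no getA1
row_sums_matrix = as_matrix.sum(axis=1)  # numpy.matrix, has getA1

print(type(row_sums_array).__name__, hasattr(row_sums_array, "getA1"))
print(type(row_sums_matrix).__name__, hasattr(row_sums_matrix, "getA1"))
```

Since csr_matrix(as_array) keeps the data sparse, this avoids building the full dense matrix.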
Upvotes: 0
Reputation: 2595
The missing getA1 method exists only for numpy.matrix objects. There is no numpy.ndarray.getA1 method, nor is there a pandas.Series.getA1 method.
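A quick way to see the difference, as a minimal sketch with made-up counts (the array below is only a stand-in for real document vectors):

```python
import numpy as np

# Toy document-term counts (hypothetical values).
arr = np.array([[1, 0, 2], [0, 3, 1]])
mat = np.matrix(arr)

# Summing an ndarray gives an ndarray, which has no getA1.
ndarray_has_getA1 = hasattr(arr.sum(axis=1), "getA1")

# Summing a matrix gives a matrix, which does have getA1.
matrix_has_getA1 = hasattr(mat.sum(axis=1), "getA1")

# getA1 flattens the (n, 1) matrix of row sums into a 1-D ndarray.
doc_lengths = mat.sum(axis=1).getA1()

print(ndarray_has_getA1, matrix_has_getA1, doc_lengths)
```

This mirrors the dtm.sum(axis=1).getA1() call shown in the traceback.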
Casting the document vectors to a numpy.matrix resolves the error:
import numpy as np
import pyLDAvis
import pyLDAvis.sklearn
pyLDAvis.enable_notebook()
dtm = np.matrix(document_vectors_arr)  # np.matrix provides getA1()
pyLDAvis.sklearn.prepare(lda_model, dtm, vectorizer)
Upvotes: 5