galapah

Reputation: 419

yellowbrick t-SNE fit raises ValueError

I am trying to visualize data with t-SNE using TSNEVisualizer from the yellowbrick package, but fitting it raises an error.

import pandas as pd
from yellowbrick.text import TSNEVisualizer
from sklearn.datasets import make_classification

## produce random data
X, y = make_classification(n_samples=200, n_features=100,
                       n_informative=20, n_redundant=10,
                       n_classes=3, random_state=42)

## visualize data with t-SNE
tsne = TSNEVisualizer()
tsne.fit(X, y)
tsne.poof()

The error (raised by the fit method):

ValueError: The truth value of an array with more than one element
             is ambiguous. Use a.any() or a.all()

Upvotes: 1

Views: 303

Answers (1)

galapah

Reputation: 419

After some experimenting with the arguments:

tsne.fit(X, y.tolist())

This raises no error, but produces no output.
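One possible explanation for the difference, which I have not verified against the yellowbrick source: the message in the question is the one NumPy raises whenever a multi-element array is used in a boolean context, whereas a non-empty Python list is simply truthy. A minimal sketch of that behaviour, using only NumPy:

```python
import numpy as np

y = np.array([0, 1, 2])

# A multi-element array has no single truth value, so bool() raises.
try:
    bool(y)
    array_error = None
except ValueError as exc:
    array_error = str(exc)

# A non-empty Python list is truthy, so no error is raised here.
list_is_truthy = bool(y.tolist())

print(array_error)
print(list_is_truthy)
```

If fit tests y in this way somewhere (an assumption), that would account for the array failing while the list passes.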

Finally, replacing y with a list of strings works:

y_series = pd.Series(y, dtype="category")
## rename the integer categories 0, 1, 2 to strings
y_series = y_series.cat.rename_categories(["a", "b", "c"])
y_list = y_series.values.tolist()

tsne.fit(X, y_list)
tsne.poof()

The yellowbrick.text module is intended for analyzing text datasets, which is perhaps why it is not documented that y needs to be a list of strings. The error message does not point to this cause either.
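The same conversion can be done without pandas. This is a minimal sketch, assuming (as observed above) that the installed yellowbrick version accepts a plain list of string labels; `labels_to_strings` is a hypothetical helper of my own, not part of yellowbrick.

```python
import numpy as np

def labels_to_strings(y, names=None):
    """Convert integer class labels to a plain list of strings.

    `names` optionally gives one display name per class, in the order
    of the sorted unique labels. Hypothetical helper, not a yellowbrick API.
    """
    y = np.asarray(y)
    classes = np.unique(y)  # sorted unique labels
    if names is None:
        names = [str(c) for c in classes]
    if len(names) != len(classes):
        raise ValueError("need exactly one name per class")
    lookup = dict(zip(classes.tolist(), names))
    return [lookup[label] for label in y.tolist()]

y_demo = np.array([0, 2, 1, 0])
y_list = labels_to_strings(y_demo, names=["a", "b", "c"])
print(y_list)
```

The resulting list can then be passed in place of y, e.g. tsne.fit(X, y_list).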

Upvotes: 2
