Reputation: 19
I am trying to run t-SNE with scikit-learn on the MNIST dataset, but when displaying the result I want the actual MNIST digit images at each point instead of colored dots. I am using matplotlib and seaborn.
Here is my code:
import sklearn
import seaborn as sb
import pandas as pd
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
import numpy as np
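# fetch_mldata was removed in scikit-learn 0.22; fetch_openml provides MNIST instead
from sklearn.datasets import fetch_openml
mnist = fetch_openml("mnist_784", version=1, as_frame=False)
X = mnist.data / 255.0
y = mnist.target.astype(int)  # OpenML returns the labels as strings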
feat_cols = [ 'pixel' + str(i) for i in range(X.shape[1]) ]
df = pd.DataFrame(X,columns=feat_cols)
df['y'] = y
df['label'] = df['y'].apply(lambda i: str(i))
X, y = None, None
np.random.seed(42)
rndperm = np.random.permutation(df.shape[0])
N = 520000  # MNIST has only 70,000 rows, so any N above that keeps every row
df_subset = df.loc[rndperm[:N],:].copy()
data_subset = df_subset[feat_cols].values
tsne = TSNE(n_components=2, verbose=1, perplexity=40, n_iter=300)
tsne_results = tsne.fit_transform(data_subset)
df_subset['tsne-2d-one'] = tsne_results[:,0]
df_subset['tsne-2d-two'] = tsne_results[:,1]
plt.figure(figsize=(16,10))
sb.scatterplot(
    x="tsne-2d-one", y="tsne-2d-two",
    hue="y",
    palette=sb.color_palette("hls", 10),
    data=df_subset,
    legend="full",
    alpha=0.3
)
Upvotes: 2
Views: 1674
Reputation: 40687
I don't know if you can make heads or tails of the resulting plot, but if I understood your question correctly, is this what you are trying to do?
from matplotlib.offsetbox import OffsetImage, AnnotationBbox
pixel_cols = df_subset.columns.str.startswith('pixel')
img_w, img_h = 28,28
zoom = 0.5
fig, ax = plt.subplots(figsize=(16,10))
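for i, row in df_subset.iterrows():
    # rebuild the 28x28 image from the flattened pixel columns
    image = row[pixel_cols].values.astype(float).reshape((img_w, img_h))
    im = OffsetImage(image, zoom=zoom)
    ab = AnnotationBbox(im, (row["tsne-2d-one"], row["tsne-2d-two"]), xycoords='data', frameon=False)
    ax.add_artist(ab)
    # add_artist does not update the data limits, so register the point manually
    ax.update_datalim([(row["tsne-2d-one"], row["tsne-2d-two"])])
# rescale the axes once, after every image has been placed
ax.autoscale()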
This code is based on the Annotation Box demo and this answer on SO
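If the plot ends up too crowded to read, one option is to skip any image that would land too close to one already drawn. Below is a rough sketch of that idea, not a definitive implementation: `plot_images_at` and its `min_dist` parameter are names I made up for illustration (they are not part of matplotlib or scikit-learn), and the `gray_r` colormap is just my choice so the digits show as dark strokes on a white background.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.offsetbox import AnnotationBbox, OffsetImage


def plot_images_at(ax, coords, images, zoom=0.5, min_dist=0.0, cmap="gray_r"):
    """Draw images[i] centered at coords[i]; return the indices actually drawn.

    Hypothetical helper, not part of matplotlib or scikit-learn.
    coords:   (n, 2) array of embedding coordinates.
    images:   (n, height, width) array of grayscale images.
    min_dist: skip a point if an already drawn image lies closer than this
              (in data units); 0 draws every point.
    """
    coords = np.asarray(coords, dtype=float)
    images = np.asarray(images, dtype=float)
    drawn = []
    for i, (xy, img) in enumerate(zip(coords, images)):
        if min_dist > 0 and drawn:
            gaps = np.linalg.norm(coords[drawn] - xy, axis=1)
            if gaps.min() < min_dist:
                continue  # too close to an image that is already on the plot
        box = AnnotationBbox(OffsetImage(img, zoom=zoom, cmap=cmap), xy,
                             xycoords="data", frameon=False)
        ax.add_artist(box)
        drawn.append(i)
    # add_artist does not touch the data limits, so set them from the points
    ax.update_datalim(coords)
    ax.autoscale()
    return drawn
```

With the variables from the question, something like `plot_images_at(ax, tsne_results, data_subset.reshape(-1, 28, 28), min_dist=2.0)` should do it. The value 2.0 is only a starting guess; tune it by eye for your embedding.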
Upvotes: 2