marina__

Reputation: 19

Matplotlib scatter different images (MNIST) instead of plots for TSNE

I am running t-SNE with scikit-learn on the MNIST dataset, but when displaying the result I want the actual MNIST images drawn at each point instead of colored dots. I am using matplotlib and seaborn.

Here is my code:

import sklearn
import seaborn as sb
import pandas as pd
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
import numpy as np
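from sklearn.datasets import fetch_openml

# fetch_mldata was removed in scikit-learn 0.22; fetch_openml serves the same MNIST data
mnist = fetch_openml("mnist_784", version=1, as_frame=False)
X = mnist.data / 255.0
y = mnist.target.astype(int)  # OpenML returns the labels as strings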
feat_cols = [ 'pixel' + str(i) for i in range(X.shape[1]) ]
df = pd.DataFrame(X,columns=feat_cols)
df['y'] = y
df['label'] = df['y'].apply(lambda i: str(i)) 
X, y = None, None
np.random.seed(42) 
rndperm = np.random.permutation(df.shape[0])

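N = 520000  # exceeds MNIST's 70,000 rows, so the slice below keeps every sample
df_subset = df.loc[rndperm[:N],:].copy()
data_subset = df_subset[feat_cols].values
# note: scikit-learn >= 1.5 renames n_iter to max_iter
tsne = TSNE(n_components=2, verbose=1, perplexity=40, n_iter=300)
tsne_results = tsne.fit_transform(data_subset)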

df_subset['tsne-2d-one'] = tsne_results[:,0]
df_subset['tsne-2d-two'] = tsne_results[:,1]
plt.figure(figsize=(16,10))
sb.scatterplot(
    x="tsne-2d-one", y="tsne-2d-two",
    hue="y",
    palette=sb.color_palette("hls", 10),
    data=df_subset,
    legend="full",
    alpha=0.3
)

Upvotes: 2

Views: 1674

Answers (1)

Diziet Asahi

Reputation: 40687

I don't know if you can make heads or tails of the resulting plot, but if I understood your question correctly, is this what you are trying to do?

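from matplotlib.offsetbox import OffsetImage, AnnotationBbox

# boolean mask selecting the pixel columns of df_subset
pixel_cols = df_subset.columns.str.startswith('pixel')
img_w, img_h = 28, 28
zoom = 0.5

fig, ax = plt.subplots(figsize=(16, 10))
for i, row in df_subset.iterrows():
    # rebuild the 28x28 digit from the flattened pixel columns
    image = row[pixel_cols].values.astype(float).reshape((img_h, img_w))
    im = OffsetImage(image, zoom=zoom)
    # anchor the thumbnail at the point's t-SNE coordinates
    ab = AnnotationBbox(im, (row["tsne-2d-one"], row["tsne-2d-two"]), xycoords='data', frameon=False)
    ax.add_artist(ab)
    # annotation boxes are ignored by autoscaling, so extend the data limits by hand
    ax.update_datalim([(row["tsne-2d-one"], row["tsne-2d-two"])])
ax.autoscale()  # rescale once, after all images have been placed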


This code is based on the Annotation Box demo and this answer on SO
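
For anyone who wants to try the idea without downloading MNIST or waiting for t-SNE, here is a minimal self-contained sketch of the same OffsetImage/AnnotationBbox approach. The helper name `scatter_images`, the random coordinates, and the random 28x28 arrays are all made up for illustration; they only stand in for `tsne_results` and the digit images. The Agg backend is selected so the sketch also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (an assumption; drop this line in a notebook)
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.offsetbox import AnnotationBbox, OffsetImage


def scatter_images(ax, xy, images, zoom=0.5, cmap="gray"):
    """Hypothetical helper: draw images[i] centred on xy[i] in data coordinates."""
    boxes = []
    for (x, y), img in zip(xy, images):
        thumb = OffsetImage(img, zoom=zoom, cmap=cmap)
        box = AnnotationBbox(thumb, (x, y), xycoords="data", frameon=False)
        ax.add_artist(box)
        boxes.append(box)
    # Annotation boxes do not take part in autoscaling, so the data
    # limits are extended explicitly, once, for all points together.
    ax.update_datalim(xy)
    ax.autoscale()
    return boxes


rng = np.random.default_rng(0)
xy = rng.normal(size=(50, 2))            # stand-in for the 2-D t-SNE output
images = rng.random(size=(50, 28, 28))   # stand-in for 28x28 MNIST digits

fig, ax = plt.subplots(figsize=(8, 6))
boxes = scatter_images(ax, xy, images)
fig.canvas.draw()  # render once; use fig.savefig(...) or plt.show() to view
```

With the real data you would pass `tsne_results` as `xy` and the pixel rows reshaped to `(-1, 28, 28)` as `images`. Plotting a few thousand points at most is probably wise, since every image is a separate artist and all 70,000 would be slow to draw and hard to read.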

Upvotes: 2
