Plotting PCA output in scatter plot whilst colouring according to label python matplotlib

Question

I have just completed a PCA of 14 variables, which I have chosen to condense into 2 components.

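from sklearn.decomposition import PCA

pca = PCA(n_components=2)
a = pca.fit_transform(z)  # fits and transforms in one call, so a separate pca.fit(z) is not needed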

The output is in the form:

[[ -3.84514275e+00  -1.19829226e-01]
 [ -4.78476227e+00  -1.35986090e-01]
 [ -2.26702900e+00  -1.19665853e+00]
 [ -5.01021616e+00   2.76005130e+00]
 [ -5.57580326e+00  -2.00656680e+00]
 [ -5.08184415e+00  -3.68721491e+00]
 [ -3.41505366e+00  -7.61184868e-01]
 [ -4.92439159e+00  -1.82147509e+00]
...
 [ -3.34931300e+00   7.57884906e-01]]

I want to do the following:

Plot each observation on a scatter graph, with PC1 (x) being the first value in each array and PC2 (y) being the second value.
Colour each observation according to its corresponding label type (e.g. A=red, B=blue, C=green, etc.) from the initial pre-PCA data.
Label SELECTED (not ALL) observations with the name of the observation from the initial pre-PCA data (e.g. John, Peter, Sally, etc.).
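The three steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive solution: the array `a` is random stand-in data where the real code would use the output of `pca.fit_transform(z)`, and the `colours` mapping and the `to_annotate` selection are hypothetical names chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data (made up): in the question, `a` would be pca.fit_transform(z),
# and names/labels would come from the first two columns of the dataset.
rng = np.random.default_rng(0)
a = rng.normal(size=(5, 2))
names = np.array(["John", "Peter", "Sally", "Cath", "Jim"])
labels = np.array(["A", "A", "B", "C", "A"])

colours = {"A": "red", "B": "blue", "C": "green"}  # label -> colour
to_annotate = {"John", "Sally"}                    # hypothetical selection

fig, ax = plt.subplots()

# One scatter call per label, so each group gets its own colour and legend entry.
for label, colour in colours.items():
    mask = labels == label
    ax.scatter(a[mask, 0], a[mask, 1], color=colour, label=label)

# Annotate only the selected observations, offset slightly from the point.
for name, (x, y) in zip(names, a):
    if name in to_annotate:
        ax.annotate(name, (x, y), xytext=(5, 5), textcoords="offset points")

ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
ax.legend(title="Label")
```

With an interactive backend, `plt.show()` at the end displays the figure. Plotting one group at a time is a design choice that gives a legend for free; passing a list of per-point colours to a single `scatter` call would also work.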

Any help with any or all of these problems is greatly appreciated.

Worth noting: I attempted the scatter plot with:

plt.scatter(a[1], a[2])
plt.show()

but obviously this doesn't work, as my output a is not separated by commas and this would only plot 2 points. I can't get my head around it, so I would appreciate SO's input.
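For what it's worth, the missing commas are only how NumPy prints a 2-D array; they do not affect indexing. The likely cause of the problem is that `a[1]` and `a[2]` select rows (single observations), whereas a scatter plot needs columns. A small sketch, using three rows copied from the output above:

```python
import numpy as np

# A few rows copied from the output shown in the question.
a = np.array([
    [-3.84514275, -0.119829226],
    [-4.78476227, -0.135986090],
    [-2.26702900, -1.19665853],
])

row = a[1]     # second observation: [PC1, PC2]
pc1 = a[:, 0]  # first column: PC1 for every observation
pc2 = a[:, 1]  # second column: PC2 for every observation

print(row.shape, pc1.shape, pc2.shape)  # (2,) (3,) (3,)
```

So `plt.scatter(a[:, 0], a[:, 1])` should plot one point per observation, with PC1 on the x axis and PC2 on the y axis.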

EDIT:

The dataset is in the form:

John, A, var1, var2, var3, ..., var14
Peter, A, var1, var2, var3, ..., var14
Sally, B, var1, var2, var3, ..., var14
Cath, C, var1, var2, var3, ..., var14
Jim, A, var1, var2, var3, ..., var14
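Given rows in that form, one way to separate the names, the labels, and the numeric variables is sketched below. It assumes pandas is available and that the file has no header row; the values are invented, and only 3 variables are used instead of 14 to keep the example short.

```python
import io
import pandas as pd

# Made-up stand-in for the file: name, label, then the variables
# (3 invented variables here instead of the 14 in the question).
raw = io.StringIO(
    "John, A, 1.0, 2.0, 3.0\n"
    "Peter, A, 1.5, 2.5, 3.5\n"
    "Sally, B, 4.0, 5.0, 6.0\n"
    "Cath, C, 7.0, 8.0, 9.0\n"
    "Jim, A, 1.2, 2.2, 3.2\n"
)

# skipinitialspace drops the space that follows each comma.
df = pd.read_csv(raw, header=None, skipinitialspace=True)

names = df.iloc[:, 0].to_numpy()          # observation names, for annotating
labels = df.iloc[:, 1].to_numpy()         # label types, for colouring
z = df.iloc[:, 2:].to_numpy(dtype=float)  # numeric variables, the PCA input
```

Because `fit_transform` returns one output row per input row in the same order, `names[i]` and `labels[i]` should still describe row `i` of the PCA output.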

I'm after something similar to this:

[image]
