frank

Reputation: 3608

Plot a 3D scatter plot from a DataFrame and color by group

I am doing a PCA, and have 10 components. I would like to plot the first 3 components and color according to their group type.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

df=pd.DataFrame(np.random.rand(30,20))
grp=round(pd.DataFrame(np.random.rand(30)*10),0)
df['grp']=grp

fig = plt.figure(figsize=(12, 9))
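ax = fig.add_subplot(projection='3d')  # Axes3D(fig) is not attached to the figure automatically in recent matplotlib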
y = df.iloc[:,1]
x = df.iloc[:,0]
z = df.iloc[:,2]
c = df['grp']
ax.scatter(x,y,z, c=c, cmap='coolwarm')
plt.title('First 3 Principal Components')
ax.set_ylabel('PC2')
ax.set_xlabel('PC1')
ax.set_zlabel('PC3')
plt.legend()

This works, but unfortunately it does not show a legend, nor, I believe, all of the possible groups.

Upvotes: 6

Views: 22428

Answers (1)

CT Zhu

Reputation: 54340

Check out pandas groupby: group your data by the grp column and plot each group individually, so that every group gets its own legend entry:

Tested with Python 3.11.2, pandas 2.0.1, matplotlib 3.7.1.

fig = plt.figure(figsize=(12, 9))
ax = fig.add_subplot(projection='3d')

# .groups maps each group value to its index labels; .iloc works here only because df has the default 0..n-1 index
for grp_name, grp_idx in df.groupby('grp').groups.items():
    y = df.iloc[grp_idx,1]
    x = df.iloc[grp_idx,0]
    z = df.iloc[grp_idx,2]
    ax.scatter(x, y, z, label=grp_name)  # this way you can control color/marker/size of each group freely
    # Alternative one-liner; use it instead of the four lines above, not in addition to them:
    # ax.scatter(*df.iloc[grp_idx, [0, 1, 2]].T.values, label=grp_name)

ax.legend(bbox_to_anchor=(1, 0.5), loc='center left', frameon=False)
plt.show()
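
If your DataFrame does not have the default integer index, the positional lookup above may pick the wrong rows. A variant that avoids this is to iterate over the groupby object directly, which yields each group's sub-DataFrame. The following is only a minimal sketch: the helper name scatter3d_by_group, its parameters, the seeded random data and the shifted index are assumptions made for illustration, not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove to get an interactive window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def scatter3d_by_group(df, cols, group_col):
    """Hypothetical helper: one 3D scatter call per group, so each group gets a legend entry."""
    fig = plt.figure(figsize=(12, 9))
    ax = fig.add_subplot(projection="3d")
    # Iterating a groupby yields (group value, sub-DataFrame) pairs,
    # so no positional index lookup is needed.
    for grp_name, grp_df in df.groupby(group_col):
        x, y, z = (grp_df[c] for c in cols)
        ax.scatter(x, y, z, label=grp_name)
    ax.set_xlabel("PC1")
    ax.set_ylabel("PC2")
    ax.set_zlabel("PC3")
    ax.set_title("First 3 Principal Components")
    ax.legend(bbox_to_anchor=(1, 0.5), loc="center left", frameon=False)
    return fig, ax


# Illustrative data only: 30 rows, 20 numeric columns, a small integer group label,
# and a non-default index to show that the lookup does not depend on row positions.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((30, 20)))
df["grp"] = rng.integers(0, 4, size=30)
df.index = df.index + 100

fig, ax = scatter3d_by_group(df, cols=[0, 1, 2], group_col="grp")
```

Call plt.show() or fig.savefig(...) afterwards to view the result. Because each group is a separate scatter call, you can also pass a per-group color or marker inside the loop.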


Upvotes: 8
