What is the problem with GaussianMixture in the sklearn package in Python?

Question

I am using GaussianMixture (GM) from sklearn in Python to identify the members of a star cluster. The GM is fitted with two components; all other parameters are left at their defaults. As the figure shows, one star (plotted as a red dot) that is clearly not a cluster member is classified as a member. The red dots gathered in the middle of the plot are probably my members, but the single red dot to the upper left of that group should not be a member, because it is not close enough to the group.

My cluster image

My Python code is:

import pandas as pd
from sklearn.mixture import GaussianMixture

import matplotlib.pyplot as plt
from matplotlib import style
import matplotlib.colors as mtcolor

style.use("seaborn-white")
# Two-colour map: label 0 is drawn gray, label 1 is drawn red
clist = ["gray", "red"]
cmap = mtcolor.ListedColormap(clist)

# Load the three feature columns as a NumPy array
eX = pd.read_csv("mysatrs.csv", usecols=['col1', 'col2', 'col3']).values

# Keep rows with the first two columns in [-5, 5] and a positive third column
col0m = (eX[:,0] >= -5) & (eX[:,0] <= 5)
col1m = (eX[:,1] >= -5) & (eX[:,1] <= 5)
col2m = (eX[:,2] > 0)

X = eX[col0m & col1m & col2m]

plt.figure(figsize=(6,6))

# Fit a two-component mixture on all three columns
hcgmm = GaussianMixture(n_components=2)
gmmfit = hcgmm.fit(X)
gmmprd = gmmfit.predict(X)          # hard label: most probable component
hcprobs = gmmfit.predict_proba(X)   # per-component membership probabilities
hcmns = hcgmm.means_

# Only the first two columns are plotted, coloured by the hard label
plt.scatter(X[:,0], X[:,1], c=gmmprd, s=3, cmap=cmap)
plt.show()

Should some other setting be adjusted for the GM?
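For reference, one adjustment that may be worth trying is to select members from the membership probabilities instead of the hard labels. predict() simply assigns each star to its most probable component, so a star with a cluster probability of only 0.51 is still coloured as a member. The fit also uses all three columns while the plot shows only two, so a star can look isolated in the plot and still sit near the cluster in the third column. The sketch below is only an illustration under stated assumptions: the data are synthetic stand-ins for the star table, the 0.9 cutoff is arbitrary, and picking the cluster component as the one with the smallest covariance determinant is a heuristic, not something sklearn does for you.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the star table (assumption): a compact "cluster"
# embedded in a broad "field" population, three features per star.
rng = np.random.default_rng(0)
cluster = rng.normal(loc=[0.5, -0.5, 1.0], scale=0.1, size=(200, 3))
field = rng.normal(loc=[0.0, 0.0, 1.0], scale=2.0, size=(800, 3))
X = np.vstack([cluster, field])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

# Heuristic (assumption): treat the component with the smallest covariance
# determinant, i.e. the most compact one, as the cluster.
cluster_idx = int(np.argmin(np.linalg.det(gmm.covariances_)))

# Probability of belonging to the cluster component, one value per star
probs = gmm.predict_proba(X)[:, cluster_idx]

# Hard labels: every star whose most probable component is the cluster
hard_members = gmm.predict(X) == cluster_idx

# Stricter selection: keep only stars with a high membership probability.
# The 0.9 cutoff is an arbitrary illustrative choice.
strict_members = probs >= 0.9

print("hard members:  ", int(hard_members.sum()))
print("strict members:", int(strict_members.sum()))
```

In the original script the same idea would use hcprobs in place of the gmmprd labels when colouring the scatter plot. If the outlying star still has a high probability, the cutoff alone will not remove it, and inspecting its value in the third column would be the next thing to check.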
