inaMinute

Reputation: 585

Why do I get different values every time I run hmmlearn.hmm.GaussianHMM.fit()?

I have the following program.

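import numpy as np
import pandas as pd
from hmmlearn.hmm import GaussianHMM

n = 6  # number of hidden states
data = pd.read_csv('11.csv', index_col='datetime')
volume = data['TotalVolumeTraded']
close = data['ClosingPx']

# Features: high/low log range, 1-step and 5-step log returns, 5-step log volume change
logDel = np.log(np.array(data['HighPx'])) - np.log(np.array(data['LowPx']))
logRet_1 = np.array(np.diff(np.log(close)))
logRet_5 = np.log(np.array(close[5:])) - np.log(np.array(close[:-5]))
logVol_5 = np.log(np.array(volume[5:])) - np.log(np.array(volume[:-5]))

# Drop the leading rows so every series lines up with the 5-step features
logDel = logDel[5:]
logRet_1 = logRet_1[4:]
close = close[5:]
Date = pd.to_datetime(data.index[5:])
A = np.column_stack([logDel, logRet_5, logVol_5])

# fit() expects a 2-D array of shape (n_samples, n_features)
model = GaussianHMM(n_components=n, covariance_type="full", n_iter=2000).fit(A)
hidden_states = model.predict(A)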

The first time I run the code, the value of "hidden_states" is as follows:

[image: hidden_states from the first run]

The second time I run the code, the value of "hidden_states" is as follows:

[image: hidden_states from the second run]

Why do the two "hidden_states" results differ?

Upvotes: 0

Views: 642

Answers (2)

mrt

Reputation: 339

Try to control the randomness by setting the seed and the random_state when you define your model. You could also initialize startprob_ and transmat_ yourself and see how the model behaves.

That way you can narrow down what causes this behavior.

Upvotes: 0

Sergei Lebedev

Reputation: 2679

I am not completely sure what happens here, but here are two possible explanations for the results you're seeing.

  1. The model does not maintain any ordering over state labels, so the state labelled 1 in one run could end up being labelled 4 in another run. This is known as the label switching problem in latent variable models.
  2. GaussianHMM initializes emission parameters via k-means, which might converge to different values depending on the data and on its random starting point. The initial parameters are passed to the EM algorithm, which is also prone to local maxima. Therefore different runs could result in different parameter estimates and, as a result, slightly different predictions.
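One way to check whether two runs differ only by label switching is to relabel the states in a canonical order before comparing them. The sketch below uses only NumPy; the helper `canonical_labels` and the toy numbers are hypothetical, and sorting by the first feature of the state means is just one possible convention. With a fitted model, the means would come from `model.means_`.

```python
import numpy as np


def canonical_labels(states, means):
    """Relabel states so that label k is the state with the k-th smallest
    mean in the first feature (an arbitrary but repeatable convention)."""
    order = np.argsort(means[:, 0])        # old labels sorted by mean
    rank = np.empty_like(order)
    rank[order] = np.arange(len(order))    # old label -> new label
    return rank[np.asarray(states)]


# Two hypothetical runs that found the same states under different labels.
means_a = np.array([[0.1], [0.5], [0.3]])
states_a = np.array([0, 1, 2, 1, 0])

means_b = np.array([[0.5], [0.3], [0.1]])
states_b = np.array([2, 0, 1, 0, 2])

relabelled_a = canonical_labels(states_a, means_a)
relabelled_b = canonical_labels(states_b, means_b)
print(np.array_equal(relabelled_a, relabelled_b))
```

If the relabelled sequences still disagree, the runs probably converged to different local maxima, which is the second explanation above.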

Upvotes: 1
