Reputation: 585
I have a program.
n = 6
data=pd.read_csv('11.csv',index_col='datetime')
volume = data['TotalVolumeTraded']
close = data['ClosingPx']
logDel = np.log(np.array(data['HighPx'])) - np.log(np.array(data['LowPx']))
logRet_1 = np.array(np.diff(np.log(close)))
logRet_5 = np.log(np.array(close[5:])) - np.log(np.array(close[:-5]))
logVol_5 = np.log(np.array(volume[5:])) - np.log(np.array(volume[:-5]))
logDel = logDel[5:]
logRet_1 = logRet_1[4:]
close = close[5:]
Date = pd.to_datetime(data.index[5:])
A = np.column_stack([logDel,logRet_5,logVol_5])
model = GaussianHMM(n_components= n, covariance_type="full", n_iter=2000).fit([A])
hidden_states = model.predict(A)
I run the code the first time ,the value of "hidden_states" is as follow,
I run the code the second time ,the value of "hidden_states" is as follow,
Why are two values "hidden_states" different?
Upvotes: 0
Views: 642
Reputation: 339
Try to control the randomness by setting the seed
and the random_state
when you define your model. Moreover you could initialize the startprob_
and the transmat_
and see how it behaves.
That way you might have a better explanation about the cause of this behavior.
Upvotes: 0
Reputation: 2679
I am not completely sure what happens here, but here're two possible explanations for the results you're seeing.
GaussianHMM
initializes emission parameters via k-means which might converge to different values depending on the data. The initial parameters are passed to the EM-algorithm which is also prone to local maxima. Therefore different runs could result in different parameter estimates and (as a result) slightly different predictions.Upvotes: 1