Reputation: 31
I am using a mixture hidden Markov model (MHMM) to cluster my data, with the R package seqHMM. My question is whether it is possible to obtain the actual observations within each cluster.
For example, if my analysis produces 3 clusters, can I find exactly which observations fall into each cluster?
Example:
First, I created three HMMs with initial probabilities sc_init1, sc_init2, sc_init3, transition probabilities sc_trans1, sc_trans2, sc_trans3, and emission probabilities sc_emiss1, sc_emiss2, sc_emiss3, respectively. Then I combined them into an MHMM with three clusters as follows:
mhmm_init <- list(sc_init1, sc_init2, sc_init3)
mhmm_trans <- list(sc_trans1, sc_trans2, sc_trans3)
mhmm_emiss <- list(sc_emiss1, sc_emiss2, sc_emiss3)
mhmm <- build_mhmm(observations = seq, transition_probs = mhmm_trans, emission_probs = mhmm_emiss, initial_probs = mhmm_init, cluster_names = c("Cluster 1", "Cluster 2", "Cluster 3"))
My data, seq, is longitudinal. Now that the model is constructed, I estimated the model parameters with the fit_model function as follows:
set.seed(1011)
mhmm_fit <- fit_model(mhmm, local_step = TRUE, threads = 1,
                      control_em = list(restart = list(times = 10)))
mhmm_final <- mhmm_fit$model
Using mhmm_final, I can get several pieces of information about each of my three clusters, such as the transition, initial, and emission probabilities. For example, the estimates for cluster 1 are easy to get with the following code:
mhmm_final$transition_probs$`Cluster 1`
mhmm_final$emission_probs$`Cluster 1`
mhmm_final$initial_probs$`Cluster 1`
My question is how I can get the observations in each cluster. The observations are available as mhmm_final$observations, but that gives me all the observations across all three clusters. I want to find exactly the observations within a given cluster, in this case Cluster 1.
Let's assume that I have 10 sequences (seq 1, seq 2, seq 3, seq 4, seq 5, seq 6, seq 7, seq 8, seq 9, seq 10), and I clustered them into three groups with this approach. I want to know which cluster each of these sequences belongs to.
Upvotes: 3
Views: 444
Reputation: 43
You can get the most probable cluster for each sequence from the model summary:
summary(mhmm_final)$most_probable_cluster
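To go from those assignments to the observations in one cluster, one option is to group the sequence indices by cluster and then subset the observations. The following is a minimal sketch, not a definitive recipe: `indices_by_cluster` and `toy_cluster` are hypothetical names made up for illustration, the toy factor only stands in for the summary output, and the commented lines assume single-channel data, so that `mhmm_final$observations` is one sequence object that can be row-subset.

```r
# Hypothetical helper: group sequence indices by their assigned cluster.
# `cluster` is expected to hold one cluster label per sequence.
indices_by_cluster <- function(cluster) {
  split(seq_along(cluster), cluster)
}

# Toy stand-in for summary(mhmm_final)$most_probable_cluster (assumed shape:
# a factor with one entry per sequence, labelled with the cluster names).
toy_cluster <- factor(
  c("Cluster 1", "Cluster 3", "Cluster 1", "Cluster 2"),
  levels = c("Cluster 1", "Cluster 2", "Cluster 3")
)

idx <- indices_by_cluster(toy_cluster)
counts <- table(toy_cluster)  # number of sequences per cluster

# With the fitted model from the question (assumes single-channel data):
# cl <- summary(mhmm_final)$most_probable_cluster
# obs_cluster1 <- mhmm_final$observations[indices_by_cluster(cl)[["Cluster 1"]], ]
```

With multichannel data the observations are stored as a list with one element per channel, so the same row subsetting would probably need to be applied to each channel separately.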
Upvotes: 4