HMM - general concept and strategy

Question

I am new to data science and am trying to fit an HMM to time series data. I ran into some problems and feel I have to rethink my strategy in general. I would love to get some feedback on my thoughts!

My data: I have time series data with one measurement per hour. The data have a strong daily rhythm - low values at night and a peak each afternoon. I also have additional information on temperature, rain, etc. at each time point.

My goal: To train an HMM on the sequence and to generate synthetic data from the model. It is important that the circadian rhythm in the original data is also captured by the synthetic data.

What I did so far: I extracted features from the time series (encoded the daily cycle as sine and cosine waves) and trained a multivariate HMM using the Python library pomegranate. Mapping the states onto the original time series looks good - some states are characteristic of the nights, some of the morning/afternoon, and so on. I also generated sample data from that model.
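For reference, a minimal sketch of the kind of cyclic encoding meant here, assuming the hour of day is the quantity being encoded. The function name `encode_hour_of_day` and the variable names are only illustrative, not taken from my actual pipeline:

```python
import numpy as np

def encode_hour_of_day(hours, period=24):
    """Map hours onto the unit circle so that 23:00 and 00:00 end up close together."""
    hours = np.asarray(hours, dtype=float)
    angle = 2.0 * np.pi * (hours % period) / period
    # Column 0 is the sine component, column 1 the cosine component.
    return np.column_stack([np.sin(angle), np.cos(angle)])

# Two days of hourly time stamps (hypothetical example data).
hours = np.arange(48)
time_features = encode_hour_of_day(hours)
```

These two columns would be stacked next to the measurements as extra observed features, so the HMM sees them as emissions, not as something that drives the transitions.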

My problem: The generated synthetic sample data do not show the typical (and important) daily rhythm/pattern.

My understanding of the problem: An HMM is characterized by the initial state distribution, the emission probabilities, and the transition probabilities between the hidden states. Each state at time n+1 depends only on the state at time n. So - trying to put the problem in very simple words - the model is not able to capture the 24-hour rhythm because it relies only on the ONE previous state.
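To make this concrete, here is a small sketch of how sampling from a plain first-order HMM works, written with NumPy only (it is not pomegranate's sampling code). The two states, the Gaussian emissions, and all numbers are made-up assumptions for illustration:

```python
import numpy as np

def sample_hmm(start_prob, trans_mat, means, stds, n_steps, rng):
    """Sample hidden states and Gaussian observations from a first-order HMM."""
    n_states = len(start_prob)
    states = np.empty(n_steps, dtype=int)
    states[0] = rng.choice(n_states, p=start_prob)
    for t in range(1, n_steps):
        # The next state is drawn using only the current state's row.
        states[t] = rng.choice(n_states, p=trans_mat[states[t - 1]])
    observations = rng.normal(means[states], stds[states])
    return states, observations

# Hypothetical two-state model: state 0 = "night", state 1 = "afternoon".
start_prob = np.array([1.0, 0.0])
trans_mat = np.array([[0.9, 0.1],
                      [0.1, 0.9]])
means = np.array([1.0, 10.0])
stds = np.array([0.5, 1.0])

rng = np.random.default_rng(0)
states, observations = sample_hmm(start_prob, trans_mat, means, stds, 240, rng)
```

With a fixed transition matrix, the time spent in a state is geometrically distributed (on average 1 / (1 - 0.9) = 10 steps here), and nothing ties a state change to a particular hour of the day.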

My thoughts on possible improvements:

Slicing into days: Instead of training on one sequence spanning several years, I could slice the time series into days and retrain the model on hundreds of day-long sequences. Then I would generate individual days and concatenate them into a sequence of the length needed.
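A rough sketch of the slicing step, assuming the series starts at midnight, has exactly 24 samples per day, and has no gaps. The helper name `split_into_days` and the example array are hypothetical:

```python
import numpy as np

def split_into_days(series, steps_per_day=24):
    """Cut a (T,) or (T, n_features) series into whole days; drop the incomplete tail."""
    series = np.asarray(series)
    n_days = len(series) // steps_per_day
    trimmed = series[: n_days * steps_per_day]
    # Result has shape (n_days, steps_per_day) or (n_days, steps_per_day, n_features).
    return trimmed.reshape(n_days, steps_per_day, *series.shape[1:])

# Hypothetical data: 77 hourly samples with 2 features (3 full days plus 5 extra hours).
series = np.arange(77 * 2, dtype=float).reshape(77, 2)
days = split_into_days(series)
```

Each row of `days` could then be passed to the model as a separate training sequence. One thing I am unsure about: days sampled independently and then concatenated may not join smoothly at midnight.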

Hidden semi-Markov model: I read that these exist to make sure that one state can last over several time points. As I understand it, this would not help to keep the 24-hour rhythm.
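As far as I understand the idea, the state sequence of such a model could be sampled roughly as follows. This is only a sketch under my own assumptions: the shifted Poisson duration distribution, the name `mean_durations`, and the numbers are made up, and no particular library is implied:

```python
import numpy as np

def sample_hsmm_states(n_steps, trans_mat, mean_durations, start_state, rng):
    """Sample a state sequence in which each visit lasts an explicitly drawn duration."""
    states = []
    state = start_state
    while len(states) < n_steps:
        # Assumed duration model: 1 + Poisson, so every visit lasts at least one step.
        duration = 1 + int(rng.poisson(mean_durations[state] - 1))
        states.extend([int(state)] * duration)
        # No self-transitions: the diagonal of trans_mat is zero.
        state = rng.choice(len(trans_mat), p=trans_mat[state])
    return np.array(states[:n_steps])

# Hypothetical two-state model with about 12 hours per visit on average.
trans_mat = np.array([[0.0, 1.0],
                      [1.0, 0.0]])
mean_durations = np.array([12.0, 12.0])

rng = np.random.default_rng(0)
hsmm_states = sample_hsmm_states(24 * 30, trans_mat, mean_durations, 0, rng)
```

Even if the average durations add up to 24 hours, each duration is random, so the switching times can drift away from the clock over many days.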

Harmonic HMM: I saw this in a paper (https://royalsocietypublishing.org/doi/10.1098/rsif.2017.0885). I don't understand how it would be implemented, but as I understand it, the purpose was to include a circadian oscillator in the transition matrix.
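My rough mental picture of a time-dependent transition matrix is sketched below. To be clear, this is my own guess at the general idea and not the implementation from the paper: the softmax parameterization and the names `base_logits`, `cos_coef` and `sin_coef` are assumptions, and the coefficients are hand-picked instead of learned:

```python
import numpy as np

def harmonic_transition_matrix(hour, base_logits, cos_coef, sin_coef, period=24):
    """Row-stochastic transition matrix whose logits oscillate with the hour of day."""
    angle = 2.0 * np.pi * hour / period
    logits = base_logits + cos_coef * np.cos(angle) + sin_coef * np.sin(angle)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def sample_states(n_steps, start_state, base_logits, cos_coef, sin_coef, rng):
    """Sample states where the transition into step t uses the matrix for hour t % 24."""
    n_states = base_logits.shape[0]
    states = np.empty(n_steps, dtype=int)
    states[0] = start_state
    for t in range(1, n_steps):
        probs = harmonic_transition_matrix(t % 24, base_logits, cos_coef, sin_coef)
        states[t] = rng.choice(n_states, p=probs[states[t - 1]])
    return states

# Hypothetical coefficients: state 0 ("night") is favored around hour 0,
# state 1 ("day") is favored around hour 12.
base_logits = np.zeros((2, 2))
cos_coef = np.array([[3.0, -3.0],
                     [3.0, -3.0]])
sin_coef = np.zeros((2, 2))

midnight_matrix = harmonic_transition_matrix(0, base_logits, cos_coef, sin_coef)
noon_matrix = harmonic_transition_matrix(12, base_logits, cos_coef, sin_coef)
harmonic_states = sample_states(24 * 7, 0, base_logits, cos_coef, sin_coef,
                                np.random.default_rng(0))
```

Because the matrix repeats every 24 steps, the sampled states should stay locked to the clock. What I do not see is how such coefficients would be estimated from data.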

My question: Could you please give some feedback on how I could proceed? I would greatly appreciate any thoughts, ideas, explanations. Switching to another model is not an option - this problem is part of a bigger project where different approaches should be compared in the end. So my priority is to come up with the best solution that one can get using HMMs.

Thank you very much in advance!!
