Reputation: 913
I have a time series dataset stored as a CSV file with the following columns:
ID,TIMESTAMP,MEASUREMENTS[10]
Each ID has multiple records, each with the timestamp at which its measurements were made. The MEASUREMENTS column contains a list of 10 measurements. The measurements in one record (associated with a particular timestamp) depend in some way on the previous record.
Example of the dataset:
ID,TIMESTAMP,MEASUREMENTS
1,0,[123,456,567.....]
1,100,[....]
1,350,[....]
2,0,[....]
2,200,[.....]
Also, the measurement array contains NaNs at some indexes. Finally, each ID has a label, which is the outcome of the measurements performed up to the very last timestamp for that ID. My objective is to fit an HMM to this data and then predict the label for a test dataset in the same format. How do I fit this data to an HMM from sklearn/hmmlearn? The sklearn documentation for the model is not very helpful: none of the parameters are explained.
Upvotes: 4
Views: 5476
Reputation: 1305
Since your problem requires predicting a label for a sequence, you should use seqlearn, which is a sequence classification tool.
Also, fitting the data with an HMM would require some pre-processing, since it accepts a list of arrays. You could concatenate the timestamp and the measurements of the three records associated with each ID, in ascending order of time. With three records per ID, this would give you a sequence of length 33 for each ID (3 records × (1 timestamp + 10 measurements)).
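A minimal sketch of that pre-processing, using only the standard library. It assumes each line looks like `ID,TIMESTAMP,[m1,m2,...]` with missing values written as `nan`; the sample data, the helper names `parse_line` and `build_sequences`, and the forward-fill treatment of NaNs are all illustrative assumptions, not something taken from your data. Instead of flattening each ID into one long vector, it keeps one row per record (timestamp followed by the measurements) and records how many rows belong to each ID.

```python
import io
import math
from collections import defaultdict

# Hypothetical sample in the question's layout (3 measurements instead of 10, for brevity).
SAMPLE = """ID,TIMESTAMP,MEASUREMENTS
1,100,[4.0,nan,6.0]
1,0,[1.0,2.0,3.0]
2,0,[7.0,8.0,nan]
2,200,[nan,9.0,1.5]
"""


def parse_line(line):
    """Split 'ID,TIMESTAMP,[m1,m2,...]' into (id, timestamp, list of floats)."""
    # Split only on the first two commas; the rest is the bracketed list.
    id_str, ts_str, rest = line.strip().split(",", 2)
    values = [float(v) for v in rest.strip("[]").split(",")]
    return int(id_str), float(ts_str), values


def build_sequences(text):
    """Return (ids, X, lengths): rows of [timestamp, m1, ...] grouped by ID."""
    per_id = defaultdict(list)
    lines = io.StringIO(text)
    next(lines)  # skip the header row
    for line in lines:
        if line.strip():
            seq_id, ts, values = parse_line(line)
            per_id[seq_id].append((ts, values))

    ids, X, lengths = [], [], []
    for seq_id in sorted(per_id):
        # Order each ID's records by ascending timestamp.
        records = sorted(per_id[seq_id], key=lambda r: r[0])
        last = None
        for ts, values in records:
            # Assumption: replace NaN with the value from the previous record
            # of the same ID, or 0.0 if there is no previous record.
            filled = []
            for i, v in enumerate(values):
                if math.isnan(v):
                    v = last[i] if last is not None else 0.0
                filled.append(v)
            last = filled
            X.append([ts] + filled)
        ids.append(seq_id)
        lengths.append(len(records))
    return ids, X, lengths


ids, X, lengths = build_sequences(SAMPLE)
```

Here `X` holds all records stacked in ID order and `lengths` says how many consecutive rows belong to each ID. After converting `X` to a NumPy array, that pair is the layout hmmlearn's `fit(X, lengths)` takes for multiple sequences. How to fill the NaNs is a modelling choice, so treat the forward-fill here as a placeholder.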
Let me know if you require further help. I recently used HMMLearn for a project.
Upvotes: 2