Reputation: 1304
I have a classification problem with 6 classes. The input feature is the ECG signal. Here are the labels of the dataset:
anger, calmness, disgust, fear, happiness, sadness
Here is what the dataset looks like:
ecg 0 1 2 3 4 5
0 [[0.1912, 0.3597, 0.3597, 0.3597, 0.3597, 0.35... 1 0 0 0 0 0
1 [[0.2179, 0.4172, 0.4172, 0.4172, 0.4172, 0.41... 1 0 0 0 0 0
2 [[0.1986, 0.3537, 0.3537, 0.3537, 0.3537, 0.35... 0 1 0 0 0 0
3 [[0.2808, 0.5145, 0.5145, 0.5145, 0.5145, 0.51... 0 1 0 0 0 0
4 [[0.1758, 0.2977, 0.2977, 0.2977, 0.2977, 0.29... 0 0 1 0 0 0
5 [[0.2183, 0.396, 0.396, 0.396, 0.396, 0.396, 0... 0 0 1 0 0 0
6 [[0.204, 0.3869, 0.3869, 0.3869, 0.3869, 0.386... 0 0 0 1 0 0
7 [[0.1695, 0.2823, 0.2823, 0.2823, 0.2823, 0.28... 0 0 0 1 0 0
8 [[0.2005, 0.3575, 0.3575, 0.3575, 0.3575, 0.35... 0 0 0 0 1 0
9 [[0.1969, 0.344, 0.344, 0.344, 0.344, 0.344, 0... 0 0 0 0 1 0
10 [[0.2312, 0.4141, 0.4141, 0.4141, 0.4141, 0.41... 0 0 0 0 0 1
11 [[0.1862, 0.3084, 0.3084, 0.3084, 0.3084, 0.30... 0 0 0 0 0 1
12 [[0.2605, 0.47, 0.47, 0.47, 0.47, 0.47, 0.3814... 1 0 0 0 0 0
13 [[0.2154, 0.3733, 0.3733, 0.3733, 0.3733, 0.37... 1 0 0 0 0 0
...
As you can see, I have one-hot encoded my labels.
The problem is that, whatever I try, the accuracy never goes above 0.2, and it repeats itself on every epoch. The reason I call this dataset "non-weighted" is that there is the same number of instances for each class label. For example, if there are 60 rows of data labeled "anger", then there are also 60 "calmness", 60 "disgust", and so on. I thought this might be causing the model to always predict the same class, which is why the accuracy does not change.
Is there any way to solve this problem? Thanks in advance.
Edit: I have tried converting this classification problem into binary classification. I simply eliminated all of the labels and converted them into a single label: angry or not angry. In my Keras model, I only changed the loss function from "categorical_crossentropy" to "binary_crossentropy". After that, the accuracy of the model changed dramatically and I got above 80% accuracy. I don't know what to make of this result, but somehow, when there are more than 2 classes in my dataset and it is not a binary classification problem, the accuracy stays below 20% and repeats itself on every epoch.
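For what it's worth, an accuracy stuck just under 0.2 is consistent with a model that has collapsed to predicting a single class: on a balanced 6-class dataset, a constant predictor scores exactly 1/6 ≈ 0.167. A minimal NumPy sketch of that arithmetic (the class counts here mirror the 60-per-class example above):

```python
import numpy as np

# 6 balanced classes, 60 one-hot rows each, as in the dataset above
n_per_class, n_classes = 60, 6
y_true = np.eye(n_classes).repeat(n_per_class, axis=0)  # shape (360, 6)

# A collapsed model that always predicts class 0
y_pred = np.zeros_like(y_true)
y_pred[:, 0] = 1.0

acc = (y_pred.argmax(axis=1) == y_true.argmax(axis=1)).mean()
print(acc)  # 1/6 = 0.1666... -- the "stuck below 0.2" symptom

# For reference, a Keras multi-class head must match the loss:
#   model.add(Dense(6, activation='softmax'))
#   model.compile(loss='categorical_crossentropy',
#                 optimizer='adam', metrics=['accuracy'])
```

A mismatched head (e.g. a sigmoid output or the wrong number of units with `categorical_crossentropy`) is a common cause of this exact symptom, so it is worth double-checking alongside the suggestions in the answer below.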
Upvotes: 1
Views: 58
Reputation: 2621
Having a balanced dataset, i.e. one where each class has the same number of samples, is better than having an imbalanced one, so I don't think this is the issue.
If you have not shuffled your data before training, you should definitely do so.
If you have already done this, it would be better to check your dataset and your network.
For the dataset: simply plot some samples and see whether you can classify them correctly yourself. If you can't, that implies something is wrong with your data.
For the model: run a test-on-train experiment, which uses a very small number of samples, say 100, and uses the exact same training set for testing. The idea is that if your network works, it should quickly overfit this smaller dataset. If it doesn't, your network has some serious problems.
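One way to shuffle features and one-hot labels together, sketched with NumPy (the arrays here are small stand-ins for your real `X` and `y`):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Stand-ins: X is (n_samples, signal_length), y is (n_samples, 6) one-hot
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.eye(6)[np.arange(10) % 6]

perm = rng.permutation(len(X))  # one permutation applied to both arrays,
X, y = X[perm], y[perm]         # so each signal keeps its own label
```

Note that Keras' `model.fit` already shuffles each epoch by default (`shuffle=True`), but shuffling beforehand still matters if you split off a validation set yourself or use `validation_split`, which takes the last rows of the data without shuffling.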
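A quick way to eyeball one signal per class with matplotlib. This sketch builds a tiny synthetic DataFrame shaped like the one in the question (a `"ecg"` column plus one-hot columns 0-5); with the real data you would drop that construction and use your own `df`:

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt

labels = ["anger", "calmness", "disgust", "fear", "happiness", "sadness"]

# Tiny synthetic stand-in for the real DataFrame: one signal per class
df = pd.DataFrame({
    "ecg": [np.sin(np.linspace(0, 4 * np.pi, 100)) + 0.1 * i for i in range(6)],
    **{i: np.eye(6, dtype=int)[i] for i in range(6)},
})

fig, axes = plt.subplots(2, 3, figsize=(10, 5), sharex=True)
for i, ax in enumerate(axes.flat):
    sample = df[df[i] == 1].iloc[0]   # first row belonging to class i
    ax.plot(np.ravel(sample["ecg"]))
    ax.set_title(labels[i])
fig.tight_layout()
fig.savefig("class_samples.png")
```

If the plots for different emotions are visually indistinguishable even to you, the features may simply not carry enough signal for the 6-way task.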
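The test-on-train experiment can be sketched like this, here with scikit-learn's `MLPClassifier` on 100 synthetic samples standing in for the real ECG features. A network that is capable of learning at all should reach near-perfect accuracy when scored on the very data it was trained on:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# 100 synthetic samples: 6 well-separated clusters standing in for the classes
n, n_classes, dim = 100, 6, 20
y = np.arange(n) % n_classes
centers = 5.0 * rng.normal(size=(n_classes, dim))
X = centers[y] + rng.normal(scale=0.1, size=(n, dim))

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
clf.fit(X, y)

# Score on the *same* data the model was trained on
train_acc = clf.score(X, y)
print(train_acc)  # should be close to 1.0 if the model can learn
```

If training accuracy stays near 1/6 even on such a tiny set, the problem is in the model or training setup (loss/activation mismatch, learning rate, input shape), not in the amount of data.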
Some other quick tips:
Upvotes: 1