Reputation: 601
These oversimplified example target vectors (in my use case each 1
represents a product that a client bought at least once a month)
[1,1,1,0,0,0,0,0,1,0,0,0]
[1,1,1,0,0,0,0,0,0,0,0,0]
[0,1,0,0,0,0,1,0,0,0,0,0]
[1,0,1,0,0,0,0,0,0,0,0,0]
[1,1,1,0,0,0,0,0,1,0,0,0]
[1,1,0,0,0,0,0,0,0,0,0,0]
[1,1,0,0,0,1,0,0,0,0,1,0]
contain labels that are far sparser than others. That is, the target vectors contain a few products that are almost always bought and many that are seldom bought.
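To make the imbalance concrete, here is a quick NumPy sketch (just for illustration) that computes how often each label is positive in the example vectors above:

```python
import numpy as np

# The example target vectors from above; each column is one product label.
Y = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0],
])

# Fraction of samples in which each label is positive.
print(Y.mean(axis=0))
# The first labels are positive in most samples; most of the others almost never are.
```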
During training, the ANN (for activation, the input layer uses sigmoid and the output layer uses sigmoid as well; the loss function is binary_crossentropy; exactly which features are used to predict the target vector is not really relevant here, I think) only learns that putting 1 in the first 3 labels and 0 for the rest is good. I want the model not to learn this pattern, obviously. As a side note, I am more interested in true positives on the sparse labels than on the frequent labels. How should I handle this issue?
My only idea so far would be to exclude the frequent labels from the target vectors entirely, but that would be a last resort.
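For reference, the setup described above looks roughly like this (a minimal sketch; the layer sizes and the feature count are placeholders, since the exact features don't matter here):

```python
from tensorflow import keras
from tensorflow.keras import layers

n_features = 20   # placeholder; the real feature count doesn't matter here
n_labels = 12     # one output per product

# Sigmoid activations throughout, binary_crossentropy loss, as described above.
model = keras.Sequential([
    layers.Dense(32, activation="sigmoid", input_shape=(n_features,)),
    layers.Dense(n_labels, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```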
Upvotes: 0
Views: 139
Reputation: 43
There are two things I would try in this situation:
But overall, I think regularization would be more effective.
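For example, adding regularization to the kind of model in the question could look roughly like this (a sketch only; the layer sizes, L2 strength, and dropout rate are untuned placeholders):

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

n_features = 20   # placeholder values, matching the sketch in the question
n_labels = 12

# Same architecture as in the question, with L2 weight penalties and dropout
# added as regularization.
model = keras.Sequential([
    layers.Dense(32, activation="sigmoid", input_shape=(n_features,),
                 kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.3),   # dropout as additional regularization
    layers.Dense(n_labels, activation="sigmoid",
                 kernel_regularizer=regularizers.l2(1e-4)),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```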
Upvotes: 1