Reputation: 14561
I am currently working with a dataset in which the boundaries between classes are not well defined. I don't want to use regular (single-label) classification, since the nuanced overlap between these classes might not be captured by that setup.
I've seen a similar setup in PyTorch where the Binary Cross Entropy loss function was used, but other than that, I am not sure what needs to be done to reformulate my problem from classification to multi-label classification in Flux. From the Flux.jl docs, it looks like I may want to use a custom Split label?
Upvotes: 2
Views: 315
Reputation: 6423
This question made me think about how to implement multi-label classification in BetaML, my own ML library, and it turned out to be relatively easy:
(EDIT: Model simplified by just using a couple of DenseLayers, with the second layer's activation function f=x -> (tanh(x) + 1)/2)
using BetaML
# Creating test data..
X = rand(2000,2)
# note that the Y are 0.0/1.0 floats
Y = hcat(round.(tanh.(0.5 .* X[:,1] + 0.8 .* X[:,2])),
round.(tanh.(0.5 .* X[:,1] + 0.3 .* X[:,2])),
round.(tanh.(max.(0.0,-3 .* X[:,1].^2 + 2 * X[:,1] + 0.5 .* X[:,2]))))
# Creating the NN model...
l1 = DenseLayer(2,10,f = relu)
l2 = DenseLayer(10,3,f = x -> (tanh(x) + 1)/2)
mynn = buildNetwork([l1,l2],squaredCost,name="Multinomial multilabel regression Model")
# Training the model...
train!(mynn,X,Y,epochs=100,batchSize=8)
# Predictions...
ŷ = round.(predict(mynn,X))
(nrec,ncat) = size(Y)
# Just a basic element-wise accuracy measure. I could think of extending the ConfusionMatrix measures to multi-label classification if needed..
overallAccuracy = sum(ŷ .== Y)/(nrec*ncat) # 0.999
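The element-wise accuracy above pools all labels together. As a rough sketch of two other common multi-label measures, per-label accuracy and exact-match accuracy, something like the following could be used. The function names `perLabelAccuracy` and `exactMatchAccuracy` and the small demo matrices are hypothetical, introduced only for illustration; they are not part of BetaML.

```julia
using Statistics  # standard library, provides `mean`

# Both arguments are (records × labels) matrices of 0.0/1.0 values.

# Share of correct predictions for each label (one value per column).
perLabelAccuracy(ŷ, Y) = vec(mean(ŷ .== Y, dims=1))

# Share of records for which every label is predicted correctly.
exactMatchAccuracy(ŷ, Y) = mean(all(ŷ .== Y, dims=2))

# Hypothetical demo data: 4 records, 3 labels.
Ydemo = [1.0 0.0 1.0;
         0.0 0.0 1.0;
         1.0 1.0 0.0;
         0.0 1.0 0.0]
ŷdemo = [1.0 0.0 1.0;
         0.0 1.0 1.0;   # second label wrong
         1.0 1.0 0.0;
         0.0 1.0 1.0]   # third label wrong

println(perLabelAccuracy(ŷdemo, Ydemo))    # [1.0, 0.75, 0.75]
println(exactMatchAccuracy(ŷdemo, Ydemo))  # 0.5
```

Exact-match accuracy is the stricter of the two, since a single wrong label makes the whole record count as an error.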
I initially thought of using softmax with a learnable beta parameter, but then I realised that this approach cannot work: softmax outputs always sum to one, so how would the model be able to distinguish between Y = [0 0 0] and Y = [1 1 1]? So I ended up with a layer with an adjusted tanh function that guarantees an output in the [0,1] range for each label "independently", and I set the threshold at 0.5, the value that maximises the loss (in BetaML the output is already a vector if the last layer has more than a single node).
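Since the question asks about Flux, here is a minimal sketch of how the same idea might look there. It is not taken from the Flux docs and I have not benchmarked it. The adjusted tanh, (tanh(x) + 1)/2, equals the logistic sigmoid evaluated at 2x, so the usual Flux counterpart would be a sigmoid on each output combined with binary cross-entropy, as in the PyTorch setup mentioned in the question. The sketch assumes a recent Flux version (one that provides `Flux.setup` and the `Dense(in => out)` syntax); the toy data and labelling rules are made up for illustration.

```julia
using Flux

# Flux expects data as (features × samples), the transpose of the BetaML layout.
X = rand(Float32, 2, 2000)

# Three independent 0/1 labels per sample (hypothetical rules, illustration only).
Y = Float32.(vcat((X[1:1, :] .+ X[2:2, :]) .> 1,
                  X[1:1, :] .> 0.5,
                  X[2:2, :] .> 0.5))

# The last layer emits one raw score (logit) per label; no softmax is applied.
model = Chain(Dense(2 => 10, relu), Dense(10 => 3))

# Binary cross-entropy on each label independently. The "logit" variant
# applies the sigmoid internally, which is numerically more stable.
loss(m, x, y) = Flux.logitbinarycrossentropy(m(x), y)

opt_state = Flux.setup(Flux.Adam(0.01), model)
for epoch in 1:200
    grads = Flux.gradient(m -> loss(m, X, Y), model)
    Flux.update!(opt_state, model, grads[1])
end

# Per-label probabilities, thresholded at 0.5 to obtain 0/1 predictions.
probs = Flux.sigmoid.(model(X))
ŷ = probs .> 0.5
```

Because each output passes through its own sigmoid, a sample can have none, some, or all of the labels switched on, which is what softmax cannot express.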
Upvotes: 2