faheem paracha
faheem paracha

Reputation: 7

My Image Captioning Model giving me same caption on All images

I am doing a project related to the captioning of Medical Images. I am using the Code from this link.

I am using Indiana Data set of Radiographs and using Findings as Captions for training. I trained successfully and loss value is 0.75. But my final model giving me the same caption for all the images I had checked (Some people also facing the same issue. please check the Comments of this link).

Can you please suggest me any changes in any part of the code or anything else so it starts giving me proper captions for every image I will check.

Upvotes: -1

Views: 1038

Answers (2)

Tdoggo
Tdoggo

Reputation: 419

Make sure you have an equal number of images in each class. If you have 1,000 images in the “pneumonia” category and only 5 in the “broken rib” category, your model will pick the label “pneumonia” almost every time.

Upvotes: 0

Shaunak Sen
Shaunak Sen

Reputation: 588

Looking at the dataset, what I can see is that most of the data is quite similar (black and white images of Chest X rays) - please correct me if I am wrong. So what seems to be happening is that the CNN is learning common features across most of the images. The network is not just deep/advanced enough to pick out the distinguishing patterns. According to the tutorial you are following, I don't think the VGG-16 or 19 network is learning the distinguishing patterns in the images.

The image captioning model will only be as good as the underlying CNN network. If you have a class label field in your data (like the indication/impression field provided here), you can actually confirm this hypothesis by training the network to predict the class of each image and if the performance is poor you can confirm this. If you have the class label, try experimenting with a bunch of CNNs and use the one which achieves the best classification accuracy as the feature extractor.

If you do not have a class label, I would suggest trying out some deeper CNN architectures like Inception or ResNet and see if the performance improves. Hope this was helpful!

Upvotes: 1

Related Questions