user570593

Reputation: 3510

bag of words for classification - features vs pixels

I am classifying medical images using a bag-of-words (BOW) model. I tried two ways of extracting the feature vector:

  1. extract features from small image patches and then apply BOW on those features
  2. extract pixel values from small image patches then apply BOW on those pixel values
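
A minimal sketch of the two variants above, for illustration only. It assumes grayscale images stored as nested lists, and it uses a hypothetical `patch_stats` descriptor (mean and standard deviation) as a stand-in for the real features, which are not specified here. The small k-means is written out by hand so the example has no dependencies; a library implementation would normally be used instead.

```python
import random

def extract_patches(image, size, step):
    """Return flattened size x size patches from a 2-D list of pixel values."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - size + 1, step):
        for c in range(0, cols - size + 1, step):
            patches.append([image[r + i][c + j]
                            for i in range(size) for j in range(size)])
    return patches

def patch_stats(patch):
    """Hypothetical hand-crafted descriptor: mean and standard deviation."""
    mean = sum(patch) / len(patch)
    var = sum((p - mean) ** 2 for p in patch) / len(patch)
    return [mean, var ** 0.5]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centers):
    """Index of the cluster center closest to vec."""
    return min(range(len(centers)), key=lambda k: sq_dist(vec, centers[k]))

def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns k cluster centers (the visual words)."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centers)].append(v)
        for idx, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous center
                centers[idx] = [sum(col) / len(group) for col in zip(*group)]
    return centers

def bow_histogram(descriptors, centers):
    """Normalized count of how many descriptors fall on each visual word."""
    counts = [0] * len(centers)
    for d in descriptors:
        counts[nearest(d, centers)] += 1
    total = sum(counts)
    return [c / total for c in counts]

if __name__ == "__main__":
    rng = random.Random(1)
    # Four random 16x16 "images" stand in for real data.
    images = [[[rng.randint(0, 255) for _ in range(16)] for _ in range(16)]
              for _ in range(4)]
    # Variant 2: raw pixel values of each patch are the descriptors.
    pixel_desc = [extract_patches(img, 4, 4) for img in images]
    # Variant 1: a feature vector is computed from each patch first.
    feat_desc = [[patch_stats(p) for p in patches] for patches in pixel_desc]
    pixel_vocab = kmeans([d for img in pixel_desc for d in img], k=5)
    feat_vocab = kmeans([d for img in feat_desc for d in img], k=5)
    pixel_bow = [bow_histogram(d, pixel_vocab) for d in pixel_desc]
    feat_bow = [bow_histogram(d, feat_vocab) for d in feat_desc]
    print(pixel_bow[0], feat_bow[0])
```

In both variants the classifier sees one histogram per image; the only difference is what each patch is turned into before clustering.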

After feature extraction I tried PCA, feature selection, changing the number of clusters for KMeans, etc. to improve the accuracy. But in my case the BOW learned on pixel values (2) reaches 90% accuracy and outperforms the BOW learned on features (1), which reaches 70%. My features are good: when I use them to classify the images in another framework, I get more than 95% accuracy.

My question is: why does the BOW learned on pixels perform better than the BOW learned on features?


Normal-abnormal colonoscopy image classification

    Figure 1: a normal colon image
    Figure 2: an image with polyp

Upvotes: 2

Views: 1918

Answers (1)

user1149913

Reputation: 4523

My understanding of your two methods for extracting features from an image patch is:

Feature selection = "run PCA, k-means, or select some subset of pixels, and construct a vector of these extracted values"

Pixel Values = "create a vector from RGB values of the image"

In fact, to get good results from BOW features, people often derive individual features using relatively complicated algorithms.

In the project at http://vision.stanford.edu/projects/totalscene/index.html (paper in reference #1), the authors take BOW features from both image blocks and a segmentation. For the image blocks, they extract SIFT features, and for each segment they take shape, color, location, and texture features (see section 2.1 and follow the reference for a better description of the features they use).

In "Decomposing a Scene into Geometric and Semantically Consistent Regions" (Gould et al.), shape, color, edge, and other features are derived by doing things like training boosted logistic regression classifiers, Potts models, and Gaussian mixture models.

You probably don't need such intensive techniques to extract features that beat pixel vectors, but you should definitely browse around the literature to see what is effective.

SIFT features, color histograms, and filters to extract texture responses seem to work pretty well and also have a reasonable amount of software library support.
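
As a rough illustration of what two of these descriptors compute, here is a dependency-free sketch. The function names, the 4-bins-per-channel choice, and the use of simple pixel differences as a texture response are assumptions made for the example; they are simplified stand-ins, not the descriptors used in the papers above, and in practice a vision library's implementations would be the more likely choice.

```python
def color_histogram(patch, bins=4):
    """Joint RGB histogram of a patch given as (r, g, b) tuples in 0..255."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in patch:
        # Map each channel value to a bin index in 0..bins-1.
        rb, gb, bb = r * bins // 256, g * bins // 256, b * bins // 256
        hist[rb * bins * bins + gb * bins + bb] += 1
    n = len(patch)
    return [h / n for h in hist]

def gradient_energy(gray):
    """Mean absolute horizontal and vertical differences of a 2-D gray patch.

    A very crude texture response: flat regions give values near zero,
    strongly textured regions give large values.
    """
    rows, cols = len(gray), len(gray[0])
    dx = [abs(gray[r][c + 1] - gray[r][c])
          for r in range(rows) for c in range(cols - 1)]
    dy = [abs(gray[r + 1][c] - gray[r][c])
          for r in range(rows - 1) for c in range(cols)]
    return [sum(dx) / len(dx), sum(dy) / len(dy)]

def describe_patch(rgb_patch, gray_patch):
    """Concatenate the color and texture responses into one descriptor."""
    return color_histogram(rgb_patch) + gradient_energy(gray_patch)
```

Each patch then yields one fixed-length vector (64 color bins plus 2 texture values with the defaults), which can be clustered into visual words in the same way as raw pixel vectors.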

Upvotes: 3
