
Tags: algorithm, machine-learning, neural-network, feature-extraction

Reputation: 2441

Feature Vector Representation Neural Networks

Objective: Digit recognition by using Neural Networks

Description: Images are normalized to 8 x 13 pixels. In each row, every black pixel is represented by 1 and every white pixel by 0. Every image is thus represented by a vector of vectors, with one 8-bit vector per row.

Problem: Is it possible to use a vector of vectors as input to a neural network? If not, how should the image be represented?

1) Combine the rows into one vector?
2) Convert every row to its decimal value? Example: Row 1: 11111000 = 248, etc.
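To make the two candidate representations concrete, here is a minimal sketch in plain Python. Only the first row (11111000) comes from the question; the rest of the image and the helper names `flatten` and `rows_to_decimal` are made up for illustration.

```python
# Hypothetical 13-row x 8-column binary image (1 = black, 0 = white).
# Only the first row is taken from the question; the others are invented.
ROWS = [
    "11111000",
    "10000000",
    "10000000",
    "10000000",
    "11110000",
    "00001000",
    "00001000",
    "00001000",
    "00001000",
    "00001000",
    "10001000",
    "01110000",
    "00000000",
]
image = [[int(ch) for ch in row] for row in ROWS]


def flatten(image):
    """Option 1: concatenate all rows into one vector (one value per pixel)."""
    return [pixel for row in image for pixel in row]


def rows_to_decimal(image):
    """Option 2: read each row as a binary number (one value per row)."""
    return [int("".join(str(p) for p in row), 2) for row in image]


flat = flatten(image)
decimals = rows_to_decimal(image)
print(len(flat))     # 104 values, one per pixel
print(decimals[0])   # 248, the example from the question
```

Option 1 gives the network 104 binary inputs, option 2 gives it 13 integer inputs in the range 0 to 255.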

Upvotes: 2

Answers (3)

Birol Kuyumcu

Reputation: 1213

1) Yes, combining the rows into one vector is suitable. I use this approach myself: http://vimeo.com/52775200

2) No, this is not suitable: after normalizing from the range 0-255 to the range 0-1, different rows give approximately the same values, so information is lost.
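A small sketch of the problem with the decimal encoding, assuming each row is scaled by dividing by 255. The helper names and the comparison rows are hypothetical; only 11111000 comes from the question.

```python
def row_to_unit_range(bits):
    """Read a bit string as a binary number and scale it from 0-255 to 0-1."""
    return int(bits, 2) / 255


def hamming(a, b):
    """Number of pixel positions in which two rows differ."""
    return sum(x != y for x, y in zip(a, b))


base = "11111000"            # 248, the row from the question
right_flip = "11111001"      # 249, only the rightmost pixel differs
left_flip = "01111000"       # 120, only the leftmost pixel differs
very_different = "11110111"  # 247, four pixels differ

# Print: row, number of differing pixels, gap between the scaled values.
for other in (right_flip, left_flip, very_different):
    gap = abs(row_to_unit_range(base) - row_to_unit_range(other))
    print(other, hamming(base, other), round(gap, 4))
```

A row that differs in four pixels (247) lands just as close to 248 as a row that differs in one pixel (249), while flipping only the leftmost pixel moves the value by about half the range. The scaled number therefore says little about how similar two rows actually look.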

Upvotes: 1

runDOSrun

Reputation: 10995

To use multidimensional input, you'd need multidimensional neurons (which I suppose your formalism doesn't support). Sadly, you didn't give any information about your network structure, which I think is your main source of problems and confusion. Whenever you evaluate a feature representation, you need to know how the input layer will be structured: if that structure is impractical, you probably need a different representation.

Your multidimensional vector: a network that accepts one image as input has only one (!) input node, which holds all the row vectors. This is the worst possible representation of your data. If we:

flatten the input hierarchy: we get one input neuron for every row.
flatten the input hierarchy completely: we get one input neuron for every pixel.

Think about all three approaches and what each does to your data. The last approach is almost always as bad as the first. Neural networks work best with features. Features are not restructurings of the pixels (your row vectors). They should be META-data you can derive from the pixels: brightness, locations where the image goes from black to white, bounding boxes, edges, shapes, centers of mass, ... there are tons of things that can be chosen as features in image processing. You have to think about your problem and choose one (or more).
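As a rough illustration of this kind of feature vector, the sketch below computes a few of the quantities mentioned (pixel density, bounding box, center of mass, black/white transitions) for a binary image. The choice of features, the scaling to 0-1, the function name `extract_features` and the sample image are all assumptions made for this example, not a recommended feature set.

```python
# Hypothetical 13 x 8 binary image (1 = black, 0 = white), invented for this example.
ROWS = [
    "11111000", "10000000", "10000000", "10000000", "11110000",
    "00001000", "00001000", "00001000", "00001000", "00001000",
    "10001000", "01110000", "00000000",
]
image = [[int(ch) for ch in row] for row in ROWS]


def extract_features(image):
    """Return a small hand-picked feature vector, every entry scaled to 0-1."""
    height, width = len(image), len(image[0])
    black = [(r, c) for r in range(height) for c in range(width) if image[r][c]]
    if not black:
        return [0.0] * 8  # blank image: nothing to measure

    rows = [r for r, _ in black]
    cols = [c for _, c in black]

    # Fraction of black pixels (a crude brightness measure).
    density = len(black) / (height * width)
    # Bounding box of the black pixels.
    top, bottom = min(rows) / (height - 1), max(rows) / (height - 1)
    left, right = min(cols) / (width - 1), max(cols) / (width - 1)
    # Center of mass of the black pixels.
    center_row = sum(rows) / len(rows) / (height - 1)
    center_col = sum(cols) / len(cols) / (width - 1)
    # Share of horizontally adjacent pixel pairs that change color.
    changes = sum(row[i] != row[i + 1] for row in image for i in range(width - 1))
    transitions = changes / (height * (width - 1))

    return [density, top, bottom, left, right, center_row, center_col, transitions]


features = extract_features(image)
print(len(features))  # 8 inputs instead of 104 raw pixels
```

Each entry of `features` would feed one input neuron, so the input layer size is set by the number of features rather than by the image resolution.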

In the end, when you ask how to "combine rows into 1 vector", you're just rephrasing "finding a feature vector for the whole image". You definitely don't want to "concatenate" your vectors and feed raw data into the network; you need to extract information before you use the network. This pre-processing step is critical.

For further information on which features might be viable for OCR, read some papers on the topic. The most successful network type at the moment is the Convolutional Neural Network. A starting point for the topic of feature extraction is here.

Upvotes: 1

Bartek Banachewicz

Reputation: 39390

Combining them into one vector simply by concatenation is certainly possible. In fact, you should notice that arbitrary reordering of the data doesn't change the results, as long as it's consistent between training and classification.
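The reordering claim can be checked with a single fully connected neuron: if the inputs and their weights are permuted the same way, the weighted sum is unchanged. This is a minimal sketch with random, made-up pixels and weights; it covers fully connected layers only, since convolutional layers depend on the spatial arrangement of the pixels.

```python
import random


def neuron_output(inputs, weights, bias):
    """Weighted sum of one fully connected neuron (activation omitted)."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


random.seed(0)
n = 104  # 13 rows x 8 columns, flattened
pixels = [random.randint(0, 1) for _ in range(n)]     # made-up image
weights = [random.uniform(-1, 1) for _ in range(n)]   # made-up weights
bias = 0.1

# Apply the same arbitrary permutation to the inputs and to their weights.
order = list(range(n))
random.shuffle(order)
shuffled_pixels = [pixels[i] for i in order]
shuffled_weights = [weights[i] for i in order]

original = neuron_output(pixels, weights, bias)
reordered = neuron_output(shuffled_pixels, shuffled_weights, bias)

# Equal up to floating-point rounding from the different summation order.
print(abs(original - reordered) < 1e-9)
```

In training, the network learns the weights for whatever fixed order it is shown, which is why the order only has to stay consistent between training and classification.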

As to your second approach, I think (I am really not sure) you might lose some information that way.

Upvotes: 2
