DonRaHulk

Reputation: 635

Tesseract Extract specific information

I would like to scan this image and extract only the Name and City information from it. How can I get this information?

I am using Tesseract 3.02. Sample image.

There will be hundreds of images to process, and I need to extract only specific information (say, name and city) from each of them.

Upvotes: 0

Views: 1536

Answers (1)

Brandon A

Reputation: 8287

It depends on which Tesseract SDK you are using. I used the open source G8Tesseract iOS SDK for a project similar to what you are trying to do, so if you are using that framework, this may help. When you create your G8RecognitionOperation, set its recognitionCompleteBlock property, a block that is called with the result once recognition finishes. Inside that block, grab the recognized text and parse out the data you want. Since you know that the name comes just after "Name" and right before "Social security", I would slice off all the unwanted text before and after it and then dissect from there. Something like this:

G8RecognitionOperation *operation = [[G8RecognitionOperation alloc] initWithLanguage:@"eng"];
// Set up operation...

operation.recognitionCompleteBlock = ^(G8Tesseract *tesseract) {
    // Fetch the recognized text
    NSString *recognizedText = tesseract.recognizedText;

    NSLog(@"%@", recognizedText);

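    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];

    // GET NAME
    // Split on "Name"; index 0 is discarded because it is the text before the label
    NSArray<NSString *> *slice1 = [recognizedText componentsSeparatedByString:@"Name"];
    if (slice1.count < 2) { NSLog(@"Label \"Name\" not found"); return; }
    NSString *slice1String = slice1[1];

    // What comes before "Social" should be the name you are looking for
    NSArray<NSString *> *slice2 = [slice1String componentsSeparatedByString:@"Social"];
    if (slice2.count < 2) { NSLog(@"Label \"Social\" not found"); return; }
    NSString *name = [slice2[0] stringByTrimmingCharactersInSet:whitespace];

    // GET CITY (do the same thing here)
    // Split the rest of the result; index 1 is the text after the "City" label
    NSArray<NSString *> *slice3 = [slice2[1] componentsSeparatedByString:@"City"];
    if (slice3.count < 2) { NSLog(@"Label \"City\" not found"); return; }
    NSString *slice3String = slice3[1];

    // What comes before "State" should be the city you are looking for
    NSArray<NSString *> *slice4 = [slice3String componentsSeparatedByString:@"State"];
    NSString *city = [slice4[0] stringByTrimmingCharactersInSet:whitespace];

    NSLog(@"Applicant Name: %@ | City: %@", name, city);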

};
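
Since the same "text between two labels" step is repeated for each field, it could be factored into a small helper. The following is only a sketch: TextBetweenLabels is a hypothetical function name, not part of G8Tesseract, and it assumes the OCR output contains both labels in the expected order.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of G8Tesseract): returns the text found between
// the first occurrence of startLabel and the next occurrence of endLabel,
// or nil if either label is missing. Matching is case-insensitive.
static NSString *TextBetweenLabels(NSString *text, NSString *startLabel, NSString *endLabel) {
    NSRange start = [text rangeOfString:startLabel options:NSCaseInsensitiveSearch];
    if (start.location == NSNotFound) {
        return nil;
    }

    // Only look for the end label after the start label.
    NSUInteger from = NSMaxRange(start);
    NSRange searchRange = NSMakeRange(from, text.length - from);
    NSRange end = [text rangeOfString:endLabel
                              options:NSCaseInsensitiveSearch
                                range:searchRange];
    if (end.location == NSNotFound) {
        return nil;
    }

    NSString *value = [text substringWithRange:NSMakeRange(from, end.location - from)];

    // Strip surrounding whitespace, newlines, and colons (labels are often followed by ':').
    NSCharacterSet *junk = [NSCharacterSet characterSetWithCharactersInString:@" \t\r\n:"];
    return [value stringByTrimmingCharactersInSet:junk];
}
```

Inside the completion block, the two fields would then be read as TextBetweenLabels(recognizedText, @"Name", @"Social") and TextBetweenLabels(recognizedText, @"City", @"State"). A nil result signals that a label was not recognized, which is worth logging when processing hundreds of images.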

Upvotes: 1
