Basel

Reputation: 700

Japanese vertical text recognition with VNRecognizeTextRequest not working

I'm using the Apple OCR capabilities provided by the Vision Framework to recognize text in images. While I've had great success with horizontal text in Japanese, Korean, and Chinese, I'm encountering issues with vertical text.

Problem: When trying to recognize vertical text in these languages, the OCR returns nil.

What I've Tried:

Example images:

[two example images of vertical Japanese text]

Code Snippet:

func ocr() {
    guard let image = UIImage(named: imageName) else {
        print("Failed to load image")
        return
    }
    
    guard let cgImage = image.cgImage else {
        print("Failed to get CGImage from UIImage")
        return
    }
    
    // Request handler
    let handler = VNImageRequestHandler(cgImage: cgImage, orientation: .right, options: [:])
    
    let recognizeRequest = VNRecognizeTextRequest { (request, error) in
        if let error = error {
            print("Failed to recognize text: \(error.localizedDescription)")
            return
        }
        
        // Parse the results as text
        guard let result = request.results as? [VNRecognizedTextObservation] else {
            print("No text found")
            return
        }
        
        let stringArray = result.compactMap { result in
            result.topCandidates(1).first?.string
        }
        
        let recognizedString = stringArray.joined(separator: "\n")
        
        let singleLineText = recognizedString
            .components(separatedBy: .newlines)
            .joined(separator: " ")
        
        DispatchQueue.main.async {
            self.recognizeText = singleLineText
        }
    }
    
    recognizeRequest.recognitionLanguages = ["ja"]
    recognizeRequest.revision = VNRecognizeTextRequestRevision3
    recognizeRequest.automaticallyDetectsLanguage = true
    recognizeRequest.recognitionLevel = .accurate
    recognizeRequest.usesLanguageCorrection = false
    
    do {
        try handler.perform([recognizeRequest])
    } catch {
        print("Failed to perform text recognition: \(error.localizedDescription)")
    }
}

Upvotes: 3

Views: 811

Answers (2)

Basel

Reputation: 700

After trying Apple Vision for two weeks, I concluded that it does not support vertical text directly. I therefore looked for alternative solutions and found that the Tesseract OCR library, a well-established open-source engine (originally developed at HP and later maintained by Google), could address this issue. Specifically, the Tesseract repository provides a trained model for vertical Japanese text (jpn_vert.traineddata).

For iOS, I used the SwiftyTesseract library, which is more modern and worked well for my needs. Below are the steps I followed to get it up and running:

Steps:

  1. Install SwiftyTesseract: Add SwiftyTesseract to your project using Swift Package Manager.
  2. Import SwiftyTesseract
  3. Download jpn_vert.traineddata from here
  4. Add the trained data to your project:
  • Create a folder named tessdata.

  • Add jpn_vert.traineddata to this folder.

  • Drag the tessdata folder to your Xcode project and select Create folder references.

  • In Edit Scheme, under Run, add an Environment Variable with name TESSDATA_PREFIX and value $(PROJECT_DIR)/tessdata.
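If you manage dependencies through a Package.swift rather than Xcode's UI, the dependency declaration looks roughly like the sketch below. The repository URL, version, and target name here are assumptions; check the SwiftyTesseract README for the current coordinates.

// swift-tools-version:5.7
// Minimal Package.swift sketch -- the URL, version, and names are assumptions, not verified coordinates.
import PackageDescription

let package = Package(
    name: "VerticalOCRDemo",
    platforms: [.iOS(.v14)],
    dependencies: [
        .package(url: "https://github.com/SwiftyTesseract/SwiftyTesseract.git", from: "4.0.0")
    ],
    targets: [
        .target(
            name: "VerticalOCRDemo",
            dependencies: [
                .product(name: "SwiftyTesseract", package: "SwiftyTesseract")
            ]
        )
    ]
)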

Add this extension:

public typealias PageSegmentationMode = TessPageSegMode

public extension PageSegmentationMode {
  static let osdOnly = PSM_OSD_ONLY
  static let autoOsd = PSM_AUTO_OSD
  static let autoOnly = PSM_AUTO_ONLY
  static let auto = PSM_AUTO
  static let singleColumn = PSM_SINGLE_COLUMN
  static let singleBlockVerticalText = PSM_SINGLE_BLOCK_VERT_TEXT
  static let singleBlock = PSM_SINGLE_BLOCK
  static let singleLine = PSM_SINGLE_LINE
  static let singleWord = PSM_SINGLE_WORD
  static let circleWord = PSM_CIRCLE_WORD
  static let singleCharacter = PSM_SINGLE_CHAR
  static let sparseText = PSM_SPARSE_TEXT
  static let sparseTextOsd = PSM_SPARSE_TEXT_OSD
  static let count = PSM_COUNT
}

public extension Tesseract {
  var pageSegmentationMode: PageSegmentationMode {
    get {
      perform { tessPointer in
        TessBaseAPIGetPageSegMode(tessPointer)
      }
    }
    set {
      perform { tessPointer in
        TessBaseAPISetPageSegMode(tessPointer, newValue)
      }
    }
  }
}

Usage:

func japaneseOCR() {
    let tesseract = Tesseract(languages: [.custom("jpn_vert")])
    tesseract.pageSegmentationMode = .singleBlockVerticalText
    
    guard let image = UIImage(named: imageName) else {
        print("Failed to load image")
        return
    }
    
    guard let imageData = image.jpegData(compressionQuality: 1.0) else {
        print("Failed to load imageData")
        return
    }
    
    let result: Result<String, Tesseract.Error> = tesseract.performOCR(on: imageData)
    self.recognizeText = (try? result.get()) ?? ""
}

Result:

[screenshots of the recognized text output]

Upvotes: 3

Rethunk

Reputation: 4113

To read Japanese characters aligned vertically, there's a hackish solution. Once I have a Swift implementation I'll post it, but in the meantime the step-by-step description I provide below may be sufficient for you to make progress.

But first, for quick comparison, check the performance of Google Vision API, which you can try for free: https://cloud.google.com/vision/docs/drag-and-drop

Sample output using Google Vision API

Try uploading images of individual columns to Google Vision API to see if performance improves. Alternatively, try modifying the image to provide more whitespace between adjacent columns of characters.

Using the Vision framework, though, it's clear as of June 2024 that VNRecognizeTextRequest won't read a vertical column of characters. Here are some observations, based on your images:

VNRecognizeTextRequest works fine for 2+ characters aligned horizontally.

For example, both of the following will read:

characters rearranged horizontally

The characters also read if the order is reversed:

characters rearranged horizontally in reverse order

When a single character is presented, it will not read. (This is similar to Vision's difficulty reading individual Arabic numerals, which appears to be partly "fixed" in the latest version of VNRecognizeTextRequest).

single character

However, a character can be rendered readable by creating an image in which the character is duplicated. (This also works for Latin script.)

[image: the same character duplicated side by side]

You should be able to reproduce these results with single characters and duplicated characters using your favorite image editor. Take notes as you edit the images, because the hack I propose will follow similar steps.

In short, with a little help from some image processing algorithms that (to my knowledge) aren't available in any of Apple's libraries, we're going to chop up images with vertical columns of characters to create new images with horizontal rows of the same characters. Then Vision will read the new images just fine.

Here are the basic steps:

  1. From the image of size (width, height), create a new image of size (height, width).
  2. Identify a bounding box for each character. (Described in more detail below).
  3. Identify the number of columns, and the characters belonging to each column.
  4. Copy & paste the rectangular subimage for each character from the original image to the new image, but rather than traversing a column (in the image Y direction) you'll be pasting the characters end to end in a row (in the image X direction).
  5. Run your OCR code on the artificially generated image.

More details.

1. Create a new image. A CGImage is likely satisfactory here. It's reasonably easy to work with CGImages, but for performant image processing it can help to work with the image data at a lower level.
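As a rough illustration of step 1, a blank canvas with swapped dimensions could be created like this (makeBlankCanvas is a hypothetical helper name of mine, not an Apple API):

import CoreGraphics

// Sketch: create a blank RGBA bitmap context whose width/height are swapped
// relative to the source image. `makeBlankCanvas` is a hypothetical helper name.
func makeBlankCanvas(for source: CGImage) -> CGContext? {
    let context = CGContext(
        data: nil,
        width: source.height,      // swapped: new width = old height
        height: source.width,      // swapped: new height = old width
        bitsPerComponent: 8,
        bytesPerRow: 0,            // let Core Graphics pick the stride
        space: CGColorSpaceCreateDeviceRGB(),
        bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue
    )
    // Fill with white so pasted glyphs land on a clean background.
    context?.setFillColor(CGColor(red: 1, green: 1, blue: 1, alpha: 1))
    context?.fill(CGRect(x: 0, y: 0, width: source.height, height: source.width))
    return context
}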

2. Identify a bounding box for each character. Approach this step with a simple technique, then iterate to improve robustness. For initial implementations I would strongly recommend techniques you could implement yourself, or use image processing functions you can understand with just a little bit of study.

As a first implementation, you can try either connected-component labeling or flood fill.

https://en.wikipedia.org/wiki/Connected-component_labeling

https://en.wikipedia.org/wiki/Flood_fill

Very loosely, you can think of connected-component labeling as flood fill, but with the filled regions having numbers as labels: region 1, region 2, region 3, and so on.

Once all the dark pixels of a character are identified as belonging to a single "component" (or blob), then you can easily find the bounds of that blob. It's common to find the rectangular bounds--min x, min y, max x, max y--but one can also find the "tight" bounds as a convex hull or related shape.

https://en.wikipedia.org/wiki/Convex_hull
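To make the idea concrete, here's a toy flood-fill sketch that finds rectangular bounds for each dark blob. It operates on a hypothetical binarized grid (isInk, true = dark pixel) rather than a real image buffer, and the Blob type and names are mine, for illustration only:

// Toy sketch: flood-fill connected components of dark pixels and record their bounds.
// `isInk` is a hypothetical [[Bool]] binarized image; all names here are made up for illustration.
struct Blob { var minX: Int, minY: Int, maxX: Int, maxY: Int }

func characterBounds(isInk: [[Bool]]) -> [Blob] {
    let height = isInk.count
    let width = height > 0 ? isInk[0].count : 0
    var visited = Array(repeating: Array(repeating: false, count: width), count: height)
    var blobs: [Blob] = []

    for y in 0..<height {
        for x in 0..<width where isInk[y][x] && !visited[y][x] {
            // Flood fill one component with an explicit stack (avoids deep recursion).
            var blob = Blob(minX: x, minY: y, maxX: x, maxY: y)
            var stack = [(x, y)]
            visited[y][x] = true
            while let (cx, cy) = stack.popLast() {
                blob.minX = min(blob.minX, cx); blob.maxX = max(blob.maxX, cx)
                blob.minY = min(blob.minY, cy); blob.maxY = max(blob.maxY, cy)
                for (nx, ny) in [(cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)]
                where nx >= 0 && nx < width && ny >= 0 && ny < height
                      && isInk[ny][nx] && !visited[ny][nx] {
                    visited[ny][nx] = true
                    stack.append((nx, ny))
                }
            }
            blobs.append(blob)
        }
    }
    return blobs
}

In practice you'd also filter out tiny blobs (noise) and merge boxes that overlap heavily, which helps with the radical problem mentioned in the caveat below.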

The open-source library OpenCV has a function called findContours() that works well; it's a traditional algorithm with a satisfactory implementation. Under the covers there are two different techniques: a traditional "multipass" algorithm and a newer "single pass" algorithm. The single-pass algorithm is based on a paper by some Japanese researchers--I can dig up the paper if you'd like.

Google "opencv swift tutorial" to find instructions about integrating the OpenCV library into your project. Make sure you branch your code properly before trying this!

My own implementation of the single pass algorithm in Swift is kinda usable, but it's meant for prototyping tests rather than production use.

To be clear: I'm not saying this is the "right" solution to segmenting characters from the background, but it's a fairly simple technique that you can implement right now. There are better techniques that take much more explaining.

Caveat: I'm more familiar with Chinese characters, but with Japanese characters I would also expect that some radicals could present a problem. If two radicals belong together, but aren't connected by a common stroke--or if a stroke has faded a lot--then you'll have two bounding boxes.

https://laits.utexas.edu/japanese/joshu/kanji/kanji_radicals/radicals2.html

For now I hope it's clear enough that if you can identify the rectangle of pixels in which a character is found, then you can copy that rectangle of pixels from the original image to the new image with horizontally aligned characters.

3. Identify the columns of characters. It'd be handy to just prompt the user to select vertical OCR or horizontal OCR. Simple!

If there is sufficient separation between columns, then it's not too hard to identify characters that belong to the same column, even if that column is slightly angled relative to the vertical (Y) axis of the image.

If the vertical separation between bounding boxes for characters in the same column isn't much smaller than the horizontal separation between columns, then you may have some difficulty robustly determining whether the characters are arranged in rows or in columns.

If your code determines that the vertical and horizontal spacing between characters is similar, making it hard to differentiate between vertical and horizontal alignment, then you could try running OCR on both orientations and picking a "winner." Voting schemes were a fairly common method of improving OCR read accuracy, and may still be in use in some libraries.
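Here's a rough sketch of that grouping step, reusing the hypothetical Blob type from the earlier sketch. The center-gap threshold is arbitrary and would need tuning for real images:

// Sketch: group character boxes into vertical columns by horizontal center,
// then sort each column top to bottom. The gap threshold is arbitrary.
func groupIntoColumns(_ blobs: [Blob], maxCenterGap: Int = 20) -> [[Blob]] {
    // Sort by horizontal center so boxes in the same column end up adjacent.
    let sorted = blobs.sorted { ($0.minX + $0.maxX) < ($1.minX + $1.maxX) }
    var columns: [[Blob]] = []

    for blob in sorted {
        let center = (blob.minX + blob.maxX) / 2
        if let previous = columns.last?.last,
           abs(center - (previous.minX + previous.maxX) / 2) <= maxCenterGap {
            columns[columns.count - 1].append(blob)   // same column
        } else {
            columns.append([blob])                    // start a new column
        }
    }
    // Within a column, characters read top to bottom. Note that vertical Japanese
    // is read right to left across columns, so you may want to reverse the column order.
    return columns.map { $0.sorted { $0.minY < $1.minY } }
}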

4. Copy & paste the character pixels to the new image. For CGImage this is straightforward enough.

If you're working with OpenCV, meaning that you're using the cv::Mat type to represent image data, the technique for defining a subimage is straightforward, but a bit different.
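For the Core Graphics route, a sketch of the copy-and-paste step might look like the following, reusing the hypothetical Blob type and the swapped-dimension canvas from step 1. Watch the coordinate systems: a top-left pixel grid and CGContext's bottom-left origin don't match, so real code needs an extra flip or offset.

import CoreGraphics

// Sketch: crop each character box out of the source image and draw the crops
// left to right into the destination context. All helper names are hypothetical.
func pasteRow(of column: [Blob], from source: CGImage, into context: CGContext, padding: Int = 8) {
    var cursorX = padding
    for blob in column {
        let charRect = CGRect(x: blob.minX, y: blob.minY,
                              width: blob.maxX - blob.minX + 1,
                              height: blob.maxY - blob.minY + 1)
        guard let glyph = source.cropping(to: charRect) else { continue }
        let destination = CGRect(x: cursorX, y: padding, width: glyph.width, height: glyph.height)
        context.draw(glyph, in: destination)
        cursorX += glyph.width + padding
    }
}

Calling context.makeImage() afterwards gives you a CGImage you can hand to VNImageRequestHandler.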

5. Run OCR on the image of horizontally aligned characters. Using your code, OCR worked fine when I created rows of characters using an image editor to manipulate your original images of characters arranged in columns.

Other code: As I mentioned above, VNRecognizeTextRequestRevision3 appears not to allow reading of individual Japanese characters, at least not with the parameter settings I tested. So if your code finds the bounding box for a single character, you could create an image with two copies of that character, confirm that the doubled character reads, then use the OCR result for just one character.
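A sketch of that doubling trick, again with Core Graphics (the helper name and padding are mine):

import CoreGraphics

// Sketch: draw the same glyph twice, side by side, on a white background.
// Run OCR on the result and keep only one character of the recognized string.
func doubledGlyphImage(from glyph: CGImage, padding: Int = 8) -> CGImage? {
    let width = glyph.width * 2 + padding * 3
    let height = glyph.height + padding * 2
    guard let context = CGContext(
        data: nil, width: width, height: height,
        bitsPerComponent: 8, bytesPerRow: 0,
        space: CGColorSpaceCreateDeviceRGB(),
        bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue
    ) else { return nil }
    context.setFillColor(CGColor(red: 1, green: 1, blue: 1, alpha: 1))
    context.fill(CGRect(x: 0, y: 0, width: width, height: height))
    context.draw(glyph, in: CGRect(x: padding, y: padding, width: glyph.width, height: glyph.height))
    context.draw(glyph, in: CGRect(x: glyph.width + padding * 2, y: padding, width: glyph.width, height: glyph.height))
    return context.makeImage()
}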

Nowadays, OCR libraries that can read whole pages of text will include robustness checks and autocorrection to improve accuracy of individual words (or, I assume, Japanese characters) based on the context in which the character is found.


As I have time I'll edit this post to provide more implementation details. I didn't want to hold you up in case the description above was sufficient for you. Sorry, I'm super busy this week.

I would expect that some pretrained model or OCR library handles vertically aligned Japanese characters well. If you read textbooks about OCR, you'll find that a lot of Japanese and Chinese researchers are cited. Perhaps those papers will lead you to a third-party library that does a great job of on-device OCR for Japanese; you may need someone who knows Japanese to help with the install instructions, in case the academic paper is in English but the install instructions are written only in Japanese.

But we must make certain concessions to the brevity of human life, and I think you and I would rather have a hack now that we could conceivably replace later. Hurray, technical debt!

Upvotes: 1
