Ernest Vandenbulcke

Reputation: 39

I want to read the numbers on a sudoku board using OCR with Python

I am trying to make a sudoku-solving app, and I need to read the sudoku board from an image. I used the Pillow and cv2 libraries to divide my sudoku image into small images, one per cell, and then used the pytesseract library to read the number in each cell. I also converted the cells to grayscale to make the numbers easier to read. I did everything I could think of, but pytesseract still has trouble reading all the numbers correctly.
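For context, a splitting step of this kind can be sketched roughly as below. This is a minimal sketch, not the exact code used here: it assumes the cropped board fills the whole image with nine equal cells per side, and the helper names `cell_boxes` and `split_board` and the file name `cropped_board.png` are hypothetical. The output names follow the `cell_{row}_{col}.png` pattern that the OCR script reads.

```python
import os


def cell_boxes(width, height, n=9):
    """Return (row, col, box) for every cell of an n x n grid.

    Each box is (left, upper, right, lower) in pixels, the format
    expected by Pillow's Image.crop.
    """
    boxes = []
    for row in range(n):
        for col in range(n):
            left = col * width // n
            upper = row * height // n
            right = (col + 1) * width // n
            lower = (row + 1) * height // n
            boxes.append((row, col, (left, upper, right, lower)))
    return boxes


def split_board(board_path, out_folder="sudoku_cells"):
    """Crop every cell of the board image and save it as cell_{row}_{col}.png."""
    from PIL import Image  # imported here so cell_boxes works without Pillow

    os.makedirs(out_folder, exist_ok=True)
    with Image.open(board_path) as board:
        width, height = board.size
        for row, col, box in cell_boxes(width, height):
            out_path = os.path.join(out_folder, f"cell_{row}_{col}.png")
            board.crop(box).save(out_path)


# Example call (hypothetical file name):
# split_board("cropped_board.png")
```

Computing each edge as `index * size // n` spreads leftover pixels across the grid when the image size is not a multiple of 9, so the last row and column always end exactly at the image border. Note that cells cut this way include part of the grid lines at their edges.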

Here are the images I use:

original image

cropped image

divided cell image example

processed cell image

Here is the code that uses pytesseract to read the number from each cell image:

import pytesseract
from PIL import Image
import cv2
import numpy as np
import os

def preprocess_image(image_path):
    """
    Preprocess the image for better OCR accuracy.
    - Loads the image as grayscale
    - Applies a Gaussian blur to reduce noise
    - Applies adaptive thresholding to binarize
    - Resizing is currently disabled (see the commented-out line below)
    """
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # Apply Gaussian Blur to reduce noise
    img = cv2.GaussianBlur(img, (5, 5), 0)

    # Apply adaptive thresholding for binarization
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                cv2.THRESH_BINARY, 11, 2)

    # Resize image to make digits clearer
    #img = cv2.resize(img, (100, 100), interpolation=cv2.INTER_CUBIC)

    return img

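# Folder where Sudoku cell images are stored
cell_folder = "sudoku_cells"
sudoku_grid = []

# cv2.imwrite fails silently if the target folder does not exist
os.makedirs("processed_cells", exist_ok=True)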

for row in range(9):
    sudoku_row = []
    for col in range(9):
        cell_path = os.path.join(cell_folder, f"cell_{row}_{col}.png")

        # Preprocess the cell image
        processed_img = preprocess_image(cell_path)

        # Save processed image (optional, for debugging)
        cv2.imwrite(f"processed_cells/cell_{row}_{col}.png", processed_img)

        # Perform OCR on processed image
        ocr_result = pytesseract.image_to_string(
            processed_img,
            config="--psm 10 -c tessedit_char_whitelist=0123456789",
        ).strip()

        # Convert OCR output to integer (default to 0 if empty)
        digit = int(ocr_result) if ocr_result.isdigit() else 0
        sudoku_row.append(digit)

    sudoku_grid.append(sudoku_row)

# Print the recognized Sudoku grid
for row in sudoku_grid:
    print(row)

Here is the result I get:

[8, 0, 1, 0, 0, 7, 0, 6, 4]
[0, 0, 0, 0, 1, 3, 0, 0, 2]
[6, 7, 2, 0, 0, 0, 0, 0, 0]
[0, 1, 5, 0, 7, 0, 0, 0, 0]
[0, 0, 2, 0, 2, 0, 0, 1, 0]
[0, 0, 8, 0, 0, 1, 0, 0, 0]
[0, 0, 0, 0, 2, 0, 6, 0, 0]
[2, 0, 6, 0, 0, 5, 0, 9, 4]
[0, 0, 0, 1, 0, 0, 0, 0, 5]

Some numbers still aren't read (they come out as 0), and I don't understand why. What can I do to make this work? Thank you for any help!

Upvotes: -1

Views: 73

Answers (0)
