Mike Ângelo
Mike Ângelo

Reputation: 41

How can I generate Sudoku in Python in less time and more efficiently

My goal here is to fill a grid of N^2*N^2 with positive integers in the range 1 to N^2, satisfying the following rules:

each integer in the range 1 to N^2 occurs only once in each column, row and section N * N.

In order to solve this, i try to make a 3x3 sudoku code that can generate a sudoku with the cells we want filled, however whenever I try to generate the sudoku with all the cells filled, my pc can't handle it, where can I improve this type of algorithm.

import random
 
def MakeSudoku():
    Grid = [[0 for x in range(9)] for y in range(9)]
            
    for i in range(9):
        for j in range(9):
            Grid[i][j] = 0
            
    # The range here is the amount
    # of numbers in the grid
    for i in range(5):
        #choose random numbers
        row = random.randrange(9)
        col = random.randrange(9)
        num = random.randrange(1,10)
        while(not CheckValid(Grid,row,col,num) or Grid[row][col] != 0): #if taken or not valid reroll
            row = random.randrange(9)
            col = random.randrange(9)
            num = random.randrange(1,10)
        Grid[row][col]= num;
        
    Printgrid(Grid)
 
def Printgrid(Grid):
    TableTB = "|--------------------------------|"
    TableMD = "|----------+----------+----------|"
    print(TableTB)
    for x in range(9):
        for y in range(9):
            if ((x == 3 or x == 6) and y == 0):
                print(TableMD)
            if (y == 0 or y == 3 or y== 6):
                print("|", end=" ")
            print(" " + str(Grid[x][y]), end=" ")
            if (y == 8):
                print("|")
    print(TableTB)
#     |-----------------------------|
#     | 0  0  0 | 0  0  0 | 0  0  0 |
#     | 0  0  0 | 0  0  0 | 0  0  0 |
#     | 0  0  0 | 0  0  0 | 0  0  0 |
#     |---------+---------+---------|
#     | 0  0  0 | 0  0  0 | 0  0  0 |
#     | 0  0  0 | 0  0  0 | 0  0  0 |
#     | 0  0  0 | 0  0  0 | 0  0  0 |
#     |---------+---------+---------|
#     | 0  0  0 | 0  0  0 | 0  0  0 |
#     | 0  0  0 | 0  0  0 | 0  0  0 |
#     | 0  0  0 | 0  0  0 | 0  0  0 |
#     |-----------------------------|
    
def CheckValid(Grid,row,col,num):
    #check if in row
    valid = True
    #check row and column
    for x in range(9):
        if (Grid[x][col] == num):
            valid = False
    for y in range(9):
        if (Grid[row][y] == num):
            valid = False
    rowsection = row // 3
    colsection = col // 3
    for x in range(3):
        for y in range(3):
            #check if section is valid
            if(Grid[rowsection*3 + x][colsection*3 + y] == num):
                valid = False
    return valid
 
MakeSudoku()

Upvotes: 0

Views: 1188

Answers (1)

Hans Musgrave
Hans Musgrave

Reputation: 7121

Part of the problem is or Grid[row][col]!=0 creating an infinite loop when you have a valid grid.

In general though, you're searching way too big of a space. Every valid 9x9 Sudoku has the digits 1-9 once in each row, and for every grid you generate with that property you'll find 1.8e27 which don't. That's just barely more than my potato of a computer can handle.

Most strategies I'm aware of fall back to either a breadth or depth first search of some flavor (start with an empty/partial solution, add to it a bit, go back one or more steps if it leads to a broken grid). I'll do the same here.

Some important points to note include:

  • We place each digit in order (1s, then 2s, ...). Most ways of doing so end in valid Sudokus, whereas especially for larger sudokus building a square at a time usually ends in invalid Sudokus.

  • For each digit we process the rows in order. This isn't mandatory per se, but for the hacky data structure (variable idx) we set up to track Sudoku conflicts it allows us to completely avoid cleanup costs as we backtrack through the implicit graph of potential solutions.

  • Initializing the grid with values of M+2 just cleaned up a tiny bit of logic. I think None is more self-descriptive, but then rather than el>=x you need to actually handle the None case with something like (not el or el >= x).

  • For anything bigger than around 36x36 you'll need a more advanced method of some flavor. Up to 45x45 or so you could probably get away with just using cython or similar to get compiled speed, but beyond that you'll want a better algorithm. IIRC, some aspects of Sudoku are NP-hard; I'm not sure if that's just on the solving/uniqueness side of things or if generating complete boards is also provably difficult. I've heard rumors of people generating 100x100 boards easily though, so regardless I suspect there exist better solutions than what I've posted.

from random import sample, shuffle
from collections import deque

def rand_grid(w, h):
    M = w*h
    grid = [[M+2]*M for _ in range(M)]

    # A stack of the next cells we'll try to fill in.
    # This is sorted by the cell value, then by the row index.
    stack = deque((0, k, 1) for k in sample(range(M), M))

    # We want a cheap/easy way to check if we're about to violate a sudoku
    # rule for a given placement, and we need it to remain cheap upon
    # backtracking. For any cell-value x, idx[x][i]==j means that
    # grid[i][j]==x. Since our stack is processed in row-order we can
    # ignore the tail of any sub-list and never have to clean up after
    # ourselves
    idx = [[None]*M for _ in range(1+M)]
    while True:
        i, j, x = stack.pop()
        idx[x][i] = j

        # We've successfully filled the last row in the grid for a given
        # cell-value x. Clean up and either move to x+1 or return the
        # completely filled grid.
        if i==M-1:
            # If we go forward/backward past this cell-value x we might
            # fill the grid with extra xs. We could handle that by treating
            # idx as the source of truth till we construct the final grid,
            # but we don't backtrack across cell-value changes often and
            # do need to enumerate empty cells in a given row often, so it's
            # cheaper overall to clean up the grid just a little right now.
            # Better data structures probably exist.
            for a,row in enumerate(grid):
                for b,el in enumerate(row):
                    if el==x:
                        row[b] = M+2
            
            # Since we've finished placing cell-value x everywhere, update the
            # grid to reflect that fact
            for a,b in enumerate(idx[x]):
                grid[a][b] = x
            
            # The cell-value we just placed was the last one available. Hooray!
            if x==M:
                return grid
            
            # The next place we'll try to fill is row 0 with cell-value x+1
            else:
                i, x = 0, x+1

        # The next place we'll try to fill is row i+1 with cell-value x
        else:
            i += 1
        
        # At this point i and x represent the row/value we're about to attempt to
        # fill. We construct a set s of the columns which have been filled with x,
        # and a set t of tuples representing the upper-left corner of each box which
        # has been filled with x in order to cheaply check for column/box conflicts.
        s = set(idx[x][:i])
        t = {(w*(a//w), h*(b//h)) for a,b in enumerate(idx[x][:i])}

        # Valid candidates don't conflict with existing smaller digits or with the
        # same digit in any already placed column/box.
        candidates = [(i, k, x) for k,el in enumerate(grid[i])
          if el>=x and k not in s and (w*(i//w), h*(k//h)) not in t]
        shuffle(candidates)
        stack.extend(candidates)

# rand_grid(3, 3) creates a 9x9 Sudoku
# rand_grid(4, 5) creates a 20x20 Sudoku with 4-wide by 5-tall boxes
# rand_grid(n, n) is the MakeSudoku(n) function you asked for

Upvotes: 1

Related Questions