Note: This is an update to the earlier post to include the query to actually display the solved puzzle (for the skeptics… )
Recently I had to make a short trip and took a look at a Sudoku puzzle in the airline magazine. In the past, I’ve posted some about automating solving problems and just finished a PhD in this area that outlined an automated technique for algorithm discovery through analyzing solution sequences from simulations.
What intrigued me about the Sudoku scenario is that it lends itself to a Cartesian product solution approach – i.e. rows and columns and groupings must all contain specific values. Based on that, it should be solvable with relational algebra. I modeled the data structure on the plane to see how hard it would be to arrive to a solution using TSQL for a puzzle. Before spending too much time on this, I searched the Internet and found that somebody else had already modeled and solved this. Samuel Aina actually undertook this scenario back in 2010 – the link to his approach is at http://www.developerfusion.com/article/84374/solvingsudokuwithsql/.
In keeping with my generic problem model approach, I still thought it would be worthwhile to do this via a less procedural approach than that of Aina. My approach for generic problem solving relies on a schema that defines the start, transition, and final states for any problem solving scenario. In keeping with that approach, I decided to try to solve the problem based on a view that would present to a solver procedure to enumerate through the steps.
For testing purposes, I went to http://www.websudoku.com/ and pulled in the first puzzle that came up shown below:
The schema and execution script is at the end of this post. Below are the general steps for accomplishing the solution.
1) Define a table for the Sudoku cell with the row, column, and section row/column. I also added an absolute row/column/section number to simplify the queries to allow the sums to go across the sections (section being the 3 x 3 areas that need to have all number from 1 to 9) and provide the values associated in a row or column across the whole puzzle. I also created a table to record the solution steps and a trigger on the cell table to write to this so that the solution generated by a solver procedure can be replayed and verified.
2) Create a generic numbers table with values 1 to 9. This is used to generate possible values as well as populate the initial rows in the Soduku cell table.
3) Populate the Cell table – I did a stored procedure that receives the absolute row, column, and value and updates the appropriate row in the table.
4) Create the views that return the values not yet used in either a row, column, or section
5) Create a view that returns all possible moves by joining the values via absolute row and column across the possible value views from row, column, section. This provides all the legal moves
6) Create a view that returns the required next move – This uses a having clause to look for the one distinct row, column that has only one valid move. If there is not a possible path with only one row, column multiple instances are required to pursue possible solution paths. This adds some complexity, but not a huge amount. This issue is explained further down in the post (Note).
7) Create a stored procedure that looks for the row/column with only one solution path and update the value. This results in further solutions to materialize for the next step – as each number is filled in the puzzle, it leads to other row/columns that can be filled in based on the result.
8) Execute the stored procedure to solve the puzzle – this results in the moves being recorded into a solver table for verification purposes using the trigger outlined in step 1.
Note: In my test case I was fortunate that there was at least one cell meeting the criteria of having only one possible solution as I was using an easy puzzle. This simplifies the solution so that I did not need to try multiple paths. For a more complex puzzle, I would have had to generate multiple instances for whatever row/column had the least number of paths. This is the design pattern of my generic simulator that I utilized to solve Tower of Hanoi where there were multiple solution paths. Aina’s solution does handle this scenario whereas mine does not yet.
Here are the detailed steps in SQL:
1) Create tables and trigger (for more complex scenario, an instance id would be needed to generate multiple solution paths for the same game configuration)
CREATE TABLE [Sudoku].[Cell](
[CellId] [int] IDENTITY(1,1) NOT NULL,
[GameId] [int] NOT NULL,
[Row] [int] NOT NULL,
[Col] [int] NOT NULL,
[Val] [int] NULL,
[SectionRow] [int] NOT NULL,
[SectionCol] [int] NOT NULL,
[AbsoluteRow] AS ([Row]+([SectionRow](1))*(3)),
[AbsoluteCol] AS ([Col]+([SectionCol](1))*(3)),
[SectionNumber] AS (([SectionRow](1))*(3)+[SectionCol]),
[MoveNumber] [int] NULL,
CONSTRAINT [PK__Cell__EA424A08D48B6731] PRIMARY KEY CLUSTERED
(
[CellId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [Sudoku].[Cell] ADD CONSTRAINT [DF__Cell__GameId__24927208] DEFAULT ((0)) FOR [GameId]
GO
ALTER TABLE [Sudoku].[Cell] ADD CONSTRAINT [DF_Cell_MoveNumber] DEFAULT ((0)) FOR [MoveNumber]
GO
CREATE TABLE [Sudoku].[SolveLog](
[StepId] [int] IDENTITY(1,1) NOT NULL,
[CellId] [int] NOT NULL,
[MoveDateTime] [datetime] NULL,
[Val] [int] NOT NULL,
[MoveNumber] [int] NOT NULL,
CONSTRAINT [PK__SolveLog__2434335717D3913E] PRIMARY KEY CLUSTERED
(
[StepId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TRIGGER [Sudoku].[Cell_Utrig]
ON [Sudoku].[Cell]
AFTER UPDATE
AS
BEGIN
— SET NOCOUNT ON added to prevent extra result sets from
— interfering with SELECT statements.
SET NOCOUNT ON;
INSERT [Sudoku].[SolveLog]
(CellId, Val, MoveNumber)
SELECT CellId, Val, MoveNumber
FROM inserted
— Insert statements for trigger here
END
GO
2) Create generic number table and populate it
CREATE TABLE [dbo].[Numbers](
[Number] [int] NULL
) ON [PRIMARY]
GO
insert dbo.Numbers select 1
insert dbo.Numbers select 2
insert dbo.Numbers select 3
insert dbo.Numbers select 4
insert dbo.Numbers select 5
insert dbo.Numbers select 6
insert dbo.Numbers select 7
insert dbo.Numbers select 8
insert dbo.Numbers select 9
3) Insert rows for a game for all rows/columns and create stored procedure to populate puzzle data and execute
insert into Sudoku.cell
(GameId, Row, Col, SectionRow, SectionCol)
select 1, r.Number, c.Number, gc.Number, gr.Number
from dbo.Numbers r
cross join dbo.Numbers c
cross join dbo.Numbers gr
cross join dbo.Numbers gc
where r.Number <= 3
and c.Number <= 3
and gr.Number <= 3
and gc.Number <= 3
CREATE PROCEDURE [Sudoku].[UpdateCell_ByAbsolutePosition]
@GameId int,
@AbsRow int,
@AbsCol int,
@Val int,
@MoveNumber int = 0
AS BEGIN
UPDATE c
SET Val = @Val,
MoveNumber = @MoveNumber
FROM [Sudoku].[Cell] c
WHERE c.AbsoluteRow = @AbsRow
AND c.AbsoluteCol = @AbsCol
END
GO
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 1, @AbsCol = 6, @Val = 8
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 1, @AbsCol = 7, @Val = 4
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 1, @AbsCol = 8, @Val = 7
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 2, @AbsCol = 1, @Val = 9
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 2, @AbsCol = 3, @Val = 8
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 2, @AbsCol = 4, @Val = 4
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 2, @AbsCol = 5, @Val = 5
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 3, @AbsCol = 1, @Val = 6
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 3, @AbsCol = 5, @Val = 9
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 3, @AbsCol = 6, @Val = 1
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 3, @AbsCol = 7, @Val = 5
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 3, @AbsCol = 8, @Val = 3
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 4, @AbsCol = 1, @Val = 2
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 4, @AbsCol = 2, @Val = 8
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 4, @AbsCol = 7, @Val = 3
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 4, @AbsCol = 9, @Val = 4
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 5, @AbsCol = 3, @Val = 9
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 5, @AbsCol = 7, @Val = 7
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 6, @AbsCol = 1, @Val = 7
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 6, @AbsCol = 3, @Val = 5
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 6, @AbsCol = 8, @Val = 2
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 6, @AbsCol = 9, @Val = 6
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 7, @AbsCol = 2, @Val = 1
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 7, @AbsCol = 3, @Val = 4
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 7, @AbsCol = 4, @Val = 5
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 7, @AbsCol = 5, @Val = 6
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 7, @AbsCol = 9, @Val = 7
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 8, @AbsCol = 5, @Val = 4
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 8, @AbsCol = 6, @Val = 7
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 8, @AbsCol = 7, @Val = 2
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 8, @AbsCol = 9, @Val = 5
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 9, @AbsCol = 2, @Val = 2
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 9, @AbsCol = 3, @Val = 7
EXEC [Sudoku].[UpdateCell_ByAbsolutePosition] @GameId = 1, @AbsRow = 9, @AbsCol = 4, @Val = 1
4) Create the views that return the candidate values for row, column, and section
CREATE VIEW Sudoku.v_Candidate_RowVal
AS SELECT r.Number AS Row, v.Number as Val — select *
from dbo.Numbers r
INNER JOIN dbo.Numbers v
ON v.Number NOT IN (SELECT c2.Val from Sudoku.Cell c2 WHERE c2.AbsoluteRow = r.Number AND c2.Val IS NOT NULL)
CREATE VIEW Sudoku.v_Candidate_ColVal
AS SELECT r.Number AS Col, v.Number as Val — select *
from dbo.Numbers r
INNER JOIN dbo.Numbers v
ON v.Number NOT IN (SELECT c2.Val from Sudoku.Cell c2 WHERE c2.AbsoluteCol = r.Number AND c2.Val IS NOT NULL)
CREATE VIEW Sudoku.v_Candidate_SectionVal
AS SELECT r.Number AS Section, v.Number as Val — select *
from dbo.Numbers r
INNER JOIN dbo.Numbers v
ON v.Number NOT IN (SELECT c2.Val from Sudoku.Cell c2 WHERE c2.SectionNumber = r.Number AND c2.Val IS NOT NULL)
5) Possible move view
CREATE VIEW Sudoku.v_PossibleMove AS
SELECT c.AbsoluteRow, c.AbsoluteCol, r.Val
FROM Sudoku.Cell c
INNER JOIN Sudoku.v_Candidate_RowVal r on r.Row = c.AbsoluteRow
INNER JOIN Sudoku.v_Candidate_ColVal cv on cv.Col = c.AbsoluteCol and cv.Val = r.Val
INNER JOIN Sudoku.v_Candidate_SectionVal s on s.Section = c.SectionNumber and s.Val = cv.Val
WHERE c.Val IS NULL
6) Next definite view (For more complex games, would need to be modified to select the one with the smallest count rather than count = 1
CREATE VIEW Sudoku.v_NextMove AS
SELECT c.AbsoluteRow, c.AbsoluteCol, max(c.Val) AS Val FROM Sudoku.v_PossibleMove c
GROUP BY c.AbsoluteRow, c.AbsoluteCol having count(*) = 1
7) Stored procedure that solves the puzzle
CREATE PROCEDURE [Sudoku].[Solve] AS
BEGIN
DECLARE @MoveNumber INT = 1
WHILE EXISTS (SELECT 0 FROM Sudoku.Cell c WHERE c.val is null)
BEGIN
UPDATE c
SET c.Val = m.Val,
MoveNumber = @MoveNumber
FROM Sudoku.Cell c
INNER JOIN Sudoku.v_NextMove m
ON m.AbsoluteRow = c.AbsoluteRow
AND m.AbsoluteCol = c.AbsoluteCol
WHERE c.Val IS NULL
SET @MoveNumber = @MoveNumber + 1
END
END
GO
8) Execute stored procedure and verify results – Note that multiple cells are solved simultaneously on some of the intermediate steps. Even though 47 cells need to be solved, the total number of steps required is actually only 13
EXEC Sudoku.Solve
(1 row(s) affected)
(2 row(s) affected)
(2 row(s) affected)
(2 row(s) affected)
(5 row(s) affected)
(3 row(s) affected)
(4 row(s) affected)
(8 row(s) affected)
(6 row(s) affected)
(6 row(s) affected)
(4 row(s) affected)
(3 row(s) affected)
(1 row(s) affected)
SELECT c.[Val]
,c.AbsoluteRow
,c.AbsoluteCol
,c.[MoveNumber]
FROM [Games].[Sudoku].[SolveLog] s
inner join Sudoku.cell c
on c.CellId = s.CellId
where s.MoveNumber > 0
order by movenumber
Val

AbsoluteRow

AbsoluteCol

MoveNumber

2

3

3

1

8

3

9

2

7

3

4

2

1

5

9

3

4

3

2

3

2

2

9

4

3

6

2

4

9

1

9

5

6

5

2

5

4

5

1

5

7

2

2

5

5

1

2

5

3

9

9

6

9

8

2

6

1

4

3

6

9

9

6

7

8

9

5

7

7

4

5

7

3

1

3

7

6

9

7

8

3

8

4

8

4

6

6

8

1

6

5

8

2

1

5

8

5

9

1

8

6

8

3

8

1

1

1

8

4

9

8

9

1

2

7

9

2

7

6

9

3

5

5

9

6

1

4

9

8

8

1

9

1

8

8

10

6

2

8

10

5

5

6

10

9

4

4

10

3

2

6

10

3

7

1

10

8

5

8

11

5

4

8

11

8

6

4

11

6

4

6

11

9

7

8

12

9

6

7

12

2

5

4

12

8

7

7

13

This query outputs the solved puzzle results:
select [1],[2],[3],[4],[5],[6],[7],[8],[9]
from (select AbsoluteRow, AbsoluteCol, val from Sudoku.Cell) as s
pivot(Avg(Val) for AbsoluteCol in ([1],[2],[3],[4],[5],[6],[7],[8],[9]))
as pvt
order by AbsoluteRow
And here are the results:
1

5

3

6

2

8

4

7

9

9

7

8

4

5

3

1

6

2

6

4

2

7

9

1

5

3

8

2

8

1

9

7

6

3

5

4

4

6

9

2

3

5

7

8

1

7

3

5

8

1

4

9

2

6

3

1

4

5

6

2

8

9

7

8

9

6

3

4

7

2

1

5

5

2

7

1

8

9

6

4

3

Voila!