Introduction

This is a sequel to the previous one (I tried to solve Sudoku with optuna).

Last time, without knowing even the rules of Sudoku, I tried to find the correct answer by optimizing with optuna. This time, I tried to optimize it for optuna after knowing the rule that there is no duplication of numbers in vertical, horizontal, and 3x3 blocks.

I was planning to post it earlier, but the Ring Fit Adventure I started after the last post was too hard and I lost both energy and physical strength, so it was late.

Method

This time, we will solve it by the annealing method.

The annealing method changes the current value slightly randomly to create the next value. (The first time there is no original value, so make it completely random.) If the cost calculated from the new value you created decreases, move to the new value. Even if it does not decrease, it will move or not with a probability. By gradually reducing the probability, it converges to the local solution.

Implementation

The method of annealing with optuna can be found in the Official Documents. I just changed the part that samples the new value, and the rest is as per the official documentation. The previous code remains, except that the sampler has been changed.

Only the part that samples the new value is shown here. The entire code is shown at the end.

params = {}
#Randomize the sampling order
name_list = shuffle(list(search_space))
for param_name in name_list:
    #Two-dimensional coordinates(i, j)
    i, j = int(param_name[1]), int(param_name[2])
    #In advance(i, j)I made a mask to take out the elements of the vertical, horizontal, and 3x3 blocks.
    #However,(i, j)Elements are not taken out with a mask
    mask = self._mask_array[i, j]
    tmp = self._state * mask
    # 1~Count how many of each of the 9 elements are
    cost = np.asarray([np.count_nonzero(tmp == v) for v in range(1, 10)])
    probability = softmax(-cost * 5)
         
    #Sampling new values
    new_value = np.random.choice(9, p=probability) + 1
    #Record new value
    params[param_name] = new_value
    self._state[i, j] = new_value

return params

The new values are more likely to be sampled based on the rule that there are no duplicate numbers in vertical, horizontal, and 3x3 blocks.

We are using our knowledge of rules here.

Experimental result

The numbers with * on the left are originally given as hints.

*5*3 2 4*7 6 9 1 8
*6 7 4*1*9*5 3 5 2
 1*9*8 2 3 8 4*6 7
*8 1 9 5*6 4 7 2*3
*4 2 6*8 7*3 5 9*1
*7 5 3 9*2 1 8 4*6
 9*6 1 3 5 7*2*8 4
 2 8 7*4*1*9 6 3*5
 3 4 5 6*8 2 1*7*9
loss: 4.0

I'm sorry.

The number of trials reached this answer for the 115th time. After this, I did it up to 200 times, but it didn't work.

You can't tell at a glance what's wrong, but there's a vertical 7 overlap in the center. The 7 in the center is wrong in the vertical direction, but it meets the conditions in the horizontal direction. Therefore, the correct answer cannot be reached unless multiple elements change at the same time, which is a fairly deep local solution. (The correct answer cannot be reached without going through a state where the cost is quite high)

Summary

It has become a local solution. It seems difficult with the current method because multiple factors must change at the same time in order to reach the correct answer. If you change the initial value and try again many times, you should be able to get the correct answer.

Future plans

This time there was a vertical overlap of 7s in the center, but 7s shouldn't have been sampled as they were originally given. (It is related to the condition of how much knowledge about rules is used) In some cases, the value is determined from the numbers originally given, such as the third column from the right and the third row from the top.

These were not considered at all in this method, so I would like to solve this next time. (I don't think oputuna is relevant anymore)

code

from itertools import product

import numpy as np
import optuna
import click
from scipy.special import softmax
from sklearn.utils import shuffle


# https://optuna.readthedocs.io/en/stable/tutorial/sampler.html
class SimulatedAnnealingSampler(optuna.samplers.BaseSampler):
    def __init__(self, temperature=100):
        self._rng = np.random.RandomState()
        self._temperature = temperature
        self._current_trial = None
        
        self._state = None
        self._mask_array = make_mask_array()
        
    def sample_relative(self, study, trial, search_space):
        if search_space == {}:
            return {}
            
        #The current trial is study.trials[-1]
        previous_trial = study.trials[-2]
        if self._current_trial is None or previous_trial.value <= self._current_trial.value:
            probability = 1.0
        else:
            probability = np.exp((self._current_trial.value - previous_trial.value) / self._temperature)
        self._temperature *= 0.99
        
        if self._rng.uniform(0, 1) < probability:
            self._current_trial = previous_trial
            
        if self._state is None:
            self._state = np.empty([9, 9], dtype=np.int32)
            for i, j in product(range(9), repeat=2):
                name = 'p{}{}'.format(i, j)
                if name in preset:
                    self._state[i, j] = preset[name]
                    
        for i, j in product(range(9), repeat=2):
            name = 'p{}{}'.format(i, j)
            if name in self._current_trial.params:
                self._state[i, j] = self._current_trial.params[name]
                
        params = {}
        name_list = shuffle(list(search_space))
        for param_name in name_list:
            i, j = int(param_name[1]), int(param_name[2])
            mask = self._mask_array[i, j]
            tmp = self._state * mask
            cost = np.asarray([np.count_nonzero(tmp == v) for v in range(1, 10)])
            probability = softmax(-cost * 5)
            
            new_value = np.random.choice(9, p=probability) + 1
            params[param_name] = new_value
            self._state[i, j] = new_value
            
        return params
        
    def infer_relative_search_space(self, study, trial):
        return optuna.samplers.intersection_search_space(study)
        
    def sample_independent(self, study, trial, param_name, param_distribution):
        independent_sampler = optuna.samplers.RandomSampler()
        return independent_sampler.sample_independent(study, trial, param_name, param_distribution)
        
        
def make_mask_array():
    mask_array = np.zeros([9, 9, 9, 9])
    for i, j in product(range(9), repeat=2):
        mask = mask_array[i, j]
        
        mask[i] = 1
        mask[:, j] = 1
        s, t = i // 3 * 3, j // 3 * 3
        mask[s:s + 3, t:t + 3] = 1
        #Get rid of yourself
        mask[i, j] = 0
    return mask_array
    
    
"""
 -----------------        
|5|3| | |7| | | | |
|-+-+-+-+-+-+-+-+-|
|6| | |1|9|5| | | |
|-+-+-+-+-+-+-+-+-|
| |9|8| | | | |6| |
|-+-+-+-+-+-+-+-+-|
|8| | | |6| | | |3|
|-+-+-+-+-+-+-+-+-|
|4| | |8| |3| | |1|
|-+-+-+-+-+-+-+-+-|
|7| | | |2| | | |6|
|-+-+-+-+-+-+-+-+-|
| |6| | | | |2|8| |
|-+-+-+-+-+-+-+-+-|
| | | |4|1|9| | |5|
|-+-+-+-+-+-+-+-+-|
| | | | |8| | |7|9|
 -----------------        
"""
preset = {'p00': 5, 'p01': 3, 'p04': 7,
          'p10': 6, 'p13': 1, 'p14': 9, 'p15': 5,
          'p21': 9, 'p22': 8, 'p27': 6,
          'p30': 8, 'p34': 6, 'p38': 3,
          'p40': 4, 'p43': 8, 'p45': 3, 'p48': 1,
          'p50': 7, 'p54': 2, 'p58': 6,
          'p61': 6, 'p66': 2, 'p67': 8,
          'p73': 4, 'p74': 1, 'p75': 9, 'p78': 5,
          'p84': 8, 'p87': 7, 'p88': 9}


def evaluate(answer):
    tmp = np.reshape(answer, [3, 3, 3, 3])
    loss = np.sum((
        np.sum([np.count_nonzero(np.logical_not(np.any(answer == i, axis=0))) for i in range(1, 10)]),
        np.sum([np.count_nonzero(np.logical_not(np.any(answer == i, axis=1))) for i in range(1, 10)]),
        np.sum([np.count_nonzero(np.logical_not(np.any(tmp == i, axis=(1, 3)))) for i in range(1, 10)]),
    ))
    return loss
    
    
def objective(trial):
    candidate = (1, 2, 3, 4, 5, 6, 7, 8, 9)
    
    answer = np.empty([9, 9], dtype=np.uint8)
    for i, j in product(range(9), repeat=2):
        key = 'p{}{}'.format(i, j)
        if key in preset:
            answer[i, j] = preset[key]
        else:
            answer[i, j] = trial.suggest_categorical(key, candidate)
            
    return evaluate(answer)
    
    
def run(n_trials):
    study_name = 'sudoku'
    sampler = SimulatedAnnealingSampler()
    study = optuna.create_study(study_name=study_name, storage='sqlite:///sudoku.db', load_if_exists=True,
                                 sampler=sampler)
    study.optimize(objective, n_trials=n_trials)
    
    show_result(study.best_params, study.best_value)
    
    df = study.trials_dataframe()
    df.to_csv('tpe_result.csv')


def show_result(best_params, best_value):
    for i in range(9):
        for j in range(9):
            key = 'p{}{}'.format(i, j)
            if key in preset:
                print('*{:1d}'.format(preset[key]), end='')
            else:
                print('{:2d}'.format(best_params[key]), end='')
        print('')
    print('loss: {}'.format(best_value))
        

@click.command()
@click.option('--n-trials', type=int, default=1000)
def cmd(n_trials):
    run(n_trials)
    
    
def main():
    cmd()
    
    
if __name__ == '__main__':
    main()

[PYTHON] I relaxed the conditions a little and let optuna solve Sudoku