[PYTHON] Searching efficiently for optimal parameters (Optuna)

This is the story of my trying out Optuna, a hyperparameter search method said to be better than grid search. All of the code below was run on Google Colaboratory.

Installing Optuna

Optuna was not available on Google Colaboratory, so I installed it. Easy.

!pip install optuna
Collecting optuna
  Downloading https://files.pythonhosted.org/packages/d4/6a/4d80b3014797cf318a5252afb27031e9e7502854fb7930f27db0ee10bb75/optuna-0.19.0.tar.gz (126kB)
      |████████████████████████████████| 133kB 4.9MB/s 
Collecting alembic
  Downloading https://files.pythonhosted.org/packages/84/64/493c45119dce700a4b9eeecc436ef9e8835ab67bae6414f040cdc7b58f4b/alembic-1.3.1.tar.gz (1.1MB)
      |████████████████████████████████| 1.1MB 42.4MB/s 
Collecting cliff
  Downloading https://files.pythonhosted.org/packages/f6/a9/e976ba91e57043c4b6add2c394e6d1ffc26712c694379c3fe72f942d2440/cliff-2.16.0-py2.py3-none-any.whl (78kB)
      |████████████████████████████████| 81kB 8.5MB/s 
Collecting colorlog
  Downloading https://files.pythonhosted.org/packages/68/4d/892728b0c14547224f0ac40884e722a3d00cb54e7a146aea0b3186806c9e/colorlog-4.0.2-py2.py3-none-any.whl
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from optuna) (1.17.4)
Requirement already satisfied: scipy in /usr/local/lib/python3.6/dist-packages (from optuna) (1.3.3)
Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from optuna) (1.12.0)
Requirement already satisfied: sqlalchemy>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from optuna) (1.3.11)
Requirement already satisfied: tqdm in /usr/local/lib/python3.6/dist-packages (from optuna) (4.28.1)
Requirement already satisfied: typing in /usr/local/lib/python3.6/dist-packages (from optuna) (3.6.6)
Collecting Mako
  Downloading https://files.pythonhosted.org/packages/b0/3c/8dcd6883d009f7cae0f3157fb53e9afb05a0d3d33b3db1268ec2e6f4a56b/Mako-1.1.0.tar.gz (463kB)
      |████████████████████████████████| 471kB 37.3MB/s 
Collecting python-editor>=0.3
  Downloading https://files.pythonhosted.org/packages/c6/d3/201fc3abe391bbae6606e6f1d598c15d367033332bd54352b12f35513717/python_editor-1.0.4-py3-none-any.whl
Requirement already satisfied: python-dateutil in /usr/local/lib/python3.6/dist-packages (from alembic->optuna) (2.6.1)
Requirement already satisfied: PyYAML>=3.12 in /usr/local/lib/python3.6/dist-packages (from cliff->optuna) (3.13)
Collecting cmd2!=0.8.3,<0.9.0,>=0.8.0
  Downloading https://files.pythonhosted.org/packages/e9/40/a71caa2aaff10c73612a7106e2d35f693e85b8cf6e37ab0774274bca3cf9/cmd2-0.8.9-py2.py3-none-any.whl (53kB)
      |████████████████████████████████| 61kB 7.7MB/s 
Collecting pbr!=2.1.0,>=2.0.0
  Downloading https://files.pythonhosted.org/packages/7a/db/a968fd7beb9fe06901c1841cb25c9ccb666ca1b9a19b114d1bbedf1126fc/pbr-5.4.4-py2.py3-none-any.whl (110kB)
      |████████████████████████████████| 112kB 36.3MB/s 
Requirement already satisfied: pyparsing>=2.1.0 in /usr/local/lib/python3.6/dist-packages (from cliff->optuna) (2.4.5)
Collecting stevedore>=1.20.0
  Downloading https://files.pythonhosted.org/packages/b1/e1/f5ddbd83f60b03f522f173c03e406c1bff8343f0232a292ac96aa633b47a/stevedore-1.31.0-py2.py3-none-any.whl (43kB)
      |████████████████████████████████| 51kB 6.5MB/s 
Requirement already satisfied: PrettyTable<0.8,>=0.7.2 in /usr/local/lib/python3.6/dist-packages (from cliff->optuna) (0.7.2)
Requirement already satisfied: MarkupSafe>=0.9.2 in /usr/local/lib/python3.6/dist-packages (from Mako->alembic->optuna) (1.1.1)
Requirement already satisfied: wcwidth; sys_platform != "win32" in /usr/local/lib/python3.6/dist-packages (from cmd2!=0.8.3,<0.9.0,>=0.8.0->cliff->optuna) (0.1.7)
Collecting pyperclip
  Downloading https://files.pythonhosted.org/packages/2d/0f/4eda562dffd085945d57c2d9a5da745cfb5228c02bc90f2c74bbac746243/pyperclip-1.7.0.tar.gz
Building wheels for collected packages: optuna, alembic, Mako, pyperclip
  Building wheel for optuna (setup.py) ... done
  Created wheel for optuna: filename=optuna-0.19.0-cp36-none-any.whl size=170198 sha256=fdc7777d7454f3419bc9acfd4f83f5cf6f23f0d6a6f392fc744afb597484f156
  Stored in directory: /root/.cache/pip/wheels/49/bf/47/090a43457caeff74397397da1c98a8aaed685257c16a5ba1f0
  Building wheel for alembic (setup.py) ... done
  Created wheel for alembic: filename=alembic-1.3.1-py2.py3-none-any.whl size=144523 sha256=c66d5c3c4bd291757c2136352ac8d3cab450cccd0cb1005fe01211c5fa7576f4
  Stored in directory: /root/.cache/pip/wheels/b2/d4/19/5ab879d30af7cbc79e6dcc1d421795b1aa9d78f455b0412ef7
  Building wheel for Mako (setup.py) ... done
  Created wheel for Mako: filename=Mako-1.1.0-cp36-none-any.whl size=75363 sha256=66ee5267f833ecdf52af2f6c7e8c93bb317a780d609f0765a8101383516ab29b
  Stored in directory: /root/.cache/pip/wheels/98/32/7b/a291926643fc1d1e02593e0d9e247c5a866a366b8343b7aa27
  Building wheel for pyperclip (setup.py) ... done
  Created wheel for pyperclip: filename=pyperclip-1.7.0-cp36-none-any.whl size=8359 sha256=7e62cb6b9e2dcb8a323251caa96de1064b10e0a80baaf6b33b2052fafa34c08e
  Stored in directory: /root/.cache/pip/wheels/92/f0/ac/2ba2972034e98971c3654ece337ac61e546bdeb34ca960dc8c
Successfully built optuna alembic Mako pyperclip
Installing collected packages: Mako, python-editor, alembic, pyperclip, cmd2, pbr, stevedore, cliff, colorlog, optuna
Successfully installed Mako-1.1.0 alembic-1.3.1 cliff-2.16.0 cmd2-0.8.9 colorlog-4.0.2 optuna-0.19.0 pbr-5.4.4 pyperclip-1.7.0 python-editor-1.0.4 stevedore-1.31.0

The installation seemed to succeed, so I ran the following import statement and confirmed that it works.

import optuna
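
To double-check which version was installed (0.19.0 at the time of writing), you can print it directly:

print(optuna.__version__)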

Warm-up exercises

First, some warm-up exercises to understand how Optuna behaves.

Minimizing a function of one variable

Let's try to minimize $f(x) = x^4 - 4x^3 - 36x^2$.

def f(x):
    return x**4 - 4 * x ** 3 - 36 * x ** 2

In Optuna, the objective function you want to minimize is defined as follows.

def objective(trial):
    x = trial.suggest_uniform('x', -10, 10)
    return f(x)

Running the following performs just 10 trials.

study = optuna.create_study()
study.optimize(objective, n_trials=10)
[I 2019-12-13 00:32:08,911] Finished trial#0 resulted in value: 2030.566599827237. Current best value is 2030.566599827237 with parameters: {'x': 9.815207070259166}.
[I 2019-12-13 00:32:09,021] Finished trial#1 resulted in value: 1252.1813135138896. Current best value is 1252.1813135138896 with parameters: {'x': 9.366926766768199}.
[I 2019-12-13 00:32:09,147] Finished trial#2 resulted in value: -283.8813965725701. Current best value is -283.8813965725701 with parameters: {'x': 2.6795376432294855}.
[I 2019-12-13 00:32:09,278] Finished trial#3 resulted in value: 1258.1505983061907. Current best value is -283.8813965725701 with parameters: {'x': 2.6795376432294855}.
[I 2019-12-13 00:32:09,409] Finished trial#4 resulted in value: -59.988164166655146. Current best value is -283.8813965725701 with parameters: {'x': 2.6795376432294855}.
[I 2019-12-13 00:32:09,539] Finished trial#5 resulted in value: 6493.216295606622. Current best value is -283.8813965725701 with parameters: {'x': 2.6795376432294855}.
[I 2019-12-13 00:32:09,670] Finished trial#6 resulted in value: 233.47766027651414. Current best value is -283.8813965725701 with parameters: {'x': 2.6795376432294855}.
[I 2019-12-13 00:32:09,797] Finished trial#7 resulted in value: -32.56782816991587. Current best value is -283.8813965725701 with parameters: {'x': 2.6795376432294855}.
[I 2019-12-13 00:32:09,924] Finished trial#8 resulted in value: 9713.778056852296. Current best value is -283.8813965725701 with parameters: {'x': 2.6795376432294855}.
[I 2019-12-13 00:32:10,046] Finished trial#9 resulted in value: -499.577141711988. Current best value is -499.577141711988 with parameters: {'x': 3.6629193285453887}.

If you check the number of trials:

len(study.trials)
10

You can look up the parameters that minimize the objective function as follows.

study.best_params
{'x': 3.6629193285453887}

The minimum value of the objective function is obtained like this.

study.best_value
-499.577141711988

The details of the trial that produced the minimum value are obtained like this.

study.best_trial
FrozenTrial(number=9, state=TrialState.COMPLETE, value=-499.577141711988, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 926514), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 10, 46429), params={'x': 3.6629193285453887}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 9}, intermediate_values={}, trial_id=9)

The history of all trials can be inspected like this.

study.trials
[FrozenTrial(number=0, state=TrialState.COMPLETE, value=2030.566599827237, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 8, 821843), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 8, 911548), params={'x': 9.815207070259166}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 0}, intermediate_values={}, trial_id=0),
 FrozenTrial(number=1, state=TrialState.COMPLETE, value=1252.1813135138896, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 8, 912983), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 20790), params={'x': 9.366926766768199}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 1}, intermediate_values={}, trial_id=1),
 FrozenTrial(number=2, state=TrialState.COMPLETE, value=-283.8813965725701, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 22532), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 147430), params={'x': 2.6795376432294855}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 2}, intermediate_values={}, trial_id=2),
 FrozenTrial(number=3, state=TrialState.COMPLETE, value=1258.1505983061907, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 149953), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 277900), params={'x': 9.37074944280344}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 3}, intermediate_values={}, trial_id=3),
 FrozenTrial(number=4, state=TrialState.COMPLETE, value=-59.988164166655146, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 281543), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 409038), params={'x': -1.4636181925092284}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 4}, intermediate_values={}, trial_id=4),
 FrozenTrial(number=5, state=TrialState.COMPLETE, value=6493.216295606622, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 410813), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 539381), params={'x': -8.979003291609324}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 5}, intermediate_values={}, trial_id=5),
 FrozenTrial(number=6, state=TrialState.COMPLETE, value=233.47766027651414, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 542239), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 669699), params={'x': -5.01912242330347}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 6}, intermediate_values={}, trial_id=6),
 FrozenTrial(number=7, state=TrialState.COMPLETE, value=-32.56782816991587, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 671679), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 797441), params={'x': -1.027752432268013}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 7}, intermediate_values={}, trial_id=7),
 FrozenTrial(number=8, state=TrialState.COMPLETE, value=9713.778056852296, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 799124), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 924648), params={'x': -9.843104909274034}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 8}, intermediate_values={}, trial_id=8),
 FrozenTrial(number=9, state=TrialState.COMPLETE, value=-499.577141711988, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 926514), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 10, 46429), params={'x': 3.6629193285453887}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 9}, intermediate_values={}, trial_id=9)]

Let's run 100 more trials.

study.optimize(objective, n_trials=100)
[I 2019-12-13 00:32:10,303] Finished trial#10 resulted in value: -679.8303609251094. Current best value is -679.8303609251094 with parameters: {'x': 4.482235669344949}.
[I 2019-12-13 00:32:10,447] Finished trial#11 resulted in value: -664.7000624843927. Current best value is -679.8303609251094 with parameters: {'x': 4.482235669344949}.
[I 2019-12-13 00:32:10,579] Finished trial#12 resulted in value: -778.9261500173968. Current best value is -778.9261500173968 with parameters: {'x': 5.024746639127292}.

... (omitted) ...
[I 2019-12-13 00:32:22,591] Finished trial#107 resulted in value: -760.7787838740135. Current best value is -863.9855798856751 with parameters: {'x': 6.011542730094907}.
[I 2019-12-13 00:32:22,724] Finished trial#108 resulted in value: -773.0113811629133. Current best value is -863.9855798856751 with parameters: {'x': 6.011542730094907}.
[I 2019-12-13 00:32:22,862] Finished trial#109 resulted in value: -577.7178004902428. Current best value is -863.9855798856751 with parameters: {'x': 6.011542730094907}.

The number of trials is now

len(study.trials)
110

The parameters that minimize the objective function, and the objective value at that point, are:

study.best_params, study.best_value
({'x': 6.011542730094907}, -863.9855798856751)
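
For reference, $f'(x) = 4x^3 - 12x^2 - 72x = 4x(x - 6)(x + 3)$, so on the interval $[-10, 10]$ the true global minimum is at $x = 6$ with $f(6) = -864$; the best trial is already quite close to it.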

The history of objective values can be visualized as follows.

%matplotlib inline
import matplotlib.pyplot as plt

plt.plot([trial.value for trial in study.trials])
plt.grid()
plt.show()

output_15_0.png

The parameter history can be visualized as follows.

%matplotlib inline
import matplotlib.pyplot as plt

plt.plot([trial.params['x'] for trial in study.trials])
plt.grid()
plt.show()

output_16_0.png

Let's visualize how the parameter search proceeded.

%matplotlib inline
import matplotlib.pyplot as plt

plt.grid()
plt.plot([trial.params['x'] for trial in study.trials], 
         [trial.value for trial in study.trials],
         marker='x', alpha=0.3)
plt.scatter(study.trials[0].params['x'], study.trials[0].value, 
         marker='>', label='start', s=100)
plt.scatter(study.trials[-1].params['x'], study.trials[-1].value, 
         marker='s', label='end', s=100)
plt.scatter(study.best_params['x'], study.best_value,
         marker='o', label='best', s=100)
plt.xlabel('x')
plt.ylabel('y (value)')
plt.legend()
plt.show()

output_17_0.png

You can see that regions where the minimum is unlikely to be found are sampled only sparsely, while the search concentrates on the region where the minimum can be expected.
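
This focusing behaviour comes from Optuna's default sampler, TPE (Tree-structured Parzen Estimator). As far as I know you can also pass the sampler explicitly, for instance with a fixed seed for reproducibility; a minimal sketch (the seed value is arbitrary):

sampler = optuna.samplers.TPESampler(seed=0)  # the default sampler, seeded
study = optuna.create_study(sampler=sampler)
study.optimize(objective, n_trials=10)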

Minimizing a function of two variables

Next, let's minimize $f(x, y) = (x - 2.5)^2 + 2(y + 2.5)^2$.

def f(x, y):
    return (x - 2.5)**2 + 2 * (y + 2.5) ** 2

Define the function you want to minimize:

def objective(trial):
    x = trial.suggest_uniform('x', -10, 10)
    y = trial.suggest_uniform('y', -10, 10)
    return f(x, y)

Run 100 trials like this:

study = optuna.create_study()
study.optimize(objective, n_trials=100)
[I 2019-12-13 00:32:24,001] Finished trial#0 resulted in value: 31.229461850588567. Current best value is 31.229461850588567 with parameters: {'x': 7.975371679174145, 'y': -3.290495675347522}.
[I 2019-12-13 00:32:24,131] Finished trial#1 resulted in value: 158.84900024337801. Current best value is 31.229461850588567 with parameters: {'x': 7.975371679174145, 'y': -3.290495675347522}.
[I 2019-12-13 00:32:24,252] Finished trial#2 resulted in value: 118.67648241872055. Current best value is 31.229461850588567 with parameters: {'x': 7.975371679174145, 'y': -3.290495675347522}.

... (omitted) ...
[I 2019-12-13 00:32:37,321] Finished trial#97 resulted in value: 24.46020780084274. Current best value is 0.2114497716311141 with parameters: {'x': 2.286816304129357, 'y': -2.788099360851467}.
[I 2019-12-13 00:32:37,471] Finished trial#98 resulted in value: 15.832787347997524. Current best value is 0.2114497716311141 with parameters: {'x': 2.286816304129357, 'y': -2.788099360851467}.
[I 2019-12-13 00:32:37,625] Finished trial#99 resulted in value: 0.6493005675217599. Current best value is 0.2114497716311141 with parameters: {'x': 2.286816304129357, 'y': -2.788099360851467}.

The parameters that minimize the objective function and the minimum value at that point:

study.best_params, study.best_value
({'x': 2.286816304129357, 'y': -2.788099360851467}, 0.2114497716311141)
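
Since $f$ is a sum of squares, its true minimum is $f(2.5, -2.5) = 0$, so after 100 trials the search has come reasonably close.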

Objective value and parameter history:

%matplotlib inline
import matplotlib.pyplot as plt

plt.plot([trial.value for trial in study.trials], label='value')
plt.grid()
plt.legend()
plt.show()

plt.plot([trial.params['x'] for trial in study.trials], label='x')
plt.plot([trial.params['y'] for trial in study.trials], label='y')
plt.grid()
plt.legend()
plt.show()

output_23_0.png

output_23_1.png

Let's plot the parameter history on a two-dimensional plane.

%matplotlib inline
import matplotlib.pyplot as plt

plt.plot([trial.params['x'] for trial in study.trials], 
         [trial.params['y'] for trial in study.trials],
         alpha=0.4, marker='x')
plt.scatter(study.trials[0].params['x'], study.trials[0].params['y'], 
         marker='>', label='start', s=100)
plt.scatter(study.trials[-1].params['x'], study.trials[-1].params['y'], 
         marker='s', label='end', s=100)
plt.scatter(study.best_params['x'], study.best_params['y'],
         marker='o', label='best', s=100)
plt.grid()
plt.legend()
plt.show()

output_24_0.png

Again, you can see that the search concentrated on the region where the minimum could be expected, while still lightly covering the regions where it could not.

The above was a warm-up exercise to get a feel for how Optuna works.

Application to supervised machine learning

From here on, we apply Optuna to hyperparameter tuning for supervised machine learning.

Breast cancer dataset

Using the breast cancer dataset from the machine learning library scikit-learn as an example, we obtain the explanatory variables $X$ and the target variable $y$ as follows.

# https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html
from sklearn.datasets import load_breast_cancer
breast_cancer = load_breast_cancer()
X = breast_cancer.data
y = breast_cancer.target.ravel()

Split into training data and test data.

from sklearn.model_selection import train_test_split 
# Randomly split into training / test data at a 6:4 ratio
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.4) 

Grid search for comparison

For comparison with Optuna, let's look at grid search. Grid search tries every combination of the specified candidate parameter values and selects the one with the best performance. Using LightGBM as the supervised learner, it looks like this:

%%time
from sklearn.model_selection import GridSearchCV

# LightGBM
import lightgbm as lgb

# Parameters for the grid search
parameters = [{
    'learning_rate':[0.1,0.2],
    'n_estimators':[20,100,200],
    'max_depth':[3,5,7,9],
    'min_child_weight':[0.5,1,2],
    'min_child_samples':[5,10,20],
    'subsample':[0.8],
    'colsample_bytree':[0.8],
    'verbose':[-1],
    'num_leaves':[80]
}]

# Run the grid search
classifier = GridSearchCV(lgb.LGBMClassifier(), parameters, cv=3, n_jobs=-1)
classifier.fit(X_train, y_train)
print("Accuracy score (train): ", classifier.score(X_train, y_train))
print("Accuracy score (test): ", classifier.score(X_test, y_test))
print(classifier.best_estimator_)  # Best parameters
Accuracy score (train):  1.0
Accuracy score (test):  0.9517543859649122
LGBMClassifier(boosting_type='gbdt', class_weight=None, colsample_bytree=0.8,
               importance_type='split', learning_rate=0.1, max_depth=3,
               min_child_samples=20, min_child_weight=0.5, min_split_gain=0.0,
               n_estimators=100, n_jobs=-1, num_leaves=80, objective=None,
               random_state=None, reg_alpha=0.0, reg_lambda=0.0, silent=True,
               subsample=0.8, subsample_for_bin=200000, subsample_freq=0,
               verbose=-1)
CPU times: user 1.15 s, sys: 109 ms, total: 1.26 s
Wall time: 15.3 s
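
For the record, this grid contains 2 × 3 × 4 × 3 × 3 = 216 parameter combinations, and with cv=3 each combination is fitted three times, i.e. 648 fits in total, which is why grid search gets expensive quickly as the grid grows.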

A characteristic of grid search is that it simply tries all of the candidate values specified in advance, so it does not concentrate its search on the regions where good values can be expected.

LightGBM + Optuna

Now, let's tune this LightGBM with Optuna instead of grid search.

import numpy as np
# Objective function
def objective(trial):
    learning_rate = trial.suggest_loguniform('learning_rate', 0.1, 0.2)
    n_estimators = trial.suggest_int('n_estimators', 20, 200)
    max_depth = trial.suggest_int('max_depth', 3, 9)
    min_child_weight = trial.suggest_loguniform('min_child_weight', 0.5, 2)
    min_child_samples = trial.suggest_int('min_child_samples', 5, 20)
    classifier = lgb.LGBMClassifier(learning_rate=learning_rate, 
                                    n_estimators=n_estimators,
                                    max_depth=max_depth, 
                                    min_child_weight=min_child_weight,
                                    min_child_samples=min_child_samples,
                                    subsample=0.8, colsample_bytree=0.8,
                                    verbose=-1, num_leaves=80)
    classifier.fit(X_train, y_train)
    #return classifier.score(X_train, y_train)  # Optimize training accuracy
    return np.linalg.norm(y_train - classifier.predict_proba(X_train)[:, 1], ord=1)  # Optimize the predicted probabilities

I think there is a choice to be made about what to optimize in the function above. The LightGBM used here is a learner that classifies rather than regresses.

Using classifier.score(X_train, y_train) optimizes the training accuracy. Classification accuracy counts how many samples are classified correctly, so although it looks like a continuous value it is really close to a discrete one. For example, if 8 out of 10 samples are classified correctly, the training accuracy is the same whether those answers are "correct with a margin" or "only just correct". In other words, there is little pressure to push "only just correct" answers toward "correct with a margin".

This can be avoided by using np.linalg.norm(y_train - classifier.predict_proba(X_train)[:, 1], ord=1). Here y_train is the set of teacher labels saying whether the class is 0 or 1, and classifier.predict_proba(X_train)[:, 1] is the model's own confidence (corresponding to a probability) that the prediction is 1. Minimizing the L1 norm of the difference between these two (the L2 norm would probably be fine as well) creates pressure to move "incorrect" answers toward "correct" and "only just correct" answers toward "correct with a margin".
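
Here is a tiny illustrative sketch of that difference (the numbers are made up): two sets of predicted probabilities with identical accuracy but very different L1 distance to the labels.

import numpy as np

y_true = np.array([1, 1, 0, 0])
proba_marginal  = np.array([0.55, 0.60, 0.45, 0.40])  # barely correct
proba_confident = np.array([0.95, 0.90, 0.05, 0.10])  # correct with a margin

for proba in (proba_marginal, proba_confident):
    accuracy = ((proba >= 0.5) == y_true).mean()   # 1.0 in both cases
    l1 = np.linalg.norm(y_true - proba, ord=1)     # 1.7 vs 0.3
    print(accuracy, l1)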

Now let's start training. If you use classifier.score(X_train, y_train), choose to maximize it. If you use np.linalg.norm(y_train - classifier.predict_proba(X_train)[:, 1], ord=1), choose to minimize it.

#study = optuna.create_study(direction='maximize')  # Maximize
study = optuna.create_study(direction='minimize')  # Minimize

Run 100 trials like this:

study.optimize(objective, n_trials=100)
[I 2019-12-13 00:32:54,913] Finished trial#0 resulted in value: 1.5655193925176527. Current best value is 1.5655193925176527 with parameters: {'learning_rate': 0.11563458547060446, 'n_estimators': 155, 'max_depth': 7, 'min_child_weight': 0.7324812463494225, 'min_child_samples': 12}.
[I 2019-12-13 00:32:55,103] Finished trial#1 resulted in value: 1.3810988452320123. Current best value is 1.3810988452320123 with parameters: {'learning_rate': 0.15351688726053717, 'n_estimators': 83, 'max_depth': 6, 'min_child_weight': 0.5802652538400225, 'min_child_samples': 8}.
[I 2019-12-13 00:32:55,287] Finished trial#2 resulted in value: 3.519787362063691. Current best value is 1.3810988452320123 with parameters: {'learning_rate': 0.15351688726053717, 'n_estimators': 83, 'max_depth': 6, 'min_child_weight': 0.5802652538400225, 'min_child_samples': 8}.

... (omitted) ...
[I 2019-12-13 00:33:17,608] Finished trial#97 resulted in value: 1.0443245090791662. Current best value is 1.0230542364962214 with parameters: {'learning_rate': 0.11851649444429455, 'n_estimators': 176, 'max_depth': 9, 'min_child_weight': 0.50006741615294, 'min_child_samples': 8}.
[I 2019-12-13 00:33:17,871] Finished trial#98 resulted in value: 1.3997762969822483. Current best value is 1.0230542364962214 with parameters: {'learning_rate': 0.11851649444429455, 'n_estimators': 176, 'max_depth': 9, 'min_child_weight': 0.50006741615294, 'min_child_samples': 8}.
[I 2019-12-13 00:33:18,187] Finished trial#99 resulted in value: 1.1059309199723422. Current best value is 1.0230542364962214 with parameters: {'learning_rate': 0.11851649444429455, 'n_estimators': 176, 'max_depth': 9, 'min_child_weight': 0.50006741615294, 'min_child_samples': 8}.

The parameters that optimize the objective function are

study.best_params
{'learning_rate': 0.11851649444429455,
 'max_depth': 9,
 'min_child_samples': 8,
 'min_child_weight': 0.50006741615294,
 'n_estimators': 176}

Grid search would be unlikely to come up with such finely tuned, in-between values.

And the optimal value at that point is

study.best_value
1.0230542364962214

The optimal parameters obtained can be plugged in with **study.best_params to build the tuned classifier.

classifier = lgb.LGBMClassifier(**study.best_params,
                                subsample=0.8, colsample_bytree=0.8,
                                verbose=-1, num_leaves=80)
classifier
LGBMClassifier(boosting_type='gbdt', class_weight=None, colsample_bytree=0.8,
               importance_type='split', learning_rate=0.11851649444429455,
               max_depth=9, min_child_samples=8,
               min_child_weight=0.50006741615294, min_split_gain=0.0,
               n_estimators=176, n_jobs=-1, num_leaves=80, objective=None,
               random_state=None, reg_alpha=0.0, reg_lambda=0.0, silent=True,
               subsample=0.8, subsample_for_bin=200000, subsample_freq=0,
               verbose=-1)

Train with the best classifier

classifier.fit(X_train, y_train)
LGBMClassifier(boosting_type='gbdt', class_weight=None, colsample_bytree=0.8,
               importance_type='split', learning_rate=0.11851649444429455,
               max_depth=9, min_child_samples=8,
               min_child_weight=0.50006741615294, min_split_gain=0.0,
               n_estimators=176, n_jobs=-1, num_leaves=80, objective=None,
               random_state=None, reg_alpha=0.0, reg_lambda=0.0, silent=True,
               subsample=0.8, subsample_for_bin=200000, subsample_freq=0,
               verbose=-1)

And evaluate

classifier.score(X_train, y_train)
1.0
classifier.score(X_test, y_test)
0.9473684210526315
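
Note that the objective above is measured on the training data, so nothing guarantees better generalization (here the test accuracy even comes out slightly below the grid-search result). One common alternative, not what was done above, is to return a cross-validated score from the objective and maximize it; a sketch under that assumption (objective_cv is just an illustrative name):

from sklearn.model_selection import cross_val_score

def objective_cv(trial):
    params = {
        'learning_rate': trial.suggest_loguniform('learning_rate', 0.1, 0.2),
        'n_estimators': trial.suggest_int('n_estimators', 20, 200),
        'max_depth': trial.suggest_int('max_depth', 3, 9),
        'min_child_weight': trial.suggest_loguniform('min_child_weight', 0.5, 2),
        'min_child_samples': trial.suggest_int('min_child_samples', 5, 20),
    }
    classifier = lgb.LGBMClassifier(subsample=0.8, colsample_bytree=0.8,
                                    verbose=-1, num_leaves=80, **params)
    # Mean 3-fold cross-validated accuracy; use direction='maximize' for this.
    return cross_val_score(classifier, X_train, y_train, cv=3).mean()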

History of the objective values

plt.plot([trial.value for trial in study.trials], label='value')
plt.grid()
plt.legend()
plt.show()

output_40_0.png

Parameter search history

for key in study.trials[0].params.keys():
    plt.plot([trial.params[key] for trial in study.trials], label=key)
    plt.grid()
    plt.legend()
    plt.show()

output_41_0.png

output_41_1.png

output_41_2.png

output_41_3.png

output_41_4.png

That's what it looks like.

scikit-learn/MLP + Optuna

Similarly, let's tune scikit-learn's multi-layer perceptron, starting with grid search.

%%time
from sklearn.model_selection import GridSearchCV

# Multi-layer perceptron
from sklearn.neural_network import MLPClassifier
# Parameters for the grid search
parameters = [{'hidden_layer_sizes': [8, 16, 32, (8, 8), (8, 8, 8)], 
               'solver': ['adam'], 'activation': ['relu'],
              'learning_rate_init': [0.1, 0.01, 0.001]}]
# Run the grid search
classifier = GridSearchCV(MLPClassifier(max_iter=10000, early_stopping=True), 
                          parameters, cv=3, n_jobs=-1)
classifier.fit(X_train, y_train)
print("Accuracy score (train): ", classifier.score(X_train, y_train))
print("Accuracy score (test): ", classifier.score(X_test, y_test))
print(classifier.best_estimator_)  # Classifier with the best parameters
Accuracy score (train):  0.9090909090909091
Accuracy score (test):  0.8728070175438597
MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
              beta_2=0.999, early_stopping=True, epsilon=1e-08,
              hidden_layer_sizes=32, learning_rate='constant',
              learning_rate_init=0.1, max_iter=10000, momentum=0.9,
              n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
              random_state=None, shuffle=True, solver='adam', tol=0.0001,
              validation_fraction=0.1, verbose=False, warm_start=False)
CPU times: user 166 ms, sys: 30.3 ms, total: 196 ms
Wall time: 3.07 s

With the multi-layer perceptron there is the extra factor of the hidden layer configuration (how many layers, and how many units in each), which makes things a little harder.

Fixing the hidden layers to 1 layer

# Objective function
def objective(trial):
    hidden_layer_sizes = trial.suggest_int('hidden_layer_sizes', 8, 100)
    learning_rate_init = trial.suggest_loguniform('learning_rate_init', 0.001, 0.1)
    classifier = MLPClassifier(max_iter=10000, early_stopping=True,
                                    hidden_layer_sizes=hidden_layer_sizes,
                                    learning_rate_init=learning_rate_init, 
                                    solver='adam', activation='relu')
    classifier.fit(X_train, y_train)
    #return classifier.score(X_train, y_train)
    #return classifier.score(X_test, y_test)
    return np.linalg.norm(y_train - classifier.predict_proba(X_train)[:, 1], ord=1)
#study = optuna.create_study(direction='maximize')
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=100)
[I 2019-12-13 00:33:23,314] Finished trial#0 resulted in value: 33.67375867333378. Current best value is 33.67375867333378 with parameters: {'hidden_layer_sizes': 73, 'learning_rate_init': 0.004548472515805296}.
[I 2019-12-13 00:33:23,538] Finished trial#1 resulted in value: 35.17385235930611. Current best value is 33.67375867333378 with parameters: {'hidden_layer_sizes': 73, 'learning_rate_init': 0.004548472515805296}.
[I 2019-12-13 00:33:23,716] Finished trial#2 resulted in value: 52.815452458627675. Current best value is 33.67375867333378 with parameters: {'hidden_layer_sizes': 73, 'learning_rate_init': 0.004548472515805296}.

... (omitted) ...
[I 2019-12-13 00:33:47,631] Finished trial#97 resulted in value: 150.15953891394736. Current best value is 23.844866313445344 with parameters: {'hidden_layer_sizes': 79, 'learning_rate_init': 0.010242027297662661}.
[I 2019-12-13 00:33:47,894] Finished trial#98 resulted in value: 32.56506872305802. Current best value is 23.844866313445344 with parameters: {'hidden_layer_sizes': 79, 'learning_rate_init': 0.010242027297662661}.
[I 2019-12-13 00:33:48,172] Finished trial#99 resulted in value: 38.57363524502563. Current best value is 23.844866313445344 with parameters: {'hidden_layer_sizes': 79, 'learning_rate_init': 0.010242027297662661}.

study.best_params
{'hidden_layer_sizes': 79, 'learning_rate_init': 0.010242027297662661}
study.best_value
23.844866313445344
classifier = MLPClassifier(**study.best_params)
classifier.fit(X_train, y_train)
classifier.score(X_train, y_train), classifier.score(X_test, y_test)
(0.9472140762463344, 0.9122807017543859)
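
One caveat: the classifier rebuilt above omits the max_iter=10000 and early_stopping=True settings that were used inside the objective, so it is not exactly the configuration that was evaluated. A sketch of a more faithful reconstruction would be:

classifier = MLPClassifier(max_iter=10000, early_stopping=True,
                           solver='adam', activation='relu',
                           **study.best_params)
classifier.fit(X_train, y_train)
classifier.score(X_train, y_train), classifier.score(X_test, y_test)
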
plt.plot([trial.value for trial in study.trials], label='score')
plt.grid()
plt.legend()
plt.show()

output_50_0.png

for key in study.trials[0].params.keys():
    plt.plot([trial.params[key] for trial in study.trials], label=key)
    plt.grid()
    plt.legend()
    plt.show()

output_51_0.png

output_51_1.png

Fixing the hidden layers to 2 layers

# Objective function
def objective(trial):
    h1 = trial.suggest_int('h1', 8, 100)
    h2 = trial.suggest_int('h2', 8, 100)
    learning_rate_init = trial.suggest_loguniform('learning_rate_init', 0.001, 0.1)
    classifier = MLPClassifier(max_iter=10000, early_stopping=True,
                                    hidden_layer_sizes=(h1, h2),
                                    learning_rate_init=learning_rate_init, 
                                    solver='adam', activation='relu')
    classifier.fit(X_train, y_train)
    #return classifier.score(X_train, y_train)
    #return classifier.score(X_test, y_test)
    return np.linalg.norm(y_train - classifier.predict_proba(X_train)[:, 1], ord=1)
#study = optuna.create_study(direction='maximize')
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=100)
[I 2019-12-13 00:33:49,555] Finished trial#0 resulted in value: 44.26353774856942. Current best value is 44.26353774856942 with parameters: {'h1': 15, 'h2': 99, 'learning_rate_init': 0.003018305556292618}.
[I 2019-12-13 00:33:49,851] Finished trial#1 resulted in value: 29.450960862380153. Current best value is 29.450960862380153 with parameters: {'h1': 81, 'h2': 79, 'learning_rate_init': 0.01344672244443261}.
[I 2019-12-13 00:33:50,073] Finished trial#2 resulted in value: 38.96850500173973. Current best value is 29.450960862380153 with parameters: {'h1': 81, 'h2': 79, 'learning_rate_init': 0.01344672244443261}.

... (omitted) ...
[I 2019-12-13 00:34:19,151] Finished trial#97 resulted in value: 34.73946747640069. Current best value is 22.729638264213385 with parameters: {'h1': 73, 'h2': 91, 'learning_rate_init': 0.005367313373989512}.
[I 2019-12-13 00:34:19,472] Finished trial#98 resulted in value: 38.708695477563566. Current best value is 22.729638264213385 with parameters: {'h1': 73, 'h2': 91, 'learning_rate_init': 0.005367313373989512}.
[I 2019-12-13 00:34:19,801] Finished trial#99 resulted in value: 42.20352641425415. Current best value is 22.729638264213385 with parameters: {'h1': 73, 'h2': 91, 'learning_rate_init': 0.005367313373989512}.

study.best_params
{'h1': 73, 'h2': 91, 'learning_rate_init': 0.005367313373989512}
study.best_value
22.729638264213385

When I try to use the best parameters via **study.best_params, I get the following error. I don't know an easy workaround at the moment, so all I can think of is substituting the obtained parameters by hand (a sketch of that follows after the traceback).

classifier = MLPClassifier(**study.best_params)
classifier.fit(X_train, y_train)
classifier.score(X_train, y_train), classifier.score(X_test, y_test)
---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-49-ee91971a40bc> in <module>()
----> 1 classifier = MLPClassifier(**study.best_params)
      2 classifier.fit(X_train, y_train)
      3 classifier.score(X_train, y_train), classifier.score(X_test, y_test)


TypeError: __init__() got an unexpected keyword argument 'h1'
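
A sketch of that manual substitution (using the parameter names h1, h2 and learning_rate_init defined above):

best = study.best_params
classifier = MLPClassifier(max_iter=10000, early_stopping=True,
                           hidden_layer_sizes=(best['h1'], best['h2']),
                           learning_rate_init=best['learning_rate_init'],
                           solver='adam', activation='relu')
classifier.fit(X_train, y_train)
classifier.score(X_train, y_train), classifier.score(X_test, y_test)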

Various histories

plt.plot([trial.value for trial in study.trials], label='value')
plt.grid()
plt.legend()
plt.show()

output_58_0.png

for key in study.trials[0].params.keys():
    plt.plot([trial.params[key] for trial in study.trials], label=key)
    plt.grid()
    plt.legend()
    plt.show()

output_59_0.png

output_59_1.png

output_59_2.png

Without fixing the number of hidden layers

# Objective function
def objective(trial):
    h1 = trial.suggest_int('h1', 8, 100)
    h2 = trial.suggest_int('h2', 8, 100)
    h3 = trial.suggest_int('h3', 8, 100)
    h4 = trial.suggest_int('h4', 8, 100)
    h5 = trial.suggest_int('h5', 8, 100)
    n = trial.suggest_int('n', 1, 5)
    # Use only the first n of the five suggested layer sizes.
    hidden_layer_sizes = [h1, h2, h3, h4, h5][:n]
    learning_rate_init = trial.suggest_loguniform('learning_rate_init', 0.001, 0.1)
    classifier = MLPClassifier(max_iter=10000, early_stopping=True,
                                    hidden_layer_sizes=hidden_layer_sizes,
                                    learning_rate_init=learning_rate_init, 
                                    solver='adam', activation='relu')
    classifier.fit(X_train, y_train)
    #return classifier.score(X_train, y_train)
    #return classifier.score(X_test, y_test)
    return np.linalg.norm(y_train - classifier.predict_proba(X_train)[:, 1], ord=1)
#study = optuna.create_study(direction='maximize')
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=100)
[I 2019-12-13 00:34:37,028] Finished trial#0 resulted in value: 117.6950339936551. Current best value is 117.6950339936551 with parameters: {'h1': 44, 'h2': 90, 'h3': 75, 'h4': 51, 'h5': 87, 'n': 3, 'learning_rate_init': 0.043829528929494495}.
[I 2019-12-13 00:34:37,247] Finished trial#1 resulted in value: 107.63845860162616. Current best value is 107.63845860162616 with parameters: {'h1': 16, 'h2': 51, 'h3': 13, 'h4': 36, 'h5': 27, 'n': 3, 'learning_rate_init': 0.04986625228277607}.
[I 2019-12-13 00:34:37,513] Finished trial#2 resulted in value: 198.86827020586986. Current best value is 107.63845860162616 with parameters: {'h1': 16, 'h2': 51, 'h3': 13, 'h4': 36, 'h5': 27, 'n': 3, 'learning_rate_init': 0.04986625228277607}.

... (omitted) ...
[I 2019-12-13 00:35:10,424] Finished trial#97 resulted in value: 31.485260318520005. Current best value is 23.024826770529504 with parameters: {'h1': 62, 'h2': 60, 'h3': 58, 'h4': 77, 'h5': 27, 'n': 1, 'learning_rate_init': 0.011342241271350882}.
[I 2019-12-13 00:35:10,801] Finished trial#98 resulted in value: 27.752591077771235. Current best value is 23.024826770529504 with parameters: {'h1': 62, 'h2': 60, 'h3': 58, 'h4': 77, 'h5': 27, 'n': 1, 'learning_rate_init': 0.011342241271350882}.
[I 2019-12-13 00:35:11,199] Finished trial#99 resulted in value: 81.29419572506973. Current best value is 23.024826770529504 with parameters: {'h1': 62, 'h2': 60, 'h3': 58, 'h4': 77, 'h5': 27, 'n': 1, 'learning_rate_init': 0.011342241271350882}.

study.best_params
{'h1': 62,
 'h2': 60,
 'h3': 58,
 'h4': 77,
 'h5': 27,
 'learning_rate_init': 0.011342241271350882,
 'n': 1}
study.best_value
23.024826770529504
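
Because all five layer sizes are suggested on every trial, h2 through h5 appear in best_params even though n=1 means only h1 is actually used. A sketch of an alternative objective (same search ranges assumed) that suggests only the sizes it will actually use, taking advantage of Optuna's define-by-run style:

def objective(trial):
    n = trial.suggest_int('n', 1, 5)
    # Only the n layer sizes actually used are suggested and recorded.
    hidden_layer_sizes = [trial.suggest_int(f'h{i}', 8, 100) for i in range(n)]
    learning_rate_init = trial.suggest_loguniform('learning_rate_init', 0.001, 0.1)
    classifier = MLPClassifier(max_iter=10000, early_stopping=True,
                               hidden_layer_sizes=hidden_layer_sizes,
                               learning_rate_init=learning_rate_init,
                               solver='adam', activation='relu')
    classifier.fit(X_train, y_train)
    return np.linalg.norm(y_train - classifier.predict_proba(X_train)[:, 1], ord=1)
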
plt.plot([trial.value for trial in study.trials], label='score')
plt.grid()
plt.legend()
plt.show()

output_65_0.png

for key in study.trials[0].params.keys():
    plt.plot([trial.params[key] for trial in study.trials], label=key)
    plt.grid()
    plt.legend()
    plt.show()

output_66_0.png

output_66_1.png

output_66_2.png

output_66_3.png

output_66_4.png

output_66_5.png

output_66_6.png

PyTorch + Optuna

Similarly to the above, I tried optimizing a multi-layer perceptron with PyTorch + Optuna.

import torch
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader

X_train = torch.from_numpy(X_train).float()
X_test = torch.from_numpy(X_test).float()
y_train = torch.from_numpy(y_train).float()
y_test = torch.from_numpy(y_test).float()

train = TensorDataset(X_train, y_train)

train_loader = DataLoader(train, batch_size=10, shuffle=True)
import torch
class MLPC(torch.nn.Module):
    def __init__(self, n_input, n_hidden1, n_output):
        super(MLPC, self).__init__()
        self.l1 = torch.nn.Linear(n_input, n_hidden1)
        self.l2 = torch.nn.Linear(n_hidden1, n_output)

    def forward(self, x):
        h1 = self.l1(x)
        h2 = torch.sigmoid(h1)
        h3 = self.l2(h2)
        h4 = torch.sigmoid(h3)
        return h4
    
    def score(self, x, y, threshold=0.5):
        accum = 0
        for y_pred, y1 in zip(self.forward(x), y):
            if y1 == 1:
                if y_pred >= threshold:
                    accum += 1
            else:
                if y_pred < threshold:
                    accum += 1
        return accum / len(y)
# Objective function
from torch.autograd import Variable
def objective(trial):
    n_h1 = trial.suggest_int('n_hidden1', 1, 100)
    lr = trial.suggest_loguniform('lr', 0.001, 0.1)
    model = MLPC(len(train[0][0]), n_h1, 1)
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    #loss_history = []
    n_epoch = 2000
    for epoch in range(n_epoch):
        total_loss = 0
        for x, y in train_loader:
            x = Variable(x)
            y = Variable(y)
            optimizer.zero_grad()
            y_pred = model(x)
            loss = criterion(y_pred, y)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        #loss_history.append(total_loss)
        #if (epoch +1) % (n_epoch / 10) == 0:
        #    print(epoch + 1, total_loss)
    score_train_history.append(model.score(X_train, y_train))
    score_test_history.append(model.score(X_test, y_test))
    return total_loss  # returning model.score(X_test, y_test) here did not seem to let learning progress

Failure

n_trials=100
score_train_history = []
score_test_history = []
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=n_trials)
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py:431: UserWarning:

Using a target size (torch.Size([10])) that is different to the input size (torch.Size([10, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py:431: UserWarning:

Using a target size (torch.Size([1])) that is different to the input size (torch.Size([1, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.

[I 2019-12-13 00:36:09,957] Finished trial#0 resulted in value: 8.354197099804878. Current best value is 8.354197099804878 with parameters: {'n_hidden1': 50, 'lr': 0.008643921209550006}.
[I 2019-12-13 00:37:02,256] Finished trial#1 resulted in value: 8.542565807700157. Current best value is 8.354197099804878 with parameters: {'n_hidden1': 50, 'lr': 0.008643921209550006}.
[I 2019-12-13 00:37:54,087] Finished trial#2 resulted in value: 8.721126735210419. Current best value is 8.354197099804878 with parameters: {'n_hidden1': 50, 'lr': 0.008643921209550006}.

... (omitted) ...
[I 2019-12-13 01:59:43,405] Finished trial#97 resulted in value: 8.414046227931976. Current best value is 8.206612035632133 with parameters: {'n_hidden1': 82, 'lr': 0.0010109929013465883}.
[I 2019-12-13 02:00:36,203] Finished trial#98 resulted in value: 8.469094559550285. Current best value is 8.206612035632133 with parameters: {'n_hidden1': 82, 'lr': 0.0010109929013465883}.
[I 2019-12-13 02:01:28,698] Finished trial#99 resulted in value: 8.296677514910698. Current best value is 8.206612035632133 with parameters: {'n_hidden1': 82, 'lr': 0.0010109929013465883}.

This first run is the failure case. Some warnings came out. I carried on without worrying about them, but, as warned, the accuracy did not improve.

study.best_params
{'lr': 0.0010109929013465883, 'n_hidden1': 82}
study.best_value
8.206612035632133
plt.plot([trial.value for trial in study.trials], label='loss')
plt.grid()
plt.legend()
plt.show()

output_74_0.png

plt.plot(score_train_history, label='score (train)')
plt.plot(score_test_history, label='score (test)')
plt.grid()
plt.legend()
plt.show()

output_75_0.png

for key in study.trials[0].params.keys():
    plt.plot([trial.params[key] for trial in study.trials], label=key)
    plt.grid()
    plt.legend()
    plt.show()

output_76_0.png

output_76_1.png

Success

So what was wrong with the "failure" above? The warning message

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py:431: UserWarning:

Using a target size (torch.Size([10])) that is different to the input size (torch.Size([10, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py:431: UserWarning:

Using a target size (torch.Size([1])) that is different to the input size (torch.Size([1, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.

means that the shape of the target-variable array is not right. This time,

# https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html
from sklearn.datasets import load_breast_cancer
breast_cancer = load_breast_cancer()
X = breast_cancer.data
y = breast_cancer.target

is how I had created the explanatory variable and the target variable, but y needs to be reshaped here.

y = y.reshape((len(y), 1))

That was the only cause of the failure. Other than that, exactly the same code as before now improves the accuracy. Let's compare it with the failed run.
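
As an aside, an equivalent fix (a sketch, not what was done here) would be to leave y one-dimensional and align the shapes inside the training loop instead:

# Inside the training loop: squeeze the (batch, 1) model output down to (batch,)
y_pred = model(x).squeeze(1)
loss = criterion(y_pred, y)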

# Import the function for splitting into training and test data
from sklearn.model_selection import train_test_split 
# Randomly split into training / test data at a 6:4 ratio
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.4) 
import torch
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader

X_train = torch.from_numpy(X_train).float()
X_test = torch.from_numpy(X_test).float()
y_train = torch.from_numpy(y_train).float()
y_test = torch.from_numpy(y_test).float()

train = TensorDataset(X_train, y_train)

train_loader = DataLoader(train, batch_size=10, shuffle=True)
import torch
class MLPC(torch.nn.Module):
    def __init__(self, n_input, n_hidden1, n_output):
        super(MLPC, self).__init__()
        self.l1 = torch.nn.Linear(n_input, n_hidden1)
        self.l2 = torch.nn.Linear(n_hidden1, n_output)

    def forward(self, x):
        h1 = self.l1(x)
        h2 = torch.sigmoid(h1)
        h3 = self.l2(h2)
        h4 = torch.sigmoid(h3)
        return h4
    
    def score(self, x, y, threshold=0.5):
        accum = 0
        for y_pred, y1 in zip(self.forward(x), y):
            if y1 == 1:
                if y_pred >= threshold:
                    accum += 1
            else:
                if y_pred < threshold:
                    accum += 1
        return accum / len(y)
# Objective function
from torch.autograd import Variable
def objective(trial):
    n_h1 = trial.suggest_int('n_hidden1', 1, 100)
    lr = trial.suggest_loguniform('lr', 0.001, 0.1)
    model = MLPC(len(train[0][0]), n_h1, 1)
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    #loss_history = []
    n_epoch = 2000
    for epoch in range(n_epoch):
        total_loss = 0
        for x, y in train_loader:
            #if x.shape[0] == 1:
            #    continue
            #print(x.shape, y.shape)
            x = Variable(x)
            y = Variable(y)
            optimizer.zero_grad()
            y_pred = model(x)
            loss = criterion(y_pred, y)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        #loss_history.append(total_loss)
        #if (epoch +1) % (n_epoch / 10) == 0:
        #    print(epoch + 1, total_loss)
    score_train_history.append(model.score(X_train, y_train))
    score_test_history.append(model.score(X_test, y_test))
    return total_loss  # returning model.score(X_test, y_test) here did not seem to let learning progress
n_trials=100
score_train_history = []
score_test_history = []
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=n_trials)
[I 2019-12-13 07:58:42,273] Finished trial#0 resulted in value: 7.991558387875557. Current best value is 7.991558387875557 with parameters: {'n_hidden1': 100, 'lr': 0.001719688534454947}.
[I 2019-12-13 07:59:29,221] Finished trial#1 resulted in value: 8.133784644305706. Current best value is 7.991558387875557 with parameters: {'n_hidden1': 100, 'lr': 0.001719688534454947}.
[I 2019-12-13 08:00:16,849] Finished trial#2 resulted in value: 8.075047567486763. Current best value is 7.991558387875557 with parameters: {'n_hidden1': 100, 'lr': 0.001719688534454947}.

... (omitted) ...
[I 2019-12-13 09:14:47,236] Finished trial#97 resulted in value: 8.02999284863472. Current best value is 2.8610200360417366 with parameters: {'n_hidden1': 38, 'lr': 0.0010151912634053866}.
[I 2019-12-13 09:15:34,106] Finished trial#98 resulted in value: 5.849344417452812. Current best value is 2.8610200360417366 with parameters: {'n_hidden1': 38, 'lr': 0.0010151912634053866}.
[I 2019-12-13 09:16:20,332] Finished trial#99 resulted in value: 8.052950218319893. Current best value is 2.8610200360417366 with parameters: {'n_hidden1': 38, 'lr': 0.0010151912634053866}.

study.best_params
{'lr': 0.0010151912634053866, 'n_hidden1': 38}
study.best_value
2.8610200360417366
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot([trial.value for trial in study.trials], label='loss')
plt.grid()
plt.legend()
plt.show()

output_13_0.png

plt.plot(score_train_history, label='score (train)')
plt.plot(score_test_history, label='score (test)')
plt.grid()
plt.legend()
plt.show()

output_14_0.png

for key in study.trials[0].params.keys():
    plt.plot([trial.params[key] for trial in study.trials], label=key)
    plt.grid()
    plt.legend()
    plt.show()

output_15_0.png

output_15_1.png
