Metrics

In Auto-sklearn, a model is optimized with respect to a single metric, which can be either built-in or user-defined. It is also possible to calculate several additional metrics for every run. The following example shows how to calculate built-in and self-defined metrics for a classification problem.

import autosklearn.classification
import autosklearn.metrics
import numpy as np
import pandas as pd
import sklearn.datasets
import sklearn.metrics
import sklearn.model_selection
from autosklearn.metrics import balanced_accuracy, precision, recall, f1


def error(solution, prediction):
    # Custom metric: fraction of samples whose prediction differs from the label
    return np.mean(solution != prediction)


def get_metric_result(cv_results):
    # Tabulate successful runs: rank, classifier, optimized score, extra metrics
    results = pd.DataFrame.from_dict(cv_results)
    results = results[results['status'] == "Success"]
    cols = ['rank_test_scores', 'param_classifier:__choice__', 'mean_test_score']
    # Every entry of scoring_functions is reported under a 'metric_<name>' key
    cols.extend([key for key in cv_results.keys() if key.startswith('metric_')])
    return results[cols]
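To make the two helpers concrete, the following minimal sketch evaluates the same `error` definition on hand-made labels and reproduces the `metric_` key selection on a plain dictionary. The toy labels and the dictionary are hypothetical stand-ins chosen for illustration; they are not output of Auto-sklearn.

```python
import numpy as np


def error(solution, prediction):
    # Same definition as above: fraction of mismatched labels.
    return np.mean(solution != prediction)


# Hypothetical toy labels, used only for illustration.
y_true = np.array([1, 0, 1, 1])
y_pred = np.array([1, 1, 1, 0])

toy_error = error(y_true, y_pred)          # 2 of the 4 labels differ
toy_accuracy = np.mean(y_true == y_pred)   # the remaining 2 agree

print(toy_error)                  # 0.5
print(toy_error + toy_accuracy)   # 1.0

# Hypothetical stand-in for cv_results_, holding only key names.
toy_cv_results = {
    "rank_test_scores": [1],
    "mean_test_score": [0.5],
    "metric_f1": [0.5],
    "metric_custom_error": [0.5],
}

# Same selection rule as in get_metric_result.
metric_cols = [key for key in toy_cv_results if key.startswith("metric_")]
print(metric_cols)   # ['metric_f1', 'metric_custom_error']
```

For hard class labels, this error is simply one minus the accuracy.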

Data loading

X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = \
    sklearn.model_selection.train_test_split(X, y, random_state=1)
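As a rough sanity check on the split, the sketch below works out the expected set sizes by hand. It assumes the 569 samples of the breast cancer dataset and scikit-learn's default `test_size` of 0.25, with the test count rounded up; the accuracy value is the one printed at the end of this example, and the count of correct predictions is inferred from it rather than read from any output.

```python
import math

n_samples = 569     # samples in the breast cancer dataset
test_size = 0.25    # assumed default, since no size is passed above

n_test = math.ceil(test_size * n_samples)   # test count is rounded up
n_train = n_samples - n_test
print(n_train, n_test)                      # 426 143

# Accuracy printed at the end of this example.
reported_accuracy = 0.958041958041958

# Inferred number of correctly classified test samples.
n_correct = round(reported_accuracy * n_test)
print(n_correct)                            # 137
print(abs(n_correct / n_test - reported_accuracy) < 1e-12)   # True
```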

Build and fit a classifier

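# Wrap the custom function as an Auto-sklearn scorer
error_rate = autosklearn.metrics.make_scorer(
    name='custom_error',
    score_func=error,
    optimum=0,  # best possible value: no misclassified samples
    greater_is_better=False,  # an error rate is a loss, so lower is better
    needs_proba=False,  # the function compares class labels, not probabilities
    needs_threshold=False
)
cls = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=120,
    per_run_time_limit=30,
    # Metrics calculated for every run in addition to the optimized metric
    scoring_functions=[balanced_accuracy, precision, recall, f1, error_rate]
)
cls.fit(X_train, y_train, X_test, y_test)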

Out:

/home/runner/work/auto-sklearn/auto-sklearn/autosklearn/metalearning/metalearning/meta_base.py:68: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  self.metafeatures = self.metafeatures.append(metafeatures)
/home/runner/work/auto-sklearn/auto-sklearn/autosklearn/metalearning/metalearning/meta_base.py:72: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  self.algorithm_runs[metric].append(runs)

AutoSklearnClassifier(per_run_time_limit=30,
                      scoring_functions=[balanced_accuracy, precision, recall,
                                         f1, custom_error],
                      time_left_for_this_task=120)
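The `greater_is_better=False` flag marks the custom error as a loss. A common way for a library to handle such metrics is to flip their sign so that every scorer can be maximized. The sketch below illustrates that general idea with a hypothetical `SignedScorer` class; it is not Auto-sklearn's implementation and uses a pure-Python variant of `error`.

```python
class SignedScorer:
    """Hypothetical wrapper for illustration; not part of Auto-sklearn."""

    def __init__(self, name, score_func, optimum, greater_is_better):
        self.name = name
        self.score_func = score_func
        self.optimum = optimum
        self.sign = 1 if greater_is_better else -1

    def __call__(self, y_true, y_pred):
        # Flip the sign of losses so that larger values are always better.
        return self.sign * self.score_func(y_true, y_pred)


def error(solution, prediction):
    # Pure-Python variant of the custom error: fraction of mismatches.
    mismatches = sum(s != p for s, p in zip(solution, prediction))
    return mismatches / len(solution)


scorer = SignedScorer("custom_error", error, optimum=0, greater_is_better=False)

good = scorer([1, 0, 1, 1], [1, 0, 1, 1])   # no mistakes
bad = scorer([1, 0, 1, 1], [0, 1, 1, 1])    # two mistakes out of four

print(bad)          # -0.5
print(good > bad)   # True: fewer mistakes gives the larger signed score
```

With this convention a search procedure only ever needs to maximize, whether the underlying metric is a score or a loss.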

Get the score of the final ensemble

predictions = cls.predict(X_test)
print("Accuracy score", sklearn.metrics.accuracy_score(y_test, predictions))

print("#" * 80)
print("Metric results")
print(get_metric_result(cls.cv_results_).to_string(index=False))

Out:

Accuracy score 0.958041958041958
################################################################################
Metric results
 rank_test_scores param_classifier:__choice__  mean_test_score  metric_balanced_accuracy  metric_precision  metric_recall  metric_f1  metric_custom_error
                4               random_forest         0.971631                  0.969533          0.977528       0.977528   0.977528             0.028369
                4                         mlp         0.971631                  0.961538          0.956989       1.000000   0.978022             0.028369
               26                         mlp         0.943262                  0.935069          0.945055       0.966292   0.955556             0.056738
               15               random_forest         0.964539                  0.959918          0.966667       0.977528   0.972067             0.035461
                4                         mlp         0.971631                  0.961538          0.956989       1.000000   0.978022             0.028369
                1                 extra_trees         0.985816                  0.984767          0.988764       0.988764   0.988764             0.014184
               15               random_forest         0.964539                  0.963915          0.977273       0.966292   0.971751             0.035461
               20                 extra_trees         0.957447                  0.954300          0.966292       0.966292   0.966292             0.042553
                4               random_forest         0.971631                  0.969533          0.977528       0.977528   0.977528             0.028369
                4               random_forest         0.971631                  0.969533          0.977528       0.977528   0.977528             0.028369
               15           gradient_boosting         0.964539                  0.963915          0.977273       0.966292   0.971751             0.035461
                4           gradient_boosting         0.971631                  0.965536          0.967033       0.988764   0.977778             0.028369
                4                         mlp         0.971631                  0.965536          0.967033       0.988764   0.977778             0.028369
               22                         mlp         0.950355                  0.948682          0.965909       0.955056   0.960452             0.049645
                2           gradient_boosting         0.978723                  0.975151          0.977778       0.988764   0.983240             0.021277
               15           gradient_boosting         0.964539                  0.959918          0.966667       0.977528   0.972067             0.035461
               15               random_forest         0.964539                  0.959918          0.966667       0.977528   0.972067             0.035461
                4                 extra_trees         0.971631                  0.969533          0.977528       0.977528   0.977528             0.028369
               30          passive_aggressive         0.921986                  0.894231          0.890000       1.000000   0.941799             0.078014
                2                 extra_trees         0.978723                  0.975151          0.977778       0.988764   0.983240             0.021277
                4           gradient_boosting         0.971631                  0.965536          0.967033       0.988764   0.977778             0.028369
               22                         mlp         0.950355                  0.940687          0.945652       0.977528   0.961326             0.049645
               29               random_forest         0.929078                  0.923833          0.943820       0.943820   0.943820             0.070922
               20                    adaboost         0.957447                  0.950303          0.956044       0.977528   0.966667             0.042553
                4                 extra_trees         0.971631                  0.965536          0.967033       0.988764   0.977778             0.028369
                4                 extra_trees         0.971631                  0.969533          0.977528       0.977528   0.977528             0.028369
               33                         lda         0.794326                  0.749136          0.788462       0.921348   0.849741             0.205674
               32                 gaussian_nb         0.858156                  0.871651          0.948052       0.820225   0.879518             0.141844
               22                    adaboost         0.950355                  0.952679          0.976744       0.943820   0.960000             0.049645
               26                 gaussian_nb         0.943262                  0.927074          0.926316       0.988764   0.956522             0.056738
               31         k_nearest_neighbors         0.907801                  0.882995          0.887755       0.977528   0.930481             0.092199
               26               decision_tree         0.943262                  0.947061          0.976471       0.932584   0.954023             0.056738
               22                 extra_trees         0.950355                  0.936690          0.936170       0.988764   0.961749             0.049645
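Two relationships between the columns can be checked by hand. F1 is the harmonic mean of precision and recall, and in every row `mean_test_score` and `metric_custom_error` sum to one, which is consistent with the optimized metric being accuracy (an inference, since the example does not set the metric explicitly). The sketch below verifies both on two rows copied from the table.

```python
# (mean_test_score, precision, recall, f1, custom_error), copied from the table
rows = [
    (0.971631, 0.956989, 1.000000, 0.978022, 0.028369),   # an mlp row
    (0.921986, 0.890000, 1.000000, 0.941799, 0.078014),   # passive_aggressive
]

TOLERANCE = 1e-5   # the table is printed with six decimals

checks = []
for score, precision, recall, f1, custom_error in rows:
    # F1 recomputed as the harmonic mean of precision and recall.
    f1_from_pr = 2 * precision * recall / (precision + recall)
    f1_matches = abs(f1_from_pr - f1) < TOLERANCE

    # The custom error complements the optimized score.
    sums_to_one = abs(score + custom_error - 1.0) < TOLERANCE

    checks.append((f1_matches, sums_to_one))

print(checks)   # [(True, True), (True, True)]
```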

Total running time of the script: (1 minute 57.127 seconds)
