
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/40_advanced/example_inspect_predictions.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_examples_40_advanced_example_inspect_predictions.py>`
        to download the full example code or to run this example in your browser via Binder

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_40_advanced_example_inspect_predictions.py:


=================
Model Explanation
=================

The following example shows how to fit a simple classification model with
*auto-sklearn* and use the `inspection module `_ from scikit-learn to
understand what affects the predictions.

.. GENERATED FROM PYTHON SOURCE LINES 11-17

.. code-block:: default

    import sklearn.datasets
    import sklearn.model_selection
    from sklearn.inspection import plot_partial_dependence, permutation_importance
    import matplotlib.pyplot as plt
    import autosklearn.classification


.. GENERATED FROM PYTHON SOURCE LINES 18-25

Load Data and Build a Model
===========================

We start by loading the "Run or walk" dataset from OpenML and training an
auto-sklearn model on it. For this dataset, the goal is to predict whether a
person is running or walking based on accelerometer and gyroscope data
collected by a phone. For more information see `here `_.

.. GENERATED FROM PYTHON SOURCE LINES 25-48

.. code-block:: default

    dataset = sklearn.datasets.fetch_openml(data_id=40922)

    # Note: To speed up the example, we subsample the dataset
    dataset.data = dataset.data.sample(n=5000, random_state=1, axis="index")
    dataset.target = dataset.target[dataset.data.index]
    X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
        dataset.data, dataset.target, test_size=0.3, random_state=1
    )

    automl = autosklearn.classification.AutoSklearnClassifier(
        time_left_for_this_task=120,
        per_run_time_limit=30,
        tmp_folder="/tmp/autosklearn_inspect_predictions_example_tmp",
    )
    automl.fit(X_train, y_train, dataset_name="Run_or_walk_information")

    s = automl.score(X_train, y_train)
    print(f"Train score {s}")
    s = automl.score(X_test, y_test)
    print(f"Test score {s}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Fitting to the training data: 0%| | 0/120 [00:00


Compute permutation importance - part 1
=======================================

We first look at the permutation importance, which is defined as the decrease
in a model score when a given feature is randomly permuted. The higher the
score, the more the model's predictions depend on this feature.

**Note:** There are some pitfalls in interpreting these numbers, which can be
found in the `scikit-learn docs `_.

.. GENERATED FROM PYTHON SOURCE LINES 60-78

.. code-block:: default

    r = permutation_importance(automl, X_test, y_test, n_repeats=10, random_state=0)

    # Sort features by mean importance, most important first
    sort_idx = r.importances_mean.argsort()[::-1]

    plt.boxplot(
        r.importances[sort_idx].T, labels=[dataset.feature_names[i] for i in sort_idx]
    )

    plt.xticks(rotation=90)
    plt.tight_layout()
    plt.show()

    # Print the features from least to most important
    for i in sort_idx[::-1]:
        print(
            f"{dataset.feature_names[i]:10s}: {r.importances_mean[i]:.3f} +/- "
            f"{r.importances_std[i]:.3f}"
        )


.. image-sg:: /examples/40_advanced/images/sphx_glr_example_inspect_predictions_001.png
   :alt: example inspect predictions
   :srcset: /examples/40_advanced/images/sphx_glr_example_inspect_predictions_001.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    gyro_y    : 0.000 +/- 0.002
    gyro_x    : 0.029 +/- 0.003
    gyro_z    : 0.040 +/- 0.003
    acceleration_x: 0.058 +/- 0.007
    acceleration_z: 0.131 +/- 0.006
    acceleration_y: 0.276 +/- 0.006


.. GENERATED FROM PYTHON SOURCE LINES 79-94

Create partial dependence (PD) and individual conditional expectation (ICE) plots - part 2
==========================================================================================

`ICE plots `_ describe the relation between feature values and the response
value for each sample individually: they show how the response value changes
when the value of one feature is changed.

`PD plots `_ describe the relation between feature values and the response
value, i.e. the expected response value with respect to one or more input
features. Since we use a classification dataset, this corresponds to the
predicted class probability.

Since ``acceleration_y`` and ``acceleration_z`` turned out to have the largest
impact on the response value according to the permutation importance, we'll
look at them first and generate a plot combining ICE (thin lines) and PD
(thick line).

.. GENERATED FROM PYTHON SOURCE LINES 94-107

.. code-block:: default

    # Column indices 1 and 2 are acceleration_y and acceleration_z
    features = [1, 2]
    plot_partial_dependence(
        automl,
        dataset.data,
        features=features,
        grid_resolution=5,
        kind="both",
        feature_names=dataset.feature_names,
    )
    plt.tight_layout()
    plt.show()


.. image-sg:: /examples/40_advanced/images/sphx_glr_example_inspect_predictions_002.png
   :alt: example inspect predictions
   :srcset: /examples/40_advanced/images/sphx_glr_example_inspect_predictions_002.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 108-113

Create partial dependence (PD) plots for more than one feature - part 3
=======================================================================

A PD plot can also be generated for two features at once, which makes it
possible to inspect the interaction between them. Again, we'll look at
``acceleration_y`` and ``acceleration_z``.

.. GENERATED FROM PYTHON SOURCE LINES 113-124

.. code-block:: default

    # A nested list requests one two-way plot for the feature pair
    features = [[1, 2]]
    plot_partial_dependence(
        automl,
        dataset.data,
        features=features,
        grid_resolution=5,
        feature_names=dataset.feature_names,
    )
    plt.tight_layout()
    plt.show()


.. image-sg:: /examples/40_advanced/images/sphx_glr_example_inspect_predictions_003.png
   :alt: example inspect predictions
   :srcset: /examples/40_advanced/images/sphx_glr_example_inspect_predictions_003.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 4 minutes 15.369 seconds)


.. _sphx_glr_download_examples_40_advanced_example_inspect_predictions.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/automl/auto-sklearn/development?urlpath=lab/tree/notebooks/examples/40_advanced/example_inspect_predictions.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: example_inspect_predictions.py <example_inspect_predictions.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: example_inspect_predictions.ipynb <example_inspect_predictions.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
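
As a supplement to the permutation importance computed with ``permutation_importance`` in this example, the underlying idea can be sketched without any library: shuffle one feature column, re-score the model, and record how much the score drops. The following is a minimal sketch under stated assumptions, not scikit-learn's implementation. The toy model ``predict``, the helpers ``accuracy`` and ``permutation_importance_sketch``, and the synthetic data are all hypothetical names introduced for illustration.

```python
import random


def predict(row):
    # Hypothetical toy "model": uses feature 0 only and ignores feature 1.
    return 1 if row[0] > 0.5 else 0


def accuracy(X, y):
    # Fraction of rows the toy model labels correctly.
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance_sketch(X, y, n_repeats=10, seed=0):
    # For each feature: shuffle its column, re-score, and average the score drop.
    rng = random.Random(seed)
    baseline = accuracy(X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by its shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic data (an assumption for this sketch): the label depends on feature 0 only.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]

importances = permutation_importance_sketch(X, y)
print(importances)
```

Because the toy model never reads feature 1, shuffling that column cannot change its score, so its importance is zero, while shuffling feature 0 breaks the link to the label and produces a clear drop. This mirrors how a near-zero value such as the one reported for ``gyro_y`` indicates that the fitted model barely relies on that feature.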