.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/40_advanced/example_custom_configuration_space.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_examples_40_advanced_example_custom_configuration_space.py>`
        to download the full example code or to run this example in your browser via Binder

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_40_advanced_example_custom_configuration_space.py:


======================================================
Tabular Classification with Custom Configuration Space
======================================================

The following example shows how to adjust the configuration space of
the search. Currently, two kinds of changes can be made to the space:

1. Adjust individual hyperparameters in the pipeline.
2. Include or exclude components:

   a) include: Dictionary containing components to include. Each key is
      a node name and each value is an Iterable of the names of the
      components to include. Only these components will be present in
      the search space.
   b) exclude: Dictionary containing components to exclude. Each key is
      a node name and each value is an Iterable of the names of the
      components to exclude. All components except these will be
      present in the search space.

.. GENERATED FROM PYTHON SOURCE LINES 20-61

.. code-block:: default

    import os
    import tempfile as tmp
    import warnings

    # Keep joblib's temporary files in the system temp folder and restrict
    # the numerical backends to a single thread each.
    os.environ['JOBLIB_TEMP_FOLDER'] = tmp.gettempdir()
    os.environ['OMP_NUM_THREADS'] = '1'
    os.environ['OPENBLAS_NUM_THREADS'] = '1'
    os.environ['MKL_NUM_THREADS'] = '1'

    warnings.simplefilter(action='ignore', category=UserWarning)
    warnings.simplefilter(action='ignore', category=FutureWarning)

    import sklearn.datasets
    import sklearn.model_selection

    from autoPyTorch.api.tabular_classification import TabularClassificationTask
    from autoPyTorch.utils.hyperparameter_search_space_update import HyperparameterSearchSpaceUpdates


    def get_search_space_updates():
        """
        Search space updates to the task can be added using HyperparameterSearchSpaceUpdates
        Returns:
            HyperparameterSearchSpaceUpdates
        """
        updates = HyperparameterSearchSpaceUpdates()
        # Each update names a pipeline node, one of its hyperparameters,
        # the new value range and the new default value.
        updates.append(node_name="data_loader",
                       hyperparameter="batch_size",
                       value_range=[16, 512],
                       default_value=32)
        updates.append(node_name="lr_scheduler",
                       hyperparameter="CosineAnnealingLR:T_max",
                       value_range=[50, 60],
                       default_value=55)
        updates.append(node_name='network_backbone',
                       hyperparameter='ResNetBackbone:dropout',
                       value_range=[0, 0.5],
                       default_value=0.2)
        return updates


.. GENERATED FROM PYTHON SOURCE LINES 62-64

Data Loading
============

.. GENERATED FROM PYTHON SOURCE LINES 64-71

.. code-block:: default

    X, y = sklearn.datasets.fetch_openml(data_id=40981, return_X_y=True, as_frame=True)
    X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
        X,
        y,
        random_state=1,
    )


.. GENERATED FROM PYTHON SOURCE LINES 72-74

Build and fit a classifier with include components
==================================================

.. GENERATED FROM PYTHON SOURCE LINES 74-80

.. code-block:: default

    api = TabularClassificationTask(
        search_space_updates=get_search_space_updates(),
        include_components={'network_backbone': ['MLPBackbone', 'ResNetBackbone'],
                            'encoder': ['OneHotEncoder']}
    )


.. GENERATED FROM PYTHON SOURCE LINES 81-83

Search for an ensemble of machine learning algorithms
=====================================================

.. GENERATED FROM PYTHON SOURCE LINES 83-93

.. code-block:: default

    api.search(
        X_train=X_train.copy(),
        y_train=y_train.copy(),
        X_test=X_test.copy(),
        y_test=y_test.copy(),
        optimize_metric='accuracy',
        total_walltime_limit=150,
        func_eval_time_limit_secs=30
    )


.. GENERATED FROM PYTHON SOURCE LINES 94-96

Print the final ensemble performance
====================================

.. GENERATED FROM PYTHON SOURCE LINES 96-104

.. code-block:: default

    y_pred = api.predict(X_test)
    score = api.score(y_pred, y_test)
    print(score)
    print(api.show_models())

    # Print statistics from search
    print(api.sprint_statistics())


.. rst-class:: sphx-glr-script-out

.. code-block:: none

    {'accuracy': 0.8497109826589595}
    |    | Preprocessing                                                                                    | Estimator                                                 |   Weight |
    |---:|:-------------------------------------------------------------------------------------------------|:----------------------------------------------------------|---------:|
    |  0 | None                                                                                             | LGBMLearner                                               |     0.24 |
    |  1 | SimpleImputer,Variance Threshold,NoCoalescer,OneHotEncoder,NoScaler,NoFeaturePreprocessing       | no embedding,MLPBackbone,FullyConnectedHead,nn.Sequential |     0.18 |
    |  2 | None                                                                                             | RFLearner                                                 |     0.16 |
    |  3 | None                                                                                             | ETLearner                                                 |     0.12 |
    |  4 | SimpleImputer,Variance Threshold,NoCoalescer,OneHotEncoder,StandardScaler,NoFeaturePreprocessing | no embedding,MLPBackbone,FullyConnectedHead,nn.Sequential |      0.1 |
    |  5 | None                                                                                             | KNNLearner                                                |     0.06 |
    |  6 | SimpleImputer,Variance Threshold,NoCoalescer,OneHotEncoder,StandardScaler,LinearSVC Preprocessor | no embedding,MLPBackbone,FullyConnectedHead,nn.Sequential |     0.04 |
    |  7 | SimpleImputer,Variance Threshold,NoCoalescer,OneHotEncoder,StandardScaler,NoFeaturePreprocessing | no embedding,MLPBackbone,FullyConnectedHead,nn.Sequential |     0.04 |
    |  8 | SimpleImputer,Variance Threshold,MinorityCoalescer,OneHotEncoder,NoScaler,PCA                    | no embedding,MLPBackbone,FullyConnectedHead,nn.Sequential |     0.02 |
    |  9 | None                                                                                             | SVMLearner                                                |     0.02 |
    | 10 | SimpleImputer,Variance Threshold,NoCoalescer,OneHotEncoder,StandardScaler,NoFeaturePreprocessing | no embedding,MLPBackbone,FullyConnectedHead,nn.Sequential |     0.02 |
    autoPyTorch results:
            Dataset name: 8872a53e-22f5-11ed-8835-b1fa420cf160
            Optimisation Metric: accuracy
            Best validation score: 0.8596491228070176
            Number of target algorithm runs: 18
            Number of successful target algorithm runs: 14
            Number of crashed target algorithm runs: 3
            Number of target algorithms that exceeded the time limit: 1
            Number of target algorithms that exceeded the memory limit: 0


.. GENERATED FROM PYTHON SOURCE LINES 105-107

Build and fit a classifier with exclude components
==================================================

.. GENERATED FROM PYTHON SOURCE LINES 107-113

.. code-block:: default

    api = TabularClassificationTask(
        search_space_updates=get_search_space_updates(),
        exclude_components={'network_backbone': ['MLPBackbone'],
                            'encoder': ['OneHotEncoder']}
    )


.. GENERATED FROM PYTHON SOURCE LINES 114-116

Search for an ensemble of machine learning algorithms
=====================================================

.. GENERATED FROM PYTHON SOURCE LINES 116-126

.. code-block:: default

    api.search(
        X_train=X_train,
        y_train=y_train,
        X_test=X_test.copy(),
        y_test=y_test.copy(),
        optimize_metric='accuracy',
        total_walltime_limit=150,
        func_eval_time_limit_secs=30
    )


.. GENERATED FROM PYTHON SOURCE LINES 127-129

Print the final ensemble performance
====================================

.. GENERATED FROM PYTHON SOURCE LINES 129-136

.. code-block:: default

    y_pred = api.predict(X_test)
    score = api.score(y_pred, y_test)
    print(score)
    print(api.show_models())

    # Print statistics from search
    print(api.sprint_statistics())


.. rst-class:: sphx-glr-script-out

.. code-block:: none

    {'accuracy': 0.8728323699421965}
    |    | Preprocessing                                                                                | Estimator                                                       |   Weight |
    |---:|:---------------------------------------------------------------------------------------------|:----------------------------------------------------------------|---------:|
    |  0 | None                                                                                         | LGBMLearner                                                     |     0.36 |
    |  1 | None                                                                                         | RFLearner                                                       |     0.26 |
    |  2 | None                                                                                         | ETLearner                                                       |     0.14 |
    |  3 | SimpleImputer,Variance Threshold,NoCoalescer,NoEncoder,StandardScaler,NoFeaturePreprocessing | no embedding,ShapedMLPBackbone,FullyConnectedHead,nn.Sequential |      0.1 |
    |  4 | None                                                                                         | SVMLearner                                                      |     0.08 |
    |  5 | SimpleImputer,Variance Threshold,NoCoalescer,NoEncoder,Normalizer,KernelPCA                  | no embedding,ShapedMLPBackbone,FullyConnectedHead,nn.Sequential |     0.02 |
    |  6 | None                                                                                         | KNNLearner                                                      |     0.02 |
    |  7 | SimpleImputer,Variance Threshold,NoCoalescer,NoEncoder,StandardScaler,NoFeaturePreprocessing | no embedding,ShapedMLPBackbone,FullyConnectedHead,nn.Sequential |     0.02 |
    autoPyTorch results:
            Dataset name: f2793923-22f5-11ed-8835-b1fa420cf160
            Optimisation Metric: accuracy
            Best validation score: 0.8596491228070176
            Number of target algorithm runs: 20
            Number of successful target algorithm runs: 14
            Number of crashed target algorithm runs: 5
            Number of target algorithms that exceeded the time limit: 1
            Number of target algorithms that exceeded the memory limit: 0


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 5 minutes  52.526 seconds)


.. _sphx_glr_download_examples_40_advanced_example_custom_configuration_space.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/automl/Auto-PyTorch/development?urlpath=lab/tree/notebooks/examples/40_advanced/example_custom_configuration_space.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: example_custom_configuration_space.py <example_custom_configuration_space.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: example_custom_configuration_space.ipynb <example_custom_configuration_space.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
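
The include and exclude semantics described in the introduction can be illustrated with a small, library-independent sketch. It is only an illustration of the filtering rule ("only these" versus "all except these"); ``filter_components`` and the ``AVAILABLE`` inventory are hypothetical names introduced here and are not part of autoPyTorch, and the component lists are not exhaustive. How autoPyTorch itself treats a node that appears in both dictionaries is not covered by this sketch.

```python
from typing import Dict, Iterable, List, Optional


def filter_components(
    available: Dict[str, List[str]],
    include: Optional[Dict[str, Iterable[str]]] = None,
    exclude: Optional[Dict[str, Iterable[str]]] = None,
) -> Dict[str, List[str]]:
    """Hypothetical helper: return the components left in each node.

    include: node name -> names to keep (only these remain).
    exclude: node name -> names to drop (all except these remain).
    """
    include = include or {}
    exclude = exclude or {}
    result = {}
    for node, components in available.items():
        kept = list(components)
        if node in include:
            allowed = set(include[node])
            kept = [name for name in kept if name in allowed]
        if node in exclude:
            blocked = set(exclude[node])
            kept = [name for name in kept if name not in blocked]
        result[node] = kept
    return result


# Assumed, non-exhaustive inventory; the names are taken from the
# example output on this page.
AVAILABLE = {
    "network_backbone": ["MLPBackbone", "ResNetBackbone", "ShapedMLPBackbone"],
    "encoder": ["NoEncoder", "OneHotEncoder"],
}

included = filter_components(
    AVAILABLE,
    include={"network_backbone": ["MLPBackbone", "ResNetBackbone"],
             "encoder": ["OneHotEncoder"]},
)
excluded = filter_components(
    AVAILABLE,
    exclude={"network_backbone": ["MLPBackbone"],
             "encoder": ["OneHotEncoder"]},
)

print(included)
print(excluded)
```

With the same dictionaries as the two runs above, the include case leaves only the listed backbones and encoder, while the exclude case leaves every other entry of the assumed inventory. This matches the ensemble tables, where the second run contains no ``MLPBackbone`` or ``OneHotEncoder`` pipelines.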