ADD #900: Make data preprocessing more configurable, for example by allowing it to be disabled completely.
ADD #1128: Adds new functionality to retrieve data for an accuracy over time plot from Auto-sklearn without additional code.
FIX #1149: Stops Auto-sklearn from printing weird warnings (Exception ignored in […]) at shutdown.
FIX #1169: Fixes a bug which made cross-validation and multi-output regression incompatible.
FIX #1170: Make all preprocessing techniques deterministic.
FIX #1190: Fixes a bug which could make predictive probabilities contain too few classes in case one class was only present a single time.
FIX #1209: Pass random states to pipeline objects.
FIX #1204: Add support for sparse data in Auto-sklearn 2.0.
FIX #1210: Add support for sparse y labels.
FIX #1245: Fixes a bug which could result in Auto-sklearn crashing in case a class was present only once.
DOC #532,#1242: Simplify installation instructions.
DOC #1144: Document installation via conda
DOC #1195,#1201,#1214: Fix a few typos and links. Make some http links https links.
DOC #1200: Fixes variable name in an example.
DOC #1229: Improve code formatting in the documentation.
DOC #1235: Improve docker startup command so it also works on Windows.
MAINT #1198: Use latest Ubuntu LTS (20.04) for github actions.
MAINT #1231: The command make linkcheck no longer builds the documentation, speeding up link-checking.
MAINT #1233: Enable regression testing with 3 classification and 3 regression datasets on github actions.
MAINT #1239: Increase the timeout for github actions to 60 minutes.
Francisco Rivera Valverde
Eddie Bergman <email@example.com>
ADD #1100: Provide access to the callbacks of SMAC.
ADD #1185: New leaderboard functionality to visualize models
FIX #1133: Refer to the correct attribute in an error message.
FIX #1154: Allow running Auto-sklearn on a 32-bit system.
MAINT #924: Instead of passing classes for the resampling strategy, one now has to pass objects.
MAINT #1108: Limit the number of threads used by numpy and/or scikit-learn via threadpoolctl.
MAINT #1135: Simplify internal workflow of pandas handling. This results in pandas data being passed directly to scikit-learn models instead of being internally converted into a numpy array. However, this should impact neither the behavior nor the performance of Auto-sklearn.
MAINT #1157: Drop support for Python 3.6, enable support for Python 3.9.
MAINT #1159: Remove the output directory argument to the classifier and regressor. Despite the name, the output directory was not used and was a leftover from participating in the AutoML challenges.
MAINT #1187: Bump required SMAC version to at least 0.14.
DOC #1109: Add an FAQ.
DOC #1126: Add new examples on how to use scikit-learn’s inspection module.
DOC #1136: Add a new example on how to perform multi-output regression.
DOC #1152: Enable link checking when building the documentation.
DOC #1158: New example on how to configure the logger for Auto-sklearn.
DOC #1165: Improve the readme page.
Francisco Rivera Valverde
MAINT #1183: Introduce an upper bound on the dask version to retain compatibility with SMAC3.
ADD #1178: Reduce precision if dataset is too large for given memory limit.
ADD #1179: Improve Auto-sklearn 2.0 meta-data by providing new meta-data for the metrics roc_auc and logloss.
DOC: Fix reference to arXiv paper
MAINT #1134,#1142,#1143: Improvements to the stale bot - the stale bot now marks issues labeled with feedback required as stale if there is nothing happening for 30 days. After another 7 days it then closes the issue.
MAINT: Added a new issue template for questions.
MAINT #1168: Upper-bound scipy to 1.6.3 as 1.7.0 is incompatible with SMAC.
MAINT #1173: Update the license files to be recognized by github.
Francisco Rivera Valverde
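The idea behind ADD #1178 can be illustrated with a small sketch. It estimates the size of a dense dataset from its shape and the byte width of each value, then steps down to narrower float types until the estimate fits a memory budget. The function name, the float widths considered, and the way the budget is interpreted are assumptions made for illustration; they are not taken from Auto-sklearn's implementation.

```python
from typing import Optional

# Bytes per value for the float types considered, widest first.
# This list is an assumption made for the sketch.
FLOAT_WIDTHS = [("float64", 8), ("float32", 4), ("float16", 2)]


def choose_precision(n_rows: int, n_cols: int, memory_limit_mb: float) -> Optional[str]:
    """Return the widest float type whose dense array fits the limit, or None."""
    limit_bytes = memory_limit_mb * 1024 * 1024
    for name, width in FLOAT_WIDTHS:
        # Size of a dense n_rows x n_cols array stored at this precision.
        if n_rows * n_cols * width <= limit_bytes:
            return name
    return None


if __name__ == "__main__":
    # One million rows with 100 columns under a 500 MB budget.
    print(choose_precision(1_000_000, 100, 500))
```

A return value of None signals that even the narrowest type considered does not fit, at which point other measures such as subsampling would be needed.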
ADD #886: Provide new function which allows fitting only a single configuration.
DOC #1070: Clarify example on how successive halving and Bayesian optimization play together.
DOC #1112: Fix typo.
DOC #1122: Add Python 3 to the installation command for Ubuntu.
FIX #1114: Fix a bug which made printing dummy models fail.
FIX #1117: Fix a bug which previously made memory_limit=None fail.
FIX #1121: Fix an edge case which could decrease performance in Auto-sklearn 2.0 when using cross-validation with iterative fitting.
FIX #1123: Fix a bug in autosklearn.metrics.calculate_score for metrics/scores which need to be minimized, where the function previously returned the loss and not the score.
FIX #1115/#1124: Fix a bug which would prevent Auto-sklearn from computing meta-features in the multiprocessing case.
Francisco Rivera Valverde
Lucas Nildaimon dos Santos Silva
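The distinction between score and loss that FIX #1123 refers to can be sketched as follows. The class below is hypothetical and only illustrates one common convention: a sign turns every metric into a "greater is better" score, and the loss is the distance from the best achievable score. It is not the code of autosklearn.metrics.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


@dataclass
class SketchScorer:
    """Hypothetical scorer: sign is +1 for metrics to maximize, -1 for metrics to minimize."""

    metric: Callable[[Sequence[float], Sequence[float]], float]
    optimum: float  # best value of the raw metric
    sign: int

    def score(self, y_true: Sequence[float], y_pred: Sequence[float]) -> float:
        # After applying the sign, greater is always better.
        return self.sign * self.metric(y_true, y_pred)

    def loss(self, y_true: Sequence[float], y_pred: Sequence[float]) -> float:
        # Distance to the best achievable score; smaller is better.
        return self.sign * self.optimum - self.score(y_true, y_pred)


if __name__ == "__main__":
    mse_scorer = SketchScorer(metric=mean_squared_error, optimum=0.0, sign=-1)
    y_true, y_pred = [0.0, 2.0], [1.0, 1.0]
    print(mse_scorer.score(y_true, y_pred), mse_scorer.loss(y_true, y_pred))
```

For a metric that must be minimized, the score is the negated metric value while the loss equals the metric value itself, so returning one in place of the other flips the sign.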
numpy as installation requirements.
ADD #660: Enable scikit-learn’s power transformation for input features.
MAINT: Bump the pyrfr minimum dependency to 0.8.1 to automatically download wheels from pypi if possible.
FIX #732: Add a missing size check into the GMEANS clustering used for the NeurIPS 2015 paper.
FIX #1050: Add missing arguments to the
FIX #1072: Fixes a bug where the AutoSklearn2Classifier could not be created due to trying to cache to the wrong directory.
FIX #1061: Fixes a bug where the model could not be printed in a jupyter notebook.
FIX #1075: Fixes a bug where the ensemble builder would wrongly prune good models for loss functions (i.e. functions that need to be minimized such as
FIX #1079: Fixes a bug where AutoMLRegressor.cv_results could rank results in opposite order for loss functions (i.e. functions that need to be minimized such as
FIX: Fixes a bug in offline meta-data generation that could lead to a deadlock.
MAINT #1076: Uses the correct multiprocessing context for computing meta-features
MAINT: Cleanup readme and main directory
ADD #1045: New example demonstrating how to log multiple metrics during a run of Auto-sklearn.
DOC #1052: Add links to mybinder
DOC #1059: Improved the example on manually starting workers for Auto-sklearn.
FIX #1046: Add the final result of the ensemble builder to the ensemble builder trajectory.
MAINT: Two log outputs of level warning about metadata were reduced to the info log level as they are not actionable for the user.
MAINT #1062: Use threads for local dask workers and forkserver to start subprocesses to reduce overhead.
MAINT #1053: Remove the restriction to guard single-core Auto-sklearn by __name__ == "__main__" again.
ADD: A new heuristic which gives a warning and subsamples the data if it is too large for the given
ADD #1024: Tune scikit-learn’s
MAINT #1017: Improve the logging server introduced in release 0.12.0.
MAINT #1024: Move to scikit-learn 0.24.X.
MAINT #1038: Use new datasets for regression and classification and also update the metadata used for Auto-sklearn 1.0.
MAINT #1040: Minor speed improvements in the ensemble selection algorithm.
BREAKING: Auto-sklearn must now be guarded by __name__ == "__main__" due to the use of the
ADD #1026: Adds improved meta-data for Auto-sklearn 2.0 which results in strongly improved performance.
MAINT #984 and #1008: Move to scikit-learn 0.23.X
MAINT #1004: Move from travis-ci to github actions.
MAINT 8b67af6: Drop the requirement on the lockfile package.
FIX #990: Fixes a bug that made Auto-sklearn fail if there are missing values in a pandas DataFrame.
FIX #1007, #1012 and #1014: Log multiprocessing output via a new log server. Remove several potential deadlocks related to the joint use of multi-processing, multi-threading and logging.
FIX #989: Fixes a bug where y was not passed to all data preprocessors which made 3rd party category encoders fail.
FIX #1001: Fixes a bug which could make Auto-sklearn fail at random.
MAINT #1000: Introduce a minimal version for
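The general pattern behind the log server mentioned in FIX #1007, #1012 and #1014 is that workers do not write log output themselves but hand their records to a single receiver. A minimal sketch of that pattern, using only the standard library's QueueHandler and QueueListener and threads in place of processes, could look like this. The names ListHandler and run_workers are invented for the example, and the sketch makes no claim about how Auto-sklearn's log server is implemented.

```python
import logging
import logging.handlers
import queue
import threading


class ListHandler(logging.Handler):
    """Stores formatted messages in memory so the result is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))


def run_workers(n_workers=3):
    """Let several workers log through one queue and return what the listener saw."""
    log_queue = queue.Queue()
    sink = ListHandler()
    # The listener plays the role of the central log receiver.
    listener = logging.handlers.QueueListener(log_queue, sink)
    listener.start()

    def worker(worker_id):
        logger = logging.getLogger(f"sketch.worker.{worker_id}")
        logger.setLevel(logging.INFO)
        logger.propagate = False  # keep records out of the root logger
        handler = logging.handlers.QueueHandler(log_queue)
        logger.addHandler(handler)
        try:
            logger.info("worker %d finished", worker_id)
        finally:
            logger.removeHandler(handler)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # stop() processes the records still in the queue before returning.
    listener.stop()
    return sink.messages


if __name__ == "__main__":
    for message in sorted(run_workers()):
        print(message)
```

Because only the listener touches the output handler, the workers never compete for it, which is the property that makes this pattern attractive when logging is combined with parallel execution.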
ADD #992: Move ensemble building from being a separate process to a job submitted to the dask cluster. This allows for better control of the memory used in multiprocessing settings.
FIX #905: Make
FIX #970: Fix a bug where Auto-sklearn would fail if categorical features are passed as a Pandas Dataframe.
MAINT #772: Improve error message in case of dummy prediction failure.
MAINT #948: Finally use Pandas >= 1.0.
MAINT #973: Improve meta-data by running meta-data generation for more time and separately for important metrics.
MAINT #997: Improve memory handling in the ensemble building process. This allows building ensembles for larger datasets.
ADD #325: Allow to separately optimize metrics for metadata generation.
ADD #946: New dask backend for parallel Auto-sklearn.
BREAKING #947: Drop Python3.5 support.
BREAKING #946: Remove shared model mode for parallel Auto-sklearn.
FIX #351: No longer pass un-picklable logger instances to the target function.
FIX #840: Fixes a bug which prevented computing metadata for regression datasets. Also adds a unit test for regression metadata computation.
FIX #897: Allow custom splitters to be used with multi-output regression.
FIX #951: Fixes a lot of bugs in the regression pipeline that caused bad performance for regression datasets.
FIX #953: Re-add liac-arff as a dependency.
FIX #956: Fixes a bug which could cause Auto-sklearn not to find a model on disk which is part of the ensemble.
FIX #961: Fixes a bug which caused Auto-sklearn to load bad meta-data for metrics which cannot be computed on multiclass datasets (especially ROC_AUC).
DOC #498: Improve the example on resampling strategies by showing how to pass scikit-learn’s splitter objects to Auto-sklearn.
DOC #670: Demonstrate how to give access to training accuracy.
DOC #872: Improve an example on how to obtain the best model.
DOC #940: Improve documentation of the docker image.
MAINT: Improve the docker file by setting environment variables that restrict BLAS and OMP to only use a single core.
MAINT #949: Replace pip by pip3 in the installation guidelines.
MAINT #280, #535, #956: Update meta-data and include regression meta-data again.
ADD #157,#889: Improve handling of pandas dataframes, including the possibility to use pandas’ categorical column type.
ADD #375: New SelectRates feature preprocessing component for regression.
ADD #891: Improve the robustness of Auto-sklearn by using the single best model if no ensemble is found.
ADD #902: Track performance of the ensemble over time.
ADD #914: Add an example on using pandas dataframes as input to Auto-sklearn.
ADD #919: Add an example for multilabel classification.
MAINT #909: Fix broken links in the documentation.
MAINT #907,#911: Add initial support for mypy.
MAINT #881,#927: Automatically build docker images on pushes to the master and development branch and also push them to dockerhub and the github docker registry.
MAINT #918: Remove old dependencies from requirements.txt.
MAINT #931: Add information about the host system and installed packages to the log file.
MAINT #933: Reduce the number of warnings raised when building the documentation by sphinx.
MAINT #936: Completely restructure the examples section.
FIX #558: Provide better error message when the ensemble process fails due to a memory issue.
FIX #901: Allow custom resampling strategies again (was broken due to an upgrade of SMAC).
FIX #916: Fixes a bug where the data preprocessing configurations were ignored.
FIX #925: make internal data preprocessing objects clonable.
ADD #803: multi-output regression
ADD #893: new Auto-sklearn mode Auto-sklearn 2.0
ADD #764: support for automatic per_run_time_limit selection
ADD #864: add the possibility to predict with cross-validation
ADD #874: support to limit the disk space consumption
MAINT #862: improved documentation and render examples in web page
MAINT #869: removal of competition data manager support
MAINT #870: memory improvements when building ensemble
MAINT #882: memory improvements when performing ensemble selection
FIX #701: scaling factors for metafeatures should not be learned using test data
FIX #715: allow unlimited ML memory
FIX #771: improved worst possible result calculation
FIX #843: default value for SelectPercentileRegression
FIX #852: clip probabilities within [0, 1]
FIX #854: improved tmp file naming
FIX #863: SMAC exceptions also registered in log file
FIX #876: allow Auto-sklearn model to be cloned
FIX #879: allow 1-D binary predictions
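Clipping as mentioned in FIX #852 can be sketched in a few lines. The helper name is made up for this example, and the sketch only clamps each value; it does not renormalize rows, and it does not reflect where or how Auto-sklearn applies the clipping.

```python
from typing import List


def clip_probabilities(rows: List[List[float]]) -> List[List[float]]:
    """Clamp every entry to the closed interval [0, 1]."""
    return [[min(1.0, max(0.0, p)) for p in row] for row in rows]


if __name__ == "__main__":
    # Values slightly outside the valid range, for instance after averaging.
    print(clip_probabilities([[-0.1, 1.1], [0.3, 0.7]]))
```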
ADD #785: user control to reduce the hard drive memory required to store ensembles
ADD #794: iterative fit for gradient boosting
ADD #795: add successive halving evaluation strategy
ADD #814: new sklearn.metrics.balanced_accuracy_score instead of custom metric
ADD #815: new experimental evaluation mode called iterative_cv
MAINT #774: move from scikit-learn 0.21.X to 0.22.X
MAINT #791: move from smac 0.8 to 0.12
MAINT #822: make autosklearn modules PEP8 compliant
FIX #733: fix for n_jobs=-1
FIX #739: remove unnecessary warning
FIX #769: fixed error in calculation of meta features
FIX #778: support for python 3.8
FIX #781: support for pandas 1.x
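For readers unfamiliar with the successive halving strategy named in ADD #795, the generic algorithm can be sketched as below. The function signature, the default budget, and the halving factor eta are assumptions chosen for the illustration; this is not the implementation used by Auto-sklearn or SMAC.

```python
from typing import Callable, List, Sequence, TypeVar

C = TypeVar("C")


def successive_halving(
    configs: Sequence[C],
    evaluate: Callable[[C, int], float],
    min_budget: int = 1,
    eta: int = 2,
) -> C:
    """Return the surviving configuration; evaluate(config, budget) returns a loss."""
    if not configs:
        raise ValueError("at least one configuration is required")
    survivors: List[C] = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Evaluate every survivor on the current budget and rank by loss.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        # Keep the best 1/eta fraction (at least one), then grow the budget.
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]


if __name__ == "__main__":
    # Toy objective: the loss is the distance to 3 and ignores the budget.
    print(successive_halving([1, 5, 9, 3], lambda c, b: abs(c - 3)))
```

Cheap, low-budget evaluations discard most candidates early, so the expensive high-budget evaluations are spent only on the few configurations that remain.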
MAINT: move from scikit-learn 0.19.X to 0.21.X
MAINT #688: allow for pyrfr version 0.8.X
FIX #680: Remove unnecessary print statement
FIX #600: Remove unnecessary warning
Jin Woo Ahn
FIX #669: Correctly handle arguments to the
FIX #667: Auto-sklearn works with numpy 1.16.3 again.
ADD #676: Allow brackets [ ] inside the temporary and output directory paths.
ADD #424: (Experimental) scripts to reproduce the results from the original Auto-sklearn paper.
Jin Woo Ahn
ADD #650: Auto-sklearn will immediately stop if prediction using scikit-learn’s dummy predictor fails.
ADD #537: Auto-sklearn will no longer start for time limits less than 30 seconds.
FIX #655: Fixes an issue where predictions using models from parallel Auto-sklearn runs could be wrong.
FIX #648: Fixes an issue with custom meta-data directories.
FIX #626: Fixes an issue where losses were not minimized, but maximized.
MAINT #646: No longer restrict the numpy version to be less than 1.14.5.
Jin Woo Ahn
ADD #593: Auto-sklearn supports the n_jobs argument for parallel computing on a single machine.
DOC #618: Added links to several system requirements.
Fixes #611: Improved installation from pip.
TEST #614: Test installation with clean Ubuntu on travis-ci.
MAINT: Fixed broken link and typo in the documentation.
Pradeep Reddy Raamana
Fixes #538: Remove rounding errors when giving a training set fraction for holdout.
Fixes #558: Ensemble script now uses less memory and the memory limit can be given to Auto-sklearn.
Fixes #585: Auto-sklearn’s ensemble script produced wrong results when called directly (and not via one of Auto-sklearn’s estimator classes).
Fixes an error in the ensemble script which made it non-deterministic.
MAINT #569: Rename hyperparameter to have a different name than a scikit-learn hyperparameter with different meaning.
MAINT #592: backwards compatible requirements.txt
MAINT #588: Fix SMAC version to 0.8.0
MAINT: remove dependency on the six package
MAINT: upgrade to XGBoost 0.80
Jin Woo Ahn
Added documentation on how to extend Auto-sklearn with custom classifier, regressor, and preprocessor.
Auto-sklearn now requires numpy version between 1.9.0 and 1.14.5, due to higher versions causing travis failure.
Examples now use sklearn.datasets.load_digits() to reduce memory usage for travis build.
Fixes future warnings on non-tuple sequence for indexing.
Fixes #566: ensembles are now sorted correctly.
Fixes #293: Auto-sklearn checks if appropriate target type was given for classification and regression before call to
Travis-ci now runs flake8 to enforce the pep8 style guide, and deployment now uses travis-ci instead of circle-ci.
Jin Woo Ahn
Fixes #409: fixes predict_proba to no longer raise an AttributeError.
Improved documentation of the parallel example.
Classifiers are now tested to be idempotent as required by scikit-learn.
Fixes the usage of the shrinkage parameter in LDA.
Fixes #410 and changes the SGD hyperparameters
Fixes #425 which caused the non-linear support vector machine to always crash on OSX.
Implements #149: it is now possible to pass a custom cross-validation split following scikit-learn’s
It is now possible to decide whether or not to shuffle the data in Auto-sklearn by passing a bool shuffle in the dictionary of
Added functionality to track the test performance over time.
Re-factored the ensemble building to be faster, read less data from the hard drive and perform random tie breaking in case of equally well-performing models.
Implements #438: To be consistent with the output of SMAC (which minimizes the loss of a target function), the output of the ensemble builder is now also the output of a minimization problem.
Implements #271: XGBoost is available again, even configuring the new dropout functionality.
New documentation section Inspecting the results.
Fixes #444: Auto-sklearn now only loads models for refit which are actually relevant for the ensemble.
Adds an operating system check at import and installation time to make sure not to accidentally run on a Windows machine.
New examples gallery using sphinx gallery: Examples
Safeguard Auto-sklearn against deleting directories it did not create (Issue #317).
Jesper van Engelen
Jin Woo Ahn
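The release above mentions random tie breaking between equally well-performing models and an ensemble builder that reports a loss to be minimized. Both ideas can be shown in a sketch of greedy ensemble selection with replacement. The function names, the use of mean squared error as the loss, and the seeding scheme are assumptions made for illustration; the sketch is not the ensemble builder's code.

```python
import random
from typing import List, Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def ensemble_selection(
    predictions: List[List[float]],
    y_true: List[float],
    ensemble_size: int,
    seed: int = 0,
) -> List[int]:
    """Greedily pick model indices (with replacement) that minimize the ensemble loss."""
    rng = random.Random(seed)
    chosen: List[int] = []
    running_sum = [0.0] * len(y_true)
    for _ in range(ensemble_size):
        losses = []
        for preds in predictions:
            # Loss of the averaged ensemble if this model were added next.
            candidate = [(s + p) / (len(chosen) + 1) for s, p in zip(running_sum, preds)]
            losses.append(mse(y_true, candidate))
        best = min(losses)
        # Random tie breaking among equally good candidates.
        ties = [i for i, loss in enumerate(losses) if loss == best]
        pick = rng.choice(ties)
        chosen.append(pick)
        running_sum = [s + p for s, p in zip(running_sum, predictions[pick])]
    return chosen


if __name__ == "__main__":
    y = [1.0, 2.0, 3.0]
    # Models 0 and 2 are equally good; model 1 is poor.
    preds = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [1.0, 2.0, 3.0]]
    print(ensemble_selection(preds, y, ensemble_size=3))
```

Without the random choice, the first of several tied models would always win, so identical-looking candidates later in the list would never enter the ensemble.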
Upgrade to scikit-learn 0.19.1.
Do not use the DummyRegressor as part of an ensemble. Fixes #140.
Fixes #295 by loading the data in the subprocess instead of the main process.
Fixes #326: refitting could result in a type error. This is now fixed by better type checking in the classification components.
Updated search space for
Removal of constant features is now a part of the pipeline.
Allow passing an SMBO object into the
Jesper van Engelen
Allows the usage of scikit-learn 0.18.2.
Upgrade to latest SMAC version (0.6.0) and latest random forest version (
Added a Dockerfile.
Added the possibility to change the size of the holdout set when using holdout resampling strategy.
Fixed a bug in QDA’s hyperparameters.
Typo fixes in print statements.
New method to retrieve the models used in the final ensemble.
Young Ryul Bae
auto-sklearn supports custom metrics and all metrics included in scikit-learn. Different metrics can now be passed to the fit() method of estimator objects, for example
Upgrade to scikit-learn 0.18.1.
Drop XGBoost as the latest release (0.6a2) does not work when spawned by the pynisher.
auto-sklearn can use multiprocessing in calls to predict_proba. By Laurent Sorber.
There are no release notes for auto-sklearn prior to version 0.2.0.
Jost Tobias Springenberg
Timothy J Laurent