ML-Ensemble
author: Sebastian Flennerhag
copyright: 2017-2018
licence: MIT
Graph handles for deep computational graphs and ready-made ensemble classes for ensemble networks. Ready-made classes are full Scikit-learn estimators and can be used in conjunction with any other standard estimator.
mlens.ensemble
Base ensemble classes
Sequential
- class mlens.ensemble.Sequential(name=None, verbose=False, stack=None, **kwargs)
Bases: mlens.parallel.base.BaseStacker
Container class for a stack of sequentially processed estimators.
The Sequential class stores all layers as an ordered dictionary and possesses a get_params method to appear as an estimator in the Scikit-learn API. This allows correct cloning and parameter updating.
Parameters: - stack (list, optional (default = None)) – list of estimators (i.e. layers) to build instance with.
- n_jobs (int (default = -1)) – Degree of concurrency. Set n_jobs = -1 for maximal parallelism and n_jobs = 1 for sequential processing.
- backend (str, (default="threading")) – the joblib backend to use (i.e. “multiprocessing” or “threading”).
- raise_on_exception (bool (default = False)) – raise error on soft exceptions. Otherwise issue warning.
- verbose (int or bool (default = False)) – level of verbosity.
  - verbose = 0: silent (same as verbose = False)
  - verbose = 1: messages at start and finish (same as verbose = True)
  - verbose = 2: messages for each layer
  - etc.
  If verbose >= 10, prints to sys.stderr, else sys.stdout.
- data
Ensemble data.
- fit(X, y=None, **kwargs)
Fit instance.
Iteratively fits each layer in the stack on the output of the preceding layer. The first layer is fitted on the input data.
Parameters: - X (array-like of shape = [n_samples, n_features]) – input matrix to be used for fitting and predicting.
- y (array-like of shape = [n_samples, ]) – training labels.
- **kwargs (optional) – optional arguments to processor
- fit_transform(X, y=None, **kwargs)
Fit instance and return cross-validated predictions.
Equivalent to Sequential().fit(X, y, return_preds=True).
Parameters: - X (array-like of shape = [n_samples, n_features]) – input matrix to be used for fitting and predicting.
- y (array-like of shape = [n_samples, ]) – training labels.
- **kwargs (optional) – optional arguments to processor
- predict(X, **kwargs)
Predict.
Parameters: - X (array-like of shape = [n_samples, n_features]) – input matrix to be used for prediction.
- **kwargs (optional) – optional keyword arguments.
Returns: X_pred – predictions from final layer.
Return type: array-like of shape = [n_samples, n_fitted_estimators]
- transform(X, **kwargs)
Predict using sub-learners as is done during the fit call.
Parameters: - X (array-like of shape = [n_samples, n_features]) – input matrix to be used for prediction.
- *args (optional) – optional arguments.
- **kwargs (optional) – optional keyword arguments.
Returns: X_pred – predictions from fit call to final layer.
Return type: array-like of shape = [n_test_samples, n_fitted_estimators]
BaseEnsemble
- class mlens.ensemble.BaseEnsemble(shuffle=False, random_state=None, scorer=None, verbose=False, layers=None, array_check=None, model_selection=False, sample_size=20, **kwargs)
Bases: mlens.externals.sklearn.base.BaseEstimator
BaseEnsemble class.
Core ensemble class methods used to add ensemble layers and manipulate parameters.
Parameters: - model_selection (bool (default=False)) – Whether to use the ensemble in model selection mode. If True, this will alter the transform method. When calling transform on new data, the ensemble will call predict, while calling transform with the training data reproduces predictions from the fit call. Hence the ensemble can be used as a pure transformer in a preprocessing pipeline passed to the Evaluator, as training folds are faithfully reproduced as during a fit call and test folds are transformed with the predict method.
- sample_size (int (default=20)) – size of training set sample ([min(sample_size, X.size[0]), min(X.size[1], sample_size)]).
- shuffle (bool (default=False)) – whether to shuffle input data during fit calls.
- random_state (int (default=None)) – random seed.
- scorer (obj, optional) – scorer function.
- verbose (bool, optional) – verbosity.
- add(estimators, indexer, preprocessing=None, **kwargs)
Method for adding a layer.
Parameters: - estimators (dict of lists or list of estimators, or Layer) – Pre-made layer or estimators to construct layer with. If preprocessing is None or list, estimators should be a list. The list can either contain estimator instances, named tuples of estimator instances, or a combination of both.
option_1 = [estimator_1, estimator_2]
option_2 = [("est-1", estimator_1), ("est-2", estimator_2)]
option_3 = [estimator_1, ("est-2", estimator_2)]
If different preprocessing pipelines are desired, a dictionary that maps estimators to preprocessing pipelines must be passed. The names of the estimator dictionary must correspond to the names of the preprocessing dictionary.
preprocessing_cases = {"case-1": [trans_1, trans_2], "case-2": [alt_trans_1, alt_trans_2]}
estimators = {"case-1": [est_a, est_b], "case-2": [est_c, est_d]}
The lists for each dictionary entry can be any of option_1, option_2 and option_3.
- indexer (instance or None (default = None)) – Indexer instance to use. Defaults to the layer class indexer with default settings. See mlens.base for details.
- preprocessing (dict of lists or list, optional (default = None)) – preprocessing pipelines for given layer. If the same preprocessing applies to all estimators, preprocessing should be a list of transformer instances. The list can contain the instances directly, named tuples of transformers, or a combination of both.
option_1 = [transformer_1, transformer_2]
option_2 = [("trans-1", transformer_1), ("trans-2", transformer_2)]
option_3 = [transformer_1, ("trans-2", transformer_2)]
If different preprocessing pipelines are desired, a dictionary that maps case names to preprocessing pipelines must be passed. The names of the preprocessing dictionary must correspond to the names of the estimator dictionary.
preprocessing_cases = {"case-1": [trans_1, trans_2], "case-2": [alt_trans_1, alt_trans_2]}
estimators = {"case-1": [est_a, est_b], "case-2": [est_c, est_d]}
The lists for each dictionary entry can be any of option_1, option_2 and option_3.
- **kwargs (optional) – keyword arguments to be passed onto the layer at instantiation.
Returns: self – Modified instance.
Return type: instance
- data
Fit data.
- fit(X, y=None, **kwargs)
Fit ensemble.
Parameters: - X (array-like of shape = [n_samples, n_features]) – input matrix to be used for fitting.
- y (array-like of shape = [n_samples, ] or None (default = None)) – output vector to train estimators on.
Returns: self – class instance with fitted estimators.
Return type: instance
- fit_transform(X, y, **kwargs)
Fit ensemble and return cross-validated predictions.
Equivalent to ensemble.fit(X, y).transform(X), but more efficient.
Parameters: - X (array-like of shape = [n_samples, n_features]) – input matrix to be used for fitting and predicting.
- y (array-like of shape = [n_samples, ]) – training labels.
- **kwargs (optional) – optional arguments to processor
Returns: pred – predictions for provided input array. If in model selection mode, return a tuple (X_trans, y_trans) where y_trans is either y, or a truncated version to match the samples in X_trans.
Return type: array-like or tuple, shape=[n_samples, n_features]
- model_selection
Turn model selection mode
- predict(X, **kwargs)
Predict with fitted ensemble.
Parameters: X (array-like, shape=[n_samples, n_features]) – input matrix to be used for prediction.
Returns: pred – predictions for provided input array.
Return type: array-like or tuple, shape=[n_samples, n_features]
- predict_proba(X, **kwargs)
Predict class probabilities with fitted ensemble.
Compatibility method for Scikit-learn. This method checks that the final layer has proba=True, then calls the regular predict method.
Parameters: X (array-like, shape=[n_samples, n_features]) – input matrix to be used for prediction.
Returns: pred – predictions for provided input array.
Return type: array-like or tuple, shape=[n_samples, n_features]
- remove(idx)
Remove a layer at a given position from stack.
Parameters: idx (int) – Position in stack. Indexing is 0-based.
Returns: self – Modified instance.
Return type: instance
- replace(idx, estimators, indexer, preprocessing=None, **kwargs)
Replace a layer.
Replace a layer in the stack with a new layer. See add() for full parameter documentation.
Parameters: - idx (int) – Position in stack of layer to replace. Indexing is 0-based.
- estimators (dict of lists or list of estimators, or Layer) – Pre-made layer or estimators to construct layer with.
- indexer (instance or None (default = None)) – Indexer instance to use. Defaults to the layer class indexer with default settings. See mlens.base for details.
- preprocessing (dict of lists or list, optional (default = None)) – preprocessing pipelines for given layer.
- **kwargs (optional) – keyword arguments to be passed onto the layer at instantiation.
Returns: self – Modified instance.
Return type: instance
- transform(X, y=None, **kwargs)
Transform with fitted ensemble.
Replicates cross-validated prediction process from training.
Parameters: - X (array-like, shape=[n_samples, n_features]) – input matrix to be used for prediction.
- y (array-like, shape=[n_samples, ]) – targets. Needs to be passed as input in model selection mode as some indexers will reduce the size of the input array (X) and y must be adjusted accordingly.
Returns: pred – predictions for provided input array. If in model selection mode, return a tuple (X_trans, y_trans) where y_trans is either y, or a truncated version to match the samples in X_trans.
Return type: array-like or tuple, shape=[n_samples, n_features]
- verbose
Level of printed messages
Ready-made ensemble classes
SuperLearner
- class mlens.ensemble.SuperLearner(folds=2, shuffle=False, random_state=None, scorer=None, raise_on_exception=True, array_check=None, verbose=False, n_jobs=-1, backend='threading', model_selection=False, sample_size=20, layers=None)
Bases: mlens.ensemble.base.BaseEnsemble
Super Learner class.
The Super Learner (also known as the Stacking Ensemble) is a supervised ensemble algorithm that uses K-fold estimation to map a training set \((X, y)\) into a prediction set \((Z, y)\), where the predictions in \(Z\) are constructed using K-fold splits of \(X\) to ensure \(Z\) reflects test errors, and that applies a user-specified meta learner to predict \(y\) from \(Z\). The algorithm in pseudocode follows:
1. Specify a library \(L\) of base learners.
2. Fit all base learners on \(X\) and store the fitted estimators.
3. Split \(X\) into \(K\) folds, fit every learner in \(L\) on the training set and predict the test set. Repeat until all folds have been predicted.
4. Construct a matrix \(Z\) by stacking the predictions per fold.
5. Fit the meta learner on \(Z\) and store the learner.
The ensemble can be used for prediction by mapping a new test set \(T\) into a prediction set \(Z'\) using the learners fitted in (2), and then mapping \(Z'\) to \(y'\) using the fitted meta learner from (5).
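The five steps and the prediction rule above can be sketched in plain NumPy. This is a minimal illustration, not the mlens implementation: RidgeSketch, super_learner_fit and super_learner_predict are hypothetical names introduced here, the folds are simple contiguous index blocks, and the data is synthetic.

```python
import numpy as np


class RidgeSketch:
    """Hypothetical stand-in for a learner exposing fit/predict."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        # Closed-form ridge solution: (X'X + alpha * I)^{-1} X'y
        gram = X.T @ X + self.alpha * np.eye(X.shape[1])
        self.coef_ = np.linalg.solve(gram, X.T @ y)
        return self

    def predict(self, X):
        return X @ self.coef_


def super_learner_fit(library, meta, X, y, n_folds=2):
    n_samples = X.shape[0]
    # Step 2: fit a fresh copy of every base learner on the full training set.
    fitted = [RidgeSketch(est.alpha).fit(X, y) for est in library]
    # Steps 3-4: fill Z with out-of-fold predictions, one column per learner.
    Z = np.zeros((n_samples, len(library)))
    for test_idx in np.array_split(np.arange(n_samples), n_folds):
        train_mask = np.ones(n_samples, dtype=bool)
        train_mask[test_idx] = False
        for col, est in enumerate(library):
            fold_est = RidgeSketch(est.alpha)
            fold_est.fit(X[train_mask], y[train_mask])
            Z[test_idx, col] = fold_est.predict(X[test_idx])
    # Step 5: fit the meta learner on the stacked predictions.
    meta.fit(Z, y)
    return fitted, meta, Z


def super_learner_predict(fitted, meta, T):
    # Map new data T to Z' with the learners from step 2, then apply the meta learner.
    Z_new = np.column_stack([est.predict(T) for est in fitted])
    return meta.predict(Z_new)


# Synthetic regression data (illustrative only).
rng = np.random.RandomState(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=40)

library = [RidgeSketch(alpha=0.1), RidgeSketch(alpha=10.0)]  # step 1
fitted, meta, Z = super_learner_fit(library, RidgeSketch(alpha=1e-3), X, y)
y_hat = super_learner_predict(fitted, meta, X)
```

Note that the meta learner only ever sees out-of-fold predictions during fitting, which is what lets \(Z\) reflect test errors.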
The Super Learner does asymptotically as well as (up to a constant) an Oracle selector. For the theory behind the Super Learner, see [1] and [2] as well as references therein.
Stacking K-fold predictions to cover an entire training set is a time-consuming method and can be prohibitively costly for large datasets. With large data, other ensembles that fit an ensemble on subsets can achieve similar performance at a fraction of the training time. However, when data is noisy or of high variance, the SuperLearner ensures all information is used during fitting.
References
[1] van der Laan, Mark J.; Polley, Eric C.; and Hubbard, Alan E., “Super Learner” (July 2007). U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 222. http://biostats.bepress.com/ucbbiostat/paper222
[2] Polley, Eric C. and van der Laan, Mark J., “Super Learner In Prediction” (May 2010). U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 266. http://biostats.bepress.com/ucbbiostat/paper266
Notes
This implementation uses the agnostic meta learner approach, where the user supplies the meta learner to be used. For the original Super Learner algorithm (i.e. learn the best linear combination of the base learners), the user can specify a linear regression as the meta learner.
Note
All parameters can be overridden in the add method unless otherwise specified. Notably, the backend and n_jobs cannot be altered in the add method.
Parameters: - folds (int (default = 2)) – number of folds to use during fitting. Note: this parameter can be specified on a layer-specific basis in the add method.
- shuffle (bool (default = False)) – whether to shuffle data before processing each layer. This parameter can be overridden in the add method on a per-layer basis.
- random_state (int (default = None)) – random seed for shuffling inputs. Note that the seed here is used to generate a unique seed for each layer. Can be overridden in the add method.
- scorer (object (default = None)) – scoring function. If a function is provided, base estimators will be scored on the training set assembled for fitting the meta estimator. Since those predictions are out-of-sample, the scores represent valid test scores. The scorer should be a function that accepts an array of true values and an array of predictions: score = f(y_true, y_pred).
- raise_on_exception (bool (default = True)) – whether to issue warnings on soft exceptions or raise error. Examples include lack of layers, bad inputs, and failed fit of an estimator in a layer. If set to False, warnings are issued instead but estimation continues unless exception is fatal. Note that this can result in unexpected behavior unless the exception is anticipated.
- verbose (int or bool (default = False)) – level of verbosity.
  - verbose = 0: silent (same as verbose = False)
  - verbose = 1: messages at start and finish (same as verbose = True)
  - verbose = 2: messages for each layer
  If verbose >= 50, prints to sys.stdout, else sys.stderr. For verbosity in the layers themselves, use fit_params.
- n_jobs (int (default = -1)) – Degree of parallel processing. Set to -1 for maximum parallelism and 1 for sequential processing. Cannot be overridden in the add method.
- backend (str or object (default = 'threading')) – backend infrastructure to use during call to mlens.externals.joblib.Parallel. See Joblib for further documentation. To set global backend, set mlens.config._BACKEND. Cannot be overridden in the add method.
- model_selection (bool (default=False)) – Whether to use the ensemble in model selection mode. If True, this will alter the transform method. When calling transform on new data, the ensemble will call predict, while calling transform with the training data reproduces predictions from the fit call. Hence the ensemble can be used as a pure transformer in a preprocessing pipeline passed to the Evaluator, as training folds are faithfully reproduced as during a fit call and test folds are transformed with the predict method.
- sample_size (int (default=20)) – size of training set sample ([min(sample_size, X.size[0]), min(X.size[1], sample_size)]).
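The scorer parameter is described above as a function of the form score = f(y_true, y_pred). A minimal sketch of such a function follows; rmse_scorer is a hypothetical name chosen for illustration and is not part of mlens.

```python
import numpy as np


def rmse_scorer(y_true, y_pred):
    """Hypothetical scorer: root mean squared error of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Errors are (0, 0, -2), so the score is sqrt(4 / 3).
score = rmse_scorer([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Any callable with this two-argument shape should fit the description of the scorer parameter; whether lower or higher is better is up to the metric chosen.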
Examples
Instantiate ensembles with no preprocessing: use list of estimators
>>> from mlens.ensemble import SuperLearner
>>> from mlens.metrics.metrics import rmse
>>> from sklearn.datasets import load_boston
>>> from sklearn.linear_model import Lasso
>>> from sklearn.svm import SVR
>>>
>>> X, y = load_boston(True)
>>>
>>> ensemble = SuperLearner()
>>> ensemble.add([SVR(), ('can name some or all est', Lasso())])
>>> ensemble.add_meta(SVR())
>>>
>>> ensemble.fit(X, y)
>>> preds = ensemble.predict(X)
>>> rmse(y, preds)
6.955358...
Instantiate ensembles with different preprocessing pipelines through dicts.
>>> from mlens.ensemble import SuperLearner
>>> from mlens.metrics.metrics import rmse
>>> from sklearn.datasets import load_boston
>>> from sklearn.preprocessing import MinMaxScaler, StandardScaler
>>> from sklearn.linear_model import Lasso
>>> from sklearn.svm import SVR
>>>
>>> X, y = load_boston(True)
>>>
>>> preprocessing_cases = {'mm': [MinMaxScaler()],
...                        'sc': [StandardScaler()]}
>>>
>>> estimators_per_case = {'mm': [SVR()],
...                        'sc': [('can name some or all ests', Lasso())]}
>>>
>>> ensemble = SuperLearner()
>>> ensemble.add(estimators_per_case, preprocessing_cases).add(SVR(), meta=True)
>>>
>>> ensemble.fit(X, y)
>>> preds = ensemble.predict(X)
>>> rmse(y, preds)
7.841329...
- add(estimators, preprocessing=None, proba=False, meta=False, propagate_features=None, **kwargs)
Add layer to ensemble.
Parameters: - estimators (dict of lists or list or instance) – estimators constituting the layer. If preprocessing is None and the layer is meant to be the meta estimator, it is permissible to pass a single instantiated estimator. If preprocessing is None or list, estimators should be a list. The list can either contain estimator instances, named tuples of estimator instances, or a combination of both.
option_1 = [estimator_1, estimator_2]
option_2 = [("est-1", estimator_1), ("est-2", estimator_2)]
option_3 = [estimator_1, ("est-2", estimator_2)]
If different preprocessing pipelines are desired, a dictionary that maps estimators to preprocessing pipelines must be passed. The names of the estimator dictionary must correspond to the names of the preprocessing dictionary.
preprocessing_cases = {"case-1": [trans_1, trans_2], "case-2": [alt_trans_1, alt_trans_2]}
estimators = {"case-1": [est_a, est_b], "case-2": [est_c, est_d]}
The lists for each dictionary entry can be any of option_1, option_2 and option_3.
- preprocessing (dict of lists or list, optional (default = None)) – preprocessing pipelines for given layer. If the same preprocessing applies to all estimators, preprocessing should be a list of transformer instances. The list can contain the instances directly, named tuples of transformers, or a combination of both.
option_1 = [transformer_1, transformer_2]
option_2 = [("trans-1", transformer_1), ("trans-2", transformer_2)]
option_3 = [transformer_1, ("trans-2", transformer_2)]
If different preprocessing pipelines are desired, a dictionary that maps case names to preprocessing pipelines must be passed. The names of the preprocessing dictionary must correspond to the names of the estimator dictionary.
preprocessing_cases = {"case-1": [trans_1, trans_2], "case-2": [alt_trans_1, alt_trans_2]}
estimators = {"case-1": [est_a, est_b], "case-2": [est_c, est_d]}
The lists for each dictionary entry can be any of option_1, option_2 and option_3.
- proba (bool) – whether layer should predict class probabilities. Note: setting proba=True will attempt to call the estimators' predict_proba method.
- propagate_features (list, optional) – List of column indexes to propagate from the input of the layer to the output of the layer. Propagated features are concatenated and stored in the leftmost columns of the output matrix. The propagate_features list should define a slice of the numpy array containing the input data, e.g. [0, 1] to propagate the first two columns of the input matrix to the output matrix.
- meta (bool (default = False)) – indicator if the layer added is the final meta estimator. This will prevent folded or blended fits of the estimators and only fit them once on the full input data.
- **kwargs (optional) – optional keyword arguments.
Returns: self – ensemble instance with layer instantiated.
Return type: instance
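The effect described for propagate_features (selected input columns placed in the leftmost columns of the layer output) can be illustrated with a small NumPy sketch. The propagate helper below is hypothetical and only mimics the described layout; it is not the mlens code path.

```python
import numpy as np


def propagate(X_in, predictions, propagate_features=None):
    """Hypothetical helper: prepend selected input columns to a prediction matrix."""
    if not propagate_features:
        return predictions
    # Propagated columns go first (leftmost), predictions follow.
    return np.hstack([X_in[:, propagate_features], predictions])


X_in = np.arange(12.0).reshape(4, 3)  # layer input: 4 samples, 3 features
P = np.full((4, 2), -1.0)             # stand-in for 2 columns of layer predictions
out = propagate(X_in, P, propagate_features=[0, 1])
```

Here out has four columns: the first two are copies of input columns 0 and 1, and the last two are the predictions.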
Subsemble
- class mlens.ensemble.Subsemble(partitions=2, partition_estimator=None, folds=2, shuffle=False, random_state=None, scorer=None, raise_on_exception=True, array_check=None, verbose=False, n_jobs=-1, backend=None, model_selection=False, sample_size=20, layers=None)
Bases: mlens.ensemble.base.BaseEnsemble
Subsemble class.
Subsemble is a supervised ensemble algorithm that uses subsets of the full data to fit a layer, and within each subset K-fold estimation to map a training set \((X, y)\) into a prediction set \((Z, y)\), where \(Z\) is a matrix of predictions from each estimator on each subset (thus of shape [n_samples, (partitions * n_estimators)]). \(Z\) is constructed using K-fold splits of each partition of \(X\) to ensure \(Z\) reflects test errors within each partition. A final user-specified meta learner is fitted to the final ensemble layer’s prediction, to learn the best combination of subset-specific estimator predictions. By passing a partition_estimator, the partitions can be learnt. The algorithm in pseudocode:
1. For each layer in the ensemble, do:
  1.1. Specify a library of \(L\) base learners.
  1.2. Specify a partition strategy and partition \(X\) into \(J\) subsets.
  1.3. For each partition, do:
    1.3.1. Fit all base learners and store them.
    1.3.2. Create \(K\) folds.
    1.3.3. For each fold, do:
      - Fit all base learners on the training folds.
      - Collect all test folds, across partitions, and predict.
  1.4. Assemble a cross-validated prediction matrix \(Z \in \mathbb{R}^{(n \times (L \times J))}\) by stacking predictions made in the cross-validation step.
2. Fit the meta learner on \(Z\) and store the learner.
The ensemble can be used for prediction by mapping a new test set \(T\) into a prediction set \(Z'\) using the learners fitted in (1.3.1), and then using \(Z'\) to generate final predictions through the fitted meta learner from (2).
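The single-layer case of the procedure above can be sketched in NumPy. This is a rough illustration under stated assumptions, not the mlens implementation: RidgeSketch and subsemble_layer are hypothetical names, partitions and folds are contiguous index blocks, and the column order of \(Z\) (partition-major) is an arbitrary choice made here.

```python
import numpy as np


class RidgeSketch:
    """Hypothetical base learner exposing fit/predict (closed-form ridge)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        gram = X.T @ X + self.alpha * np.eye(X.shape[1])
        self.coef_ = np.linalg.solve(gram, X.T @ y)
        return self

    def predict(self, X):
        return X @ self.coef_


def subsemble_layer(alphas, X, y, n_partitions=2, n_folds=2):
    """Return learners fitted per partition and Z of shape [n, J * L]."""
    n_samples, n_learners = X.shape[0], len(alphas)
    part_id = np.empty(n_samples, dtype=int)
    fold_id = np.empty(n_samples, dtype=int)
    # Partition strategy (assumed): contiguous blocks, folds created within each block.
    parts = np.array_split(np.arange(n_samples), n_partitions)
    for j, part in enumerate(parts):
        part_id[part] = j
        for k, fold in enumerate(np.array_split(part, n_folds)):
            fold_id[fold] = k

    fitted = []
    Z = np.zeros((n_samples, n_partitions * n_learners))
    for j in range(n_partitions):
        in_part = part_id == j
        # Fit every learner on the whole partition and store it for prediction.
        fitted.extend(RidgeSketch(a).fit(X[in_part], y[in_part]) for a in alphas)
        for k in range(n_folds):
            train = in_part & (fold_id != k)  # training folds of partition j
            test = fold_id == k               # test folds across all partitions
            for m, alpha in enumerate(alphas):
                est = RidgeSketch(alpha).fit(X[train], y[train])
                Z[test, j * n_learners + m] = est.predict(X[test])
    return fitted, Z


# Synthetic regression data (illustrative only).
rng = np.random.RandomState(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=40)

fitted, Z = subsemble_layer([0.1, 10.0], X, y)
meta = RidgeSketch(alpha=1e-3).fit(Z, y)  # fit the meta learner on Z

# Prediction: map new data through the stored learners, then the meta learner.
Z_new = np.column_stack([est.predict(X) for est in fitted])
y_hat = meta.predict(Z_new)
```

With two learners and two partitions, \(Z\) has four columns, matching the [n_samples, (partitions * n_estimators)] shape given in the description.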
The Subsemble does asymptotically as well as (up to a constant) the Oracle selector. For the theory behind the Subsemble, see [3] and references therein.
By partitioning the data into subsets and fitting on those, a Subsemble can reduce training time considerably if estimators do not scale linearly. Moreover, Subsemble allows estimators to learn different patterns from each subset, and so can improve the overall performance by achieving a tighter fit on each subset. Since all observations in the training set are predicted, no information is lost between layers.
This implementation allows very general partition estimators. The user must ensure that the partition estimator behaves as desired. To alter the expected behavior, see the kwd parameter under the add method and the mlens.base.ClusteredSubsetIndex. Also see the advanced tutorials for example use cases.
References
[3] Sapp, S., van der Laan, M. J., & Canny, J. (2014). Subsemble: an ensemble method for combining subset-specific algorithm fits. Journal of Applied Statistics, 41(6), 1247-1259. http://doi.org/10.1080/02664763.2013.864263
Note
All parameters can be overridden in the add method unless otherwise specified. Notably, the backend and n_jobs cannot be altered in the add method.
Parameters: - partitions (int (default = 2)) – number of partitions to split data into. For each layer, increasing partitions increases the number of estimators in the ensemble by a factor equal to the number of estimators. Note: this parameter can be specified on a layer-specific basis in the add method.
- partition_estimator (instance, optional) – To use a supervised or unsupervised estimator to learn partitions, pass an instantiated estimator as partition_estimator. The estimator must accept a fit call for fitting the training data, and a predict call that assigns cluster partition labels. Examples include clustering estimators or classifiers (where their class predictions will be used for partitioning). The number of partitions by the estimator must correspond to the partitions argument. Specific estimators can be added to each layer by passing the estimator during the call to the ensemble’s add method.
- folds (int (default = 2)) – number of folds to use during fitting. Note: this parameter can be specified on a layer-specific basis in the add method.
- shuffle (bool (default = False)) – whether to shuffle data before processing each layer. This parameter can be overridden in the add method on a per-layer basis.
- random_state (int (default = None)) – random seed for shuffling inputs. Note that the seed here is used to generate a unique seed for each layer. Can be overridden in the add method.
- scorer (object (default = None)) – scoring function. If a function is provided, base estimators will be scored on the training set assembled for fitting the meta estimator. Since those predictions are out-of-sample, the scores represent valid test scores. The scorer should be a function that accepts an array of true values and an array of predictions: score = f(y_true, y_pred).
- raise_on_exception (bool (default = True)) – whether to issue warnings on soft exceptions or raise error. Examples include lack of layers, bad inputs, and failed fit of an estimator in a layer. If set to False, warnings are issued instead but estimation continues unless exception is fatal. Note that this can result in unexpected behavior unless the exception is anticipated.
- verbose (int or bool (default = False)) – level of verbosity.
  - verbose = 0: silent (same as verbose = False)
  - verbose = 1: messages at start and finish (same as verbose = True)
  - verbose = 2: messages for each layer
  If verbose >= 50, prints to sys.stdout, else sys.stderr. For verbosity in the layers themselves, use fit_params.
- n_jobs (int (default = -1)) – Degree of concurrency in estimation. Set to -1 to maximize parallelization, while 1 runs on a single process (or thread equivalent). Cannot be overridden in the add method.
Note
A high degree of partitioning can incur a thread overload that can in certain cases overwhelm OpenBLAS. If any of your estimators rely on OpenBLAS and you experience crashes, set n_jobs to a lower value (i.e. -2). In these cases, this will actually not impact performance since the issue stems from having too many threads active, so lowering the count avoids the bottleneck. Reference: https://github.com/xianyi/OpenBLAS/issues/889
- backend (str or object (default = 'threading')) – backend infrastructure to use during call to mlens.externals.joblib.Parallel. See Joblib for further documentation. To set global backend, set mlens.config._BACKEND. Cannot be overridden in the add method.
- model_selection (bool (default=False)) – Whether to use the ensemble in model selection mode. If True, this will alter the transform method. When calling transform on new data, the ensemble will call predict, while calling transform with the training data reproduces predictions from the fit call. Hence the ensemble can be used as a pure transformer in a preprocessing pipeline passed to the Evaluator, as training folds are faithfully reproduced as during a fit call and test folds are transformed with the predict method.
- sample_size (int (default=20)) – size of training set sample ([min(sample_size, X.size[0]), min(X.size[1], sample_size)]).
Examples
Instantiate ensembles with no preprocessing: use list of estimators
>>> from mlens.ensemble import Subsemble
>>> from mlens.metrics.metrics import rmse
>>> from sklearn.datasets import load_boston
>>> from sklearn.linear_model import Lasso
>>> from sklearn.svm import SVR
>>>
>>> X, y = load_boston(True)
>>>
>>> ensemble = Subsemble()
>>> ensemble.add([SVR(), ('can name some or all est', Lasso())])
>>> ensemble.add(SVR(), meta=True)
>>>
>>> ensemble.fit(X, y)
>>> preds = ensemble.predict(X)
>>> rmse(y, preds)
9.2393246...
Instantiate ensembles with different preprocessing pipelines through dicts.
>>> from mlens.ensemble import Subsemble
>>> from mlens.metrics.metrics import rmse
>>> from sklearn.datasets import load_boston
>>> from sklearn.preprocessing import MinMaxScaler, StandardScaler
>>> from sklearn.linear_model import Lasso
>>> from sklearn.svm import SVR
>>>
>>> X, y = load_boston(True)
>>>
>>> preprocessing_cases = {'mm': [MinMaxScaler()],
...                        'sc': [StandardScaler()]}
>>>
>>> estimators_per_case = {'mm': [SVR()],
...                        'sc': [('can name some or all ests', Lasso())]}
>>>
>>> ensemble = Subsemble()
>>> ensemble.add(estimators_per_case, preprocessing_cases).add_meta(SVR())
>>>
>>> ensemble.fit(X, y)
>>> preds = ensemble.predict(X)
>>> rmse(y, preds)
9.0115741...
-
add
(estimators, preprocessing=None, meta=False, partitions=None, partition_estimator=None, folds=None, proba=False, propagate_features=None, **kwargs)[source]¶ Add layer to ensemble.
Parameters: - preprocessing (dict of lists or list, optional (default = None)) –
preprocessing pipelines for given layer. If the same preprocessing applies to all estimators,
preprocessing
should be a list of transformer instances. The list can contain the instances directly, named tuples of transformers, or a combination of both.option_1 = [transformer_1, transformer_2] option_2 = [("trans-1", transformer_1), ("trans-2", transformer_2)] option_3 = [transformer_1, ("trans-2", transformer_2)]
If different preprocessing pipelines are desired, a dictionary that maps preprocessing pipelines must be passed. The names of the preprocessing dictionary must correspond to the names of the estimator dictionary.
preprocessing_cases = {"case-1": [trans_1, trans_2], "case-2": [alt_trans_1, alt_trans_2]} estimators = {"case-1": [est_a, est_b], "case-2": [est_c, est_d]}
The lists for each dictionary entry can be any of
option_1
,option_2
andoption_3
. - estimators (dict of lists or list or instance) –
estimators constituting the layer. If preprocessing is none and the layer is meant to be the meta estimator, it is permissible to pass a single instantiated estimator. If
preprocessing
isNone
orlist
,estimators
should be alist
. The list can either contain estimator instances, named tuples of estimator instances, or a combination of both.option_1 = [estimator_1, estimator_2] option_2 = [("est-1", estimator_1), ("est-2", estimator_2)] option_3 = [estimator_1, ("est-2", estimator_2)]
If different preprocessing pipelines are desired, a dictionary that maps estimators to preprocessing pipelines must be passed. The names of the estimator dictionary must correspond to the names of the estimator dictionary.
preprocessing_cases = {"case-1": [trans_1, trans_2], "case-2": [alt_trans_1, alt_trans_2]} estimators = {"case-1": [est_a, est_b], "case-2": [est_c, est_d]}
The lists for each dictionary entry can be any of
option_1
,option_2
andoption_3
. - meta (bool) – indicator if the layer added is the final meta estimator. This will prevent folded or blended fits of the estimators and only fit them once on the full input data.
- partitions (int, optional) – number of partitions to split data into. Increasing partitions increases the number of estimators in the layer by a factor equal to the number of estimators. Specifying this parameter overrides the ensemble-wide parameter.
- partition_estimator (instance, optional) – To use a supervised or unsupervised estimator to learn partitions,
pass an instantiated estimator as
partition_estimator
. The estimator must accept afit
call for fitting the training data, and apredict
call that assigns cluster partitions labels. For instance, clustering estimator or classifiers (where class predictions will be used for partitioning). The number of partitions by the estimator must correspond to the layer’spartitions
argument. Passing an estimator here supersedes any other estimator previously passed. - folds (int, optional) – Use if a different number of folds is desired than what the ensemble was instantiated with.
- proba (bool (default = False)) – whether to call
predict_proba
on base learners. - propagate_features (list, optional) – List of column indexes to propagate from the input of
the layer to the output of the layer. Propagated features are
concatenated and stored in the leftmost columns of the output
matrix. The
propagate_features
list should define a slice of the numpy array containing the input data, e.g.[0, 1]
to propagate the first two columns of the input matrix to the output matrix. - **kwargs (optional) –
optional keyword arguments to instantiate ensemble with. In particular, keywords for clustered subsemble learning
- fit_estimator (Bool, default = True) -
whether to call
fit
on the partition estimator. - attr (str, default = ‘predict’) - the method attribute to call for generating partition ids for the input data.
- partition_on (str, default = ‘X’) -
the input data for the
attr
method. One of'X'
,'y'
or'both'
.
Returns: self – ensemble instance with layer instantiated.
Return type: instance
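As a rough illustration of the partition_estimator contract described above, the sketch below groups rows by the labels returned from an estimator's predict call. ThresholdPartitioner and partition_rows are hypothetical names used only for this example; they are not part of mlens, and how the library consumes the labels internally is not shown.

```python
from collections import defaultdict


class ThresholdPartitioner:
    """Hypothetical partition estimator: splits rows on the mean of column 0."""

    def fit(self, X, y=None):
        self.threshold_ = sum(row[0] for row in X) / len(X)
        return self

    def predict(self, X):
        # One partition label (0 or 1) per row.
        return [int(row[0] > self.threshold_) for row in X]


def partition_rows(estimator, X, partitions, attr="predict", fit_estimator=True):
    """Group row indices by the label the estimator assigns to each row."""
    if fit_estimator:
        estimator.fit(X)
    labels = getattr(estimator, attr)(X)
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)
    # The number of distinct labels should match the layer's partitions argument.
    if len(groups) != partitions:
        raise ValueError(
            "expected %d partitions, got %d" % (partitions, len(groups))
        )
    return dict(groups)


X = [[0.1], [0.2], [0.8], [0.9]]
groups = partition_rows(ThresholdPartitioner(), X, partitions=2)
print(groups)
```

Any estimator with compatible fit and predict methods, such as a clustering estimator or a classifier, could stand in for the toy class here.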
BlendEnsemble¶
-
class
mlens.ensemble.
BlendEnsemble
(test_size=0.5, shuffle=False, random_state=None, scorer=None, raise_on_exception=True, array_check=None, verbose=False, n_jobs=-1, backend=None, model_selection=False, sample_size=20, layers=None)[source]¶ Bases:
mlens.ensemble.base.BaseEnsemble
Blend Ensemble class.
The Blend Ensemble is a supervised ensemble closely related to the
SuperLearner
. It differs in that to estimate the prediction matrix Z used by the meta learner, it uses a subset of the data to predict its complement, and the meta learner is fitted on those predictions. By only fitting every base learner once on a subset of the full training data,
BlendEnsemble
is a fast ensemble that can handle very large datasets simply by using only a portion of the data at each stage. The cost of this approach is that information is thrown out at each stage, as one layer will not see the training data used by the previous layer. With large data that can be expected to satisfy an i.i.d. assumption, the
BlendEnsemble
can achieve similar performance to more sophisticated ensembles at a fraction of the training time. However, with data that is not uniformly distributed or exhibits high variance, the BlendEnsemble
can be a poor choice as information is lost at each stage of fitting.
See also
Note
All parameters can be overridden in the
add
method unless otherwise specified. Notably, the backend
and n_jobs
cannot be altered in the add
method.
Parameters: - test_size (int, float (default = 0.5)) – the size of the test set for each layer. This parameter can be
overridden in the
add
method if different test sizes are desired for each layer. If a float
is specified, it is presumed to be the fraction of the available data to be held out as the test set, and so 0. < test_size < 1.
. - shuffle (bool (default = False)) – whether to shuffle data before processing each layer. This
parameter can be overridden in the
add
method if different shuffling behavior is desired for each layer. - random_state (int (default = None)) – random seed for shuffling inputs. Note that the seed here is used to
generate a unique seed for each layer. Can be overridden in the
add
method. - scorer (object (default = None)) – scoring function. If a function is provided, base estimators will be
scored on the prediction made. The scorer should be a function that
accepts an array of true values and an array of predictions:
score = f(y_true, y_pred)
. Can be overridden in theadd
method. - raise_on_exception (bool (default = True)) – whether to issue warnings on soft exceptions or raise error.
Examples include lack of layers, bad inputs, and failed fit of an
estimator in a layer. If set to
False
, warnings are issued instead but estimation continues unless exception is fatal. Note that this can result in unexpected behavior unless the exception is anticipated. - verbose (int or bool (default = False)) –
level of verbosity.
- verbose = 0: silent (same as verbose = False)
- verbose = 1: messages at start and finish (same as verbose = True)
- verbose = 2: messages for each layer
If verbose >= 50, prints to sys.stdout, else sys.stderr. For verbosity in the layers themselves, use fit_params
. - n_jobs (int (default = -1)) – Degree of parallel processing. Set to -1 for maximum parallelism and
1 for sequential processing. Cannot be overridden in the
add
method. - backend (str or object (default = 'threading')) – backend infrastructure to use during call to
mlens.externals.joblib.Parallel
. See Joblib for further documentation. To set global backend, set mlens.config._BACKEND
. Cannot be overridden in the add
method. - model_selection (bool (default=False)) – Whether to use the ensemble in model selection mode. If
True
, this will alter thetransform
method. When callingtransform
on new data, the ensemble will callpredict
, while callingtransform
with the training data reproduces predictions from thefit
call. Hence the ensemble can be used as a pure transformer in a preprocessing pipeline passed to theEvaluator
, as training folds are faithfully reproduced as during a fit call and test folds are transformed with the predict
method. - sample_size (int (default=20)) – size of training set sample
(
[min(sample_size, X.size[0]), min(X.size[1], sample_size)]
)
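The blending scheme described above can be sketched in plain Python: base learners are fitted once on a training subset, and their predictions on the held-out complement form the matrix Z on which a meta learner would be fitted. This is a minimal sketch, not the mlens implementation. The helper blend_split, the two toy regressors, and the choice to hold out the tail of the data without shuffling are all assumptions made for illustration.

```python
import statistics


def blend_split(n_samples, test_size=0.5):
    """Return (train_idx, test_idx) for one layer, holding out the tail."""
    if isinstance(test_size, float):
        if not 0.0 < test_size < 1.0:
            raise ValueError("a float test_size must satisfy 0 < test_size < 1")
        n_test = int(round(n_samples * test_size))
    else:
        n_test = int(test_size)
    n_train = n_samples - n_test
    return list(range(n_train)), list(range(n_train, n_samples))


class MeanRegressor:
    """Toy base learner: always predicts the training mean."""

    def fit(self, X, y):
        self.value_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


class MedianRegressor(MeanRegressor):
    """Toy base learner: always predicts the training median."""

    def fit(self, X, y):
        self.value_ = statistics.median(y)
        return self


X = [[float(i)] for i in range(8)]
y = [float(i) for i in range(8)]

train_idx, test_idx = blend_split(len(X), test_size=0.5)
X_train = [X[i] for i in train_idx]
y_train = [y[i] for i in train_idx]
X_test = [X[i] for i in test_idx]

# Each base learner is fitted once on the training subset only.
learners = [MeanRegressor().fit(X_train, y_train),
            MedianRegressor().fit(X_train, y_train)]

# Z has one row per held-out sample and one column per base learner.
columns = [learner.predict(X_test) for learner in learners]
Z = [list(row) for row in zip(*columns)]
print(Z)
```

A meta learner would then be fitted on Z against the held-out targets, which is why each layer only sees part of the data.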
Examples
Instantiate ensembles with no preprocessing: use list of estimators
>>> from mlens.ensemble import BlendEnsemble
>>> from mlens.metrics.metrics import rmse
>>> from sklearn.datasets import load_boston
>>> from sklearn.linear_model import Lasso
>>> from sklearn.svm import SVR
>>>
>>> X, y = load_boston(True)
>>>
>>> ensemble = BlendEnsemble()
>>> ensemble.add([SVR(), ('can name some or all est', Lasso())])
>>> ensemble.add_meta(SVR())
>>>
>>> ensemble.fit(X, y)
>>> preds = ensemble.predict(X)
>>> rmse(y, preds)
7.3337...
Instantiate ensembles with different preprocessing pipelines through dicts.
>>> from mlens.ensemble import BlendEnsemble
>>> from mlens.metrics.metrics import rmse
>>> from sklearn.datasets import load_boston
>>> from sklearn.preprocessing import MinMaxScaler, StandardScaler
>>> from sklearn.linear_model import Lasso
>>> from sklearn.svm import SVR
>>>
>>> X, y = load_boston(True)
>>>
>>> preprocessing_cases = {'mm': [MinMaxScaler()],
...                        'sc': [StandardScaler()]}
>>>
>>> estimators_per_case = {'mm': [SVR()],
...                        'sc': [('can name some or all ests', Lasso())]}
>>>
>>> ensemble = BlendEnsemble()
>>> ensemble.add(estimators_per_case, preprocessing_cases).add(SVR(),
...                                                            meta=True)
>>>
>>> ensemble.fit(X, y)
>>> preds = ensemble.predict(X)
>>> rmse(y, preds)
8.249013
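The scorer parameter expects a plain function of the form score = f(y_true, y_pred). The sketch below shows one such function written from scratch. The name rmse_scorer is hypothetical and is not the rmse shipped in mlens.metrics.

```python
import math


def rmse_scorer(y_true, y_pred):
    """Root mean squared error between true values and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Every prediction is off by exactly one unit.
score = rmse_scorer([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
print(score)
```

A function with this signature could be passed as the scorer argument when the ensemble is instantiated.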
-
add
(estimators, preprocessing=None, proba=False, meta=False, propagate_features=None, **kwargs)[source]¶ Add layer to ensemble.
Parameters: - preprocessing (dict of lists or list, optional (default = None)) –
preprocessing pipelines for given layer. If the same preprocessing applies to all estimators,
preprocessing
should be a list of transformer instances. The list can contain the instances directly, named tuples of transformers, or a combination of both.option_1 = [transformer_1, transformer_2] option_2 = [("trans-1", transformer_1), ("trans-2", transformer_2)] option_3 = [transformer_1, ("trans-2", transformer_2)]
If different preprocessing pipelines are desired, a dictionary that maps case names to preprocessing pipelines must be passed. The names of the preprocessing dictionary must correspond to the names of the estimator dictionary.
preprocessing_cases = {"case-1": [trans_1, trans_2], "case-2": [alt_trans_1, alt_trans_2]} estimators = {"case-1": [est_a, est_b], "case-2": [est_c, est_d]}
The lists for each dictionary entry can be any of
option_1
,option_2
andoption_3
. - estimators (dict of lists or list or instance) –
estimators constituting the layer. If preprocessing is none and the layer is meant to be the meta estimator, it is permissible to pass a single instantiated estimator. If
preprocessing
isNone
orlist
,estimators
should be alist
. The list can either contain estimator instances, named tuples of estimator instances, or a combination of both.option_1 = [estimator_1, estimator_2] option_2 = [("est-1", estimator_1), ("est-2", estimator_2)] option_3 = [estimator_1, ("est-2", estimator_2)]
If different preprocessing pipelines are desired, a dictionary that maps estimators to preprocessing pipelines must be passed. The names of the estimator dictionary must correspond to the names of the preprocessing dictionary.
preprocessing_cases = {"case-1": [trans_1, trans_2], "case-2": [alt_trans_1, alt_trans_2]} estimators = {"case-1": [est_a, est_b], "case-2": [est_c, est_d]}
The lists for each dictionary entry can be any of
option_1
,option_2
andoption_3
. - proba (bool (default = False)) – Whether to call
predict_proba
on base learners. - propagate_features (list, optional) – List of column indexes to propagate from the input of
the layer to the output of the layer. Propagated features are
concatenated and stored in the leftmost columns of the output
matrix. The
propagate_features
list should define a slice of the numpy array containing the input data, e.g.[0, 1]
to propagate the first two columns of the input matrix to the output matrix. - meta (bool (default = False)) – Whether the layer should be treated as the final meta estimator.
- **kwargs (optional) – optional keyword arguments to instantiate layer with.
Returns: self – ensemble instance with layer instantiated.
Return type: instance
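To make the propagate_features behavior concrete, the sketch below prepends the selected input columns to a layer's prediction matrix, so the propagated features end up in the leftmost columns of the output. The function propagate is a hypothetical helper working on nested lists; it only mirrors the description above and is not the library's code.

```python
def propagate(X, predictions, propagate_features=None):
    """Prepend the selected input columns to each row of predictions."""
    if not propagate_features:
        return [list(row) for row in predictions]
    return [
        [x_row[j] for j in propagate_features] + list(p_row)
        for x_row, p_row in zip(X, predictions)
    ]


X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
P = [[0.1],
     [0.2]]

# Propagate the first two input columns to the output matrix.
out = propagate(X, P, propagate_features=[0, 1])
print(out)
```

With a numpy array, the same column selection corresponds to indexing the input as X[:, [0, 1]].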
TemporalEnsemble¶
-
class
mlens.ensemble.
TemporalEnsemble
(step_size=1, burn_in=None, window=None, lag=0, scorer=None, raise_on_exception=True, array_check=None, verbose=False, n_jobs=-1, backend='threading', model_selection=False, sample_size=20, layers=None)[source]¶ Bases:
mlens.ensemble.base.BaseEnsemble
Temporal ensemble class.
The temporal ensemble class uses a time series cross-validation strategy to create training and test folds that preserve temporal ordering in the data. The cross validation strategy is unrolled through time. For instance:
fold | train obs | test obs |
---|---|---|
0 | 0, 1, 2, 3 | 4 |
1 | 0, 1, 2, 3, 4 | 5 |
2 | 0, 1, 2, 3, 4, 5 | 6 |
Different estimators in the ensemble can operate on different time scales, allowing efficient combinations of different temporal patterns in one model.
See also
Note
All parameters can be overridden in the
add
method unless otherwise specified. Notably, the backend
and n_jobs
cannot be altered in the add
method.Parameters: - step_size (int (default=1)) – number of samples to use in each test fold. The final window size may be smaller if too few observations remain.
- burn_in (int (default=None)) – number of samples to use for first training fold. These observations
will be dropped from the output. Defaults to
step_size
. - window (int (default=None)) – number of previous samples to use in each training fold, except first
which is determined by
burn_in
. IfNone
, will use all previous observations. - lag (int (default=0)) – distance between the most recent training point in the training fold and
the first test point. For
lag>0
, the training fold and the test fold will not be contiguous. - scorer (object (default = None)) – scoring function. If a function is provided, base estimators will be
scored on the training set assembled for fitting the meta estimator.
Since those predictions are out-of-sample, the scores represent valid
test scores. The scorer should be a function that accepts an array of
true values and an array of predictions:
score = f(y_true, y_pred)
. - raise_on_exception (bool (default = True)) – whether to issue warnings on soft exceptions or raise error.
Examples include lack of layers, bad inputs, and failed fit of an
estimator in a layer. If set to
False
, warnings are issued instead but estimation continues unless exception is fatal. Note that this can result in unexpected behavior unless the exception is anticipated. - verbose (int or bool (default = False)) –
level of verbosity.
- verbose = 0: silent (same as verbose = False)
- verbose = 1: messages at start and finish (same as verbose = True)
- verbose = 2: messages for each layer
If verbose >= 50, prints to sys.stdout, else sys.stderr. For verbosity in the layers themselves, use fit_params
. - n_jobs (int (default = -1)) – Degree of parallel processing. Set to -1 for maximum parallelism and
1 for sequential processing. Cannot be overridden in the
add
method. - backend (str or object (default = 'threading')) – backend infrastructure to use during call to
mlens.externals.joblib.Parallel
. See Joblib for further documentation. To set global backend, set mlens.config._BACKEND
. Cannot be overridden in the add
method. - model_selection (bool (default=False)) – Whether to use the ensemble in model selection mode. If
True
, this will alter thetransform
method. When callingtransform
on new data, the ensemble will callpredict
, while callingtransform
with the training data reproduces predictions from thefit
call. Hence the ensemble can be used as a pure transformer in a preprocessing pipeline passed to theEvaluator
, as training folds are faithfully reproduced as during a fit call and test folds are transformed with the predict
method. - sample_size (int (default=20)) – size of training set sample
(
[min(sample_size, X.size[0]), min(X.size[1], sample_size)]
)
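One way to read the step_size, burn_in, window and lag parameters is as a generator of train and test index ranges. The sketch below is an interpretation of the descriptions above and not the library's indexer. The function name temporal_folds is hypothetical, and treating lag as the number of skipped observations between the last training point and the first test point is an assumption.

```python
def temporal_folds(n_samples, step_size=1, burn_in=None, window=None, lag=0):
    """Return a list of (train_idx, test_idx) pairs that preserve time order."""
    if step_size < 1:
        raise ValueError("step_size must be at least 1")
    if burn_in is None:
        burn_in = step_size  # documented default

    folds = []
    test_start = burn_in + lag
    while test_start < n_samples:
        # The final test fold may be shorter if too few observations remain.
        test_stop = min(test_start + step_size, n_samples)
        train_stop = test_start - lag
        if window is None or not folds:
            # The first fold is determined by burn_in; without a window,
            # all previous observations are used.
            train_start = 0
        else:
            train_start = max(0, train_stop - window)
        folds.append((list(range(train_start, train_stop)),
                      list(range(test_start, test_stop))))
        test_start = test_stop
    return folds


for fold, (train, test) in enumerate(temporal_folds(7, step_size=1, burn_in=4)):
    print(fold, train, test)
```

With n_samples=7, step_size=1 and burn_in=4, the sketch yields the three folds listed in the table in the class description. Passing window=2 would instead keep only the two most recent observations in every training fold after the first.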
Examples
>>> from sklearn.linear_model import LinearRegression
>>> from mlens.ensemble import TemporalEnsemble
>>> import numpy as np
>>>
>>> x = np.linspace(0, 1, 100)
>>> y = x[1:]
>>> x = x[:-1]
>>> x = x.reshape(-1, 1)
>>>
>>> ens = TemporalEnsemble(window=1)
>>> ens.add(LinearRegression())
>>>
>>> ens.fit(x, y)
>>> p = ens.predict(x)
>>>
>>> print("{:5} | {:5}".format('pred', 'truth'))
>>> for i in range(5, 10):
...     print("{:.3f} | {:.3f}".format(p[i], y[i]))
pred  | truth
0.061 | 0.061
0.071 | 0.071
0.081 | 0.081
0.091 | 0.091
0.101 | 0.101
-
add
(estimators, preprocessing=None, proba=False, meta=False, propagate_features=None, **kwargs)[source]¶ Add layer to ensemble.
Parameters: - estimators (dict of lists or list or instance) –
estimators constituting the layer. If preprocessing is none and the layer is meant to be the meta estimator, it is permissible to pass a single instantiated estimator. If
preprocessing
isNone
orlist
,estimators
should be alist
. The list can either contain estimator instances, named tuples of estimator instances, or a combination of both.option_1 = [estimator_1, estimator_2] option_2 = [("est-1", estimator_1), ("est-2", estimator_2)] option_3 = [estimator_1, ("est-2", estimator_2)]
If different preprocessing pipelines are desired, a dictionary that maps estimators to preprocessing pipelines must be passed. The names of the estimator dictionary must correspond to the names of the preprocessing dictionary.
preprocessing_cases = {"case-1": [trans_1, trans_2], "case-2": [alt_trans_1, alt_trans_2]} estimators = {"case-1": [est_a, est_b], "case-2": [est_c, est_d]}
The lists for each dictionary entry can be any of
option_1
,option_2
andoption_3
. - preprocessing (dict of lists or list, optional (default = None)) –
preprocessing pipelines for given layer. If the same preprocessing applies to all estimators,
preprocessing
should be a list of transformer instances. The list can contain the instances directly, named tuples of transformers, or a combination of both.option_1 = [transformer_1, transformer_2] option_2 = [("trans-1", transformer_1), ("trans-2", transformer_2)] option_3 = [transformer_1, ("trans-2", transformer_2)]
If different preprocessing pipelines are desired, a dictionary that maps case names to preprocessing pipelines must be passed. The names of the preprocessing dictionary must correspond to the names of the estimator dictionary.
preprocessing_cases = {"case-1": [trans_1, trans_2], "case-2": [alt_trans_1, alt_trans_2]} estimators = {"case-1": [est_a, est_b], "case-2": [est_c, est_d]}
The lists for each dictionary entry can be any of
option_1
,option_2
andoption_3
. - proba (bool) – whether layer should predict class probabilities. Note: setting
proba=True
will attempt to call the estimator's predict_proba
method. - propagate_features (list, optional) – List of column indexes to propagate from the input of
the layer to the output of the layer. Propagated features are
concatenated and stored in the leftmost columns of the output
matrix. The
propagate_features
list should define a slice of the numpy array containing the input data, e.g.[0, 1]
to propagate the first two columns of the input matrix to the output matrix. - meta (bool (default = False)) – indicator if the layer added is the final meta estimator. This will prevent folded or blended fits of the estimators and only fit them once on the full input data.
- **kwargs (optional) – optional keyword arguments.
Returns: self – ensemble instance with layer instantiated.
Return type: instance
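The accepted input shapes for estimators and preprocessing can be illustrated with two small helpers: one that gives a name to every unnamed entry in a mixed list (option_3 above), and one that checks that the estimator and preprocessing dictionaries use the same case names. Both helpers, the placeholder Model class, and the automatic naming scheme are hypothetical; mlens may name and validate its inputs differently.

```python
class Model:
    """Placeholder standing in for an instantiated estimator."""


def name_estimators(estimators):
    """Turn a mixed list of instances and (name, instance) tuples into tuples."""
    named = []
    for position, est in enumerate(estimators):
        if isinstance(est, tuple):
            named.append(est)  # already a (name, instance) pair
        else:
            # Assumed naming scheme: lowercased class name plus list position.
            auto_name = "%s-%d" % (type(est).__name__.lower(), position)
            named.append((auto_name, est))
    return named


def check_cases(estimators, preprocessing):
    """Check that both dictionaries share the same case names."""
    if not (isinstance(estimators, dict) and isinstance(preprocessing, dict)):
        raise TypeError("both arguments must be dictionaries keyed by case name")
    unmatched = sorted(set(estimators) ^ set(preprocessing))
    if unmatched:
        raise ValueError("case names without a counterpart: %s" % unmatched)
    return sorted(estimators)


named = name_estimators([Model(), ("est-2", Model())])
print([name for name, _ in named])

cases = check_cases(
    {"case-1": [Model(), Model()], "case-2": [Model()]},
    {"case-1": ["trans_1", "trans_2"], "case-2": ["alt_trans_1"]},
)
print(cases)
```

The strings in the preprocessing dictionary are stand-ins for transformer instances and are used only to keep the example self-contained.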
Sequential¶
-
class
mlens.ensemble.
SequentialEnsemble
(shuffle=False, random_state=None, scorer=None, raise_on_exception=True, array_check=None, verbose=False, n_jobs=-1, backend=None, model_selection=False, sample_size=20, layers=None)[source]¶ Bases:
mlens.ensemble.base.BaseEnsemble
Sequential Ensemble class.
The Sequential Ensemble class allows users to build ensembles with different classes of layers. The type of layer and its parameters are specified when added to the ensemble. See respective ensemble class for details on parameters.
See also
Parameters: - shuffle (bool (default = False)) – whether to shuffle data before processing each layer.
For greater control, specify
shuffle
when adding the layer. - random_state (int (default = None)) – random seed if shuffling inputs.
- scorer (object (default = None)) – scoring function. If a function is provided, base estimators will be
scored on the training set assembled for fitting the meta estimator.
Since those predictions are out-of-sample, the scores represent valid
test scores. The scorer should be a function that accepts an array of
true values and an array of predictions:
score = f(y_true, y_pred)
. - raise_on_exception (bool (default = True)) – whether to issue warnings on soft exceptions or raise error.
Examples include lack of layers, bad inputs, and failed fit of an
estimator in a layer. If set to
False
, warnings are issued instead but estimation continues unless exception is fatal. Note that this can result in unexpected behavior unless the exception is anticipated. - verbose (int or bool (default = False)) –
level of verbosity.
- verbose = 0: silent (same as verbose = False)
- verbose = 1: messages at start and finish (same as verbose = True)
- verbose = 2: messages for each layer
If verbose >= 50, prints to sys.stdout, else sys.stderr. For verbosity in the layers themselves, use fit_params
. - n_jobs (int (default = -1)) – number of CPU cores to use for fitting and prediction.
- backend (str or object (default = 'threading')) – backend infrastructure to use during call to
mlens.externals.joblib.Parallel
. See Joblib for further documentation. To change global backend, set mlens.config._BACKEND.
- model_selection (bool (default=False)) – Whether to use the ensemble in model selection mode. If
True
, this will alter thetransform
method. When callingtransform
on new data, the ensemble will callpredict
, while callingtransform
with the training data reproduces predictions from thefit
call. Hence the ensemble can be used as a pure transformer in a preprocessing pipeline passed to theEvaluator
, as training folds are faithfully reproduced as during a fit call and test folds are transformed with the predict
method. - sample_size (int (default=20)) – size of training set sample
(
[min(sample_size, X.size[0]), min(X.size[1], sample_size)]
)
Examples
>>> from mlens.ensemble import SequentialEnsemble
>>> from mlens.metrics.metrics import rmse
>>> from sklearn.datasets import load_boston
>>> from sklearn.linear_model import Lasso
>>> from sklearn.svm import SVR
>>>
>>> X, y = load_boston(True)
>>>
>>> ensemble = SequentialEnsemble()
>>>
>>> # Add a subsemble with 10 partitions as first layer
>>> ensemble.add('subsemble', [SVR(), Lasso()], partitions=10, folds=10)
>>>
>>> # Add a super learner as second layer
>>> ensemble.add('stack', [SVR(), Lasso()], folds=20)
>>>
>>> # Specify a meta estimator
>>> ensemble.add_meta(SVR())
>>>
>>> ensemble.fit(X, y)
>>> preds = ensemble.predict(X)
>>> rmse(y, preds)
6.5628...
-
add
(cls, estimators, preprocessing=None, meta=False, **kwargs)[source]¶ Add layer to ensemble.
For full set of optional arguments, see the ensemble API for the specified type.
Parameters: - cls (str) –
layer class. Accepted types are:
- ’blend’ : blend ensemble
- ’subsemble’ : subsemble
- ’stack’ : super learner
- estimators (dict of lists or list or instance) –
estimators constituting the layer. If preprocessing is none and the layer is meant to be the meta estimator, it is permissible to pass a single instantiated estimator. If
preprocessing
isNone
orlist
,estimators
should be alist
. The list can either contain estimator instances, named tuples of estimator instances, or a combination of both.option_1 = [estimator_1, estimator_2] option_2 = [("est-1", estimator_1), ("est-2", estimator_2)] option_3 = [estimator_1, ("est-2", estimator_2)]
If different preprocessing pipelines are desired, a dictionary that maps estimators to preprocessing pipelines must be passed. The names of the estimator dictionary must correspond to the names of the preprocessing dictionary.
preprocessing_cases = {"case-1": [trans_1, trans_2], "case-2": [alt_trans_1, alt_trans_2]} estimators = {"case-1": [est_a, est_b], "case-2": [est_c, est_d]}
The lists for each dictionary entry can be any of
option_1
,option_2
andoption_3
. - preprocessing (dict of lists or list, optional (default = None)) –
preprocessing pipelines for given layer. If the same preprocessing applies to all estimators,
preprocessing
should be a list of transformer instances. The list can contain the instances directly, named tuples of transformers, or a combination of both.option_1 = [transformer_1, transformer_2] option_2 = [("trans-1", transformer_1), ("trans-2", transformer_2)] option_3 = [transformer_1, ("trans-2", transformer_2)]
If different preprocessing pipelines are desired, a dictionary that maps case names to preprocessing pipelines must be passed. The names of the preprocessing dictionary must correspond to the names of the estimator dictionary.
preprocessing_cases = {"case-1": [trans_1, trans_2], "case-2": [alt_trans_1, alt_trans_2]} estimators = {"case-1": [est_a, est_b], "case-2": [est_c, est_d]}
The lists for each dictionary entry can be any of
option_1
,option_2
andoption_3
. - **kwargs (optional) – optional keyword arguments to instantiate layer with. See respective ensemble for further details.
Returns: self – ensemble instance with layer instantiated.
Return type: instance