ML-ENSEMBLE

author:Sebastian Flennerhag
copyright:2017
license:MIT

Metric utilities and functions.

mlens.metrics

Data

class mlens.metrics.Data(data=None, padding=2, decimals=2)[source]

Bases: collections.OrderedDict

Wrapper class around OrderedDict that provides pretty printing

Data is an ordered dictionary that implements a dedicated pretty-print method for a nested dictionary. Printing a Data dictionary produces a human-readable table. The input dictionary is expected to have two levels: the first level gives the columns and the second level the rows. Row names are parsed as [OUTER]/[MIDDLE].[INNER]--[IDX], where IDX must be an integer. All entries are optional.

Warning

Data is an internal class that expects input in a particular format. It cannot be used as a general drop-in replacement for the standard dict class.

Examples

>>> from mlens.metrics import Data
>>> d = [('row-idx-1.row-idx-2.0.0', {'column-1': 0.1, 'column-2': 0.1})]
>>> data = Data(d)
>>> print(data)
                        column-1  column-2
row-idx-1  row-idx-2        0.10      0.10

assemble_data

mlens.metrics.assemble_data(data_list)[source]

Build a data dictionary out of a list of entries and data dicts

Given a list of (name, data dict) tuples, assemble_data() returns a nested ordered dictionary with data keys as outer keys and tuple names as inner keys. As the example below shows, each column is aggregated into a mean (-m) and standard deviation (-s) column. The returned dictionary can be printed in tabular format by assemble_table().

Examples

>>> from mlens.metrics import assemble_data, assemble_table
>>> d = [('row-idx-1.row-idx-2.a.b', {'column-1': 0.1, 'column-2': 0.1})]
>>> print(assemble_table(assemble_data(d)))
                        column-2-m  column-2-s  column-1-m  column-1-s
row-idx-1  row-idx-2          0.10        0.00        0.10        0.00

assemble_table

mlens.metrics.assemble_table(data, padding=2, decimals=2)[source]

Construct data table from input dict

Given a nested dictionary formed by assemble_data(), assemble_table() returns a string that prints the contents of the input in tabular format. The input dictionary is expected to have two levels: the first level gives the columns and the second level the rows. Row names are parsed as [OUTER]/[MIDDLE].[INNER]--[IDX], where IDX must be an integer. All entries are optional.

See also

Data, assemble_data()

Examples

>>> from mlens.metrics import assemble_data, assemble_table
>>> d = [('row-idx-1.row-idx-2.a.b', {'column-1': 0.1, 'column-2': 0.1})]
>>> print(assemble_table(assemble_data(d)))
                        column-2-m  column-2-s  column-1-m  column-1-s
row-idx-1  row-idx-2          0.10        0.00        0.10        0.00

make_scorer

mlens.metrics.make_scorer(score_func, greater_is_better=True, needs_proba=False, needs_threshold=False, **kwargs)[source]

Make a scorer from a performance metric or loss function.

This factory function wraps scoring functions for use in GridSearchCV and cross_val_score. It takes a score function, such as accuracy_score, mean_squared_error, adjusted_rand_index or average_precision and returns a callable that scores an estimator’s output.

Read more in the User Guide.

Parameters:
  • score_func (callable) – Score function (or loss function) with signature score_func(y, y_pred, **kwargs).
  • greater_is_better (boolean, default=True) – Whether score_func is a score function (default), meaning high is good, or a loss function, meaning low is good. In the latter case, the scorer object will sign-flip the outcome of the score_func.
  • needs_proba (boolean, default=False) – Whether score_func requires predict_proba to get probability estimates out of a classifier.
  • needs_threshold (boolean, default=False) – Whether score_func takes a continuous decision certainty. This only works for binary classification using estimators that have either a decision_function or predict_proba method. For example, average_precision or the area under the ROC curve cannot be computed from discrete predictions alone.

  • **kwargs (additional arguments) – Additional parameters to be passed to score_func.
Returns:

scorer – Callable object that returns a scalar score; greater is better.

Return type:

callable

Examples

>>> from sklearn.metrics import fbeta_score, make_scorer
>>> ftwo_scorer = make_scorer(fbeta_score, beta=2)
>>> ftwo_scorer
make_scorer(fbeta_score, beta=2)
>>> from sklearn.model_selection import GridSearchCV
>>> from sklearn.svm import LinearSVC
>>> grid = GridSearchCV(LinearSVC(), param_grid={'C': [1, 10]},
...                     scoring=ftwo_scorer)
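The signature above mirrors scikit-learn's make_scorer. As a sketch of the greater_is_better=False sign-flip described under Parameters, the example below wraps mean_squared_error as a loss and scores a toy estimator (ConstantModel is a hypothetical stand-in, not part of either library):

```python
from sklearn.metrics import mean_squared_error, make_scorer

# greater_is_better=False tells the scorer that score_func is a loss,
# so the returned value is sign-flipped: greater is always better.
neg_mse_scorer = make_scorer(mean_squared_error, greater_is_better=False)


class ConstantModel:
    """Toy estimator that always predicts a fixed value."""

    def __init__(self, c):
        self.c = c

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [self.c] * len(X)


model = ConstantModel(1.0).fit([[0]], [1.0])
# True targets [1.0, 3.0] vs predictions [1.0, 1.0]: MSE = (0 + 4) / 2 = 2.0,
# which the scorer sign-flips to -2.0.
score = neg_mse_scorer(model, [[0], [0]], [1.0, 3.0])
```

A model-selection tool that maximizes this scorer therefore minimizes the underlying loss.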

rmse

mlens.metrics.rmse(y, p)[source]

Root Mean Square Error.

\[RMSE(\mathbf{y}, \mathbf{p}) = \sqrt{MSE(\mathbf{y}, \mathbf{p})},\]

with

\[MSE(\mathbf{y}, \mathbf{p}) = \frac{1}{|S|} \sum_{i \in S} (y_i - p_i)^2\]
Parameters:
  • y (array-like of shape [n_samples, ]) – ground truth.
  • p (array-like of shape [n_samples, ]) – predicted labels.
Returns:

z – root mean squared error.

Return type:

float
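A minimal NumPy sketch of the formula above (an illustration, not the library's implementation):

```python
import numpy as np


def rmse(y, p):
    """Root mean squared error: sqrt(mean((y - p)^2))."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.sqrt(np.mean((y - p) ** 2)))
```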

mape

mlens.metrics.mape(y, p)[source]

Mean Average Percentage Error.

\[MAPE(\mathbf{y}, \mathbf{p}) = \frac{1}{|S|} \sum_{i \in S} \left| \frac{y_i - p_i}{y_i} \right|\]
Parameters:
  • y (array-like of shape [n_samples, ]) – ground truth.
  • p (array-like of shape [n_samples, ]) – predicted labels.
Returns:

z – mean average percentage error.

Return type:

float
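A minimal NumPy sketch of the formula above (an illustration, not the library's implementation; note it is undefined where y_i = 0):

```python
import numpy as np


def mape(y, p):
    """Mean average percentage error: mean(|(y - p) / y|)."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean(np.abs((y - p) / y)))
```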

wape

mlens.metrics.wape(y, p)[source]

Weighted Mean Average Percentage Error.

\[WAPE(\mathbf{y}, \mathbf{p}) = \frac{\sum_{i \in S} | y_i - p_i|}{ \sum_{i \in S} |y_i|}\]
Parameters:
  • y (array-like of shape [n_samples, ]) – ground truth.
  • p (array-like of shape [n_samples, ]) – predicted labels.
Returns:

z – weighted mean average percentage error.

Return type:

float
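A minimal NumPy sketch of the formula above (an illustration, not the library's implementation). Unlike mape, wape normalizes by the total magnitude of the ground truth, so individual zeros in y do not blow up the score:

```python
import numpy as np


def wape(y, p):
    """Weighted mean average percentage error: sum(|y - p|) / sum(|y|)."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.sum(np.abs(y - p)) / np.sum(np.abs(y)))
```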