ML-ENSEMBLE

author: Sebastian Flennerhag
copyright: 2017-2018
license: MIT

Metric utilities and functions.

mlens.metrics
Data

class mlens.metrics.Data(data=None, padding=2, decimals=2)

Bases: collections.OrderedDict
Wrapper class around dict to get pretty prints.

Data is an ordered dictionary that implements a dedicated pretty print method for a nested dictionary. Printing a Data dictionary produces a human-readable table. The input dictionary is expected to have two levels: the first level gives the columns and the second level the rows. Row names are parsed as [OUTER]/[MIDDLE].[INNER]--[IDX], where IDX must be an integer. All entries are optional.
Warning

Data is an internal class that expects input in a particular format. It cannot be used as a general drop-in replacement for the standard dict class.

Examples
>>> from mlens.metrics import Data
>>> d = [('row-idx-1.row-idx-2.0.0', {'column-1': 0.1, 'column-2': 0.1})]
>>> data = Data(d)
>>> print(data)
                      column-1  column-2
row-idx-1  row-idx-2      0.10      0.10
assemble_data

mlens.metrics.assemble_data(data_list)

Build a data dictionary out of a list of entries and data dicts.

Given a list of named tuples of dictionaries, assemble_data() returns a nested ordered dictionary with data keys as outer keys and tuple names as inner keys. The returned dictionary can be printed in tabular format by assemble_table().
Examples
>>> from mlens.metrics import assemble_data, assemble_table
>>> d = [('row-idx-1.row-idx-2.a.b', {'column-1': 0.1, 'column-2': 0.1})]
>>> print(assemble_table(assemble_data(d)))
                      column-2-m  column-2-s  column-1-m  column-1-s
row-idx-1  row-idx-2        0.10        0.00        0.10        0.00
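The nested layout can also be inspected directly. A minimal sketch, reusing the input above; the exact outer key names (here the -m/-s mean and standard-deviation suffixes shown in the table) are assumed to depend on the input:

from mlens.metrics import assemble_data

d = [('row-idx-1.row-idx-2.a.b', {'column-1': 0.1, 'column-2': 0.1})]
nested = assemble_data(d)

for column, rows in nested.items():       # outer keys: data (column) names
    for row_name, value in rows.items():  # inner keys: tuple (row) names
        print(column, row_name, value)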
assemble_table

mlens.metrics.assemble_table(data, padding=2, decimals=2)

Construct data table from input dict.

Given a nested dictionary formed by assemble_data(), assemble_table() returns a string that prints the contents of the input in tabular format. The input dictionary is expected to have two levels: the first level gives the columns and the second level the rows. Row names are parsed as [OUTER]/[MIDDLE].[INNER]--[IDX], where IDX must be an integer. All entries are optional.
Examples
>>> from mlens.metrics import assemble_data, assemble_table
>>> d = [('row-idx-1.row-idx-2.a.b', {'column-1': 0.1, 'column-2': 0.1})]
>>> print(assemble_table(assemble_data(d)))
                      column-2-m  column-2-s  column-1-m  column-1-s
row-idx-1  row-idx-2        0.10        0.00        0.10        0.00
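A minimal sketch of adjusting the layout parameters, reusing the input above (output omitted); padding and decimals are the keyword arguments documented in the signature, while the wider padding and three decimals are arbitrary choices for illustration:

from mlens.metrics import assemble_data, assemble_table

d = [('row-idx-1.row-idx-2.a.b', {'column-1': 0.1, 'column-2': 0.1})]
# Widen the column padding and print three decimals instead of the defaults (2, 2).
print(assemble_table(assemble_data(d), padding=4, decimals=3))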
make_scorer

mlens.metrics.make_scorer(score_func, greater_is_better=True, needs_proba=False, needs_threshold=False, **kwargs)

Make a scorer from a performance metric or loss function.

This factory function wraps scoring functions for use in GridSearchCV and cross_val_score. It takes a score function, such as accuracy_score, mean_squared_error, adjusted_rand_index or average_precision, and returns a callable that scores an estimator's output.

Read more in the User Guide.
Parameters:

- score_func (callable) – Score function (or loss function) with signature score_func(y, y_pred, **kwargs).
- greater_is_better (boolean, default=True) – Whether score_func is a score function (default), meaning high is good, or a loss function, meaning low is good. In the latter case, the scorer object will sign-flip the outcome of score_func.
- needs_proba (boolean, default=False) – Whether score_func requires predict_proba to get probability estimates out of a classifier.
- needs_threshold (boolean, default=False) – Whether score_func takes a continuous decision certainty. This only works for binary classification using estimators that have either a decision_function or predict_proba method. For example, average_precision or the area under the ROC curve cannot be computed using discrete predictions alone.
- **kwargs (additional arguments) – Additional parameters to be passed to score_func.

Returns: scorer – Callable object that returns a scalar score; greater is better.

Return type: callable
Examples
>>> from sklearn.metrics import fbeta_score, make_scorer
>>> ftwo_scorer = make_scorer(fbeta_score, beta=2)
>>> ftwo_scorer
make_scorer(fbeta_score, beta=2)
>>> from sklearn.model_selection import GridSearchCV
>>> from sklearn.svm import LinearSVC
>>> grid = GridSearchCV(LinearSVC(), param_grid={'C': [1, 10]},
...                     scoring=ftwo_scorer)
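The same factory can wrap the loss functions documented further down on this page. A minimal sketch, assuming the standard scikit-learn scorer protocol scorer(estimator, X, y); the mlens.metrics imports mirror the signatures documented here:

from mlens.metrics import make_scorer, rmse

# rmse is a loss (lower is better), so request a sign flip via
# greater_is_better=False, as described for that parameter above.
rmse_scorer = make_scorer(rmse, greater_is_better=False)

# The returned callable can then be passed as `scoring` to model selection
# utilities, e.g. GridSearchCV(..., scoring=rmse_scorer).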
rmse

mlens.metrics.rmse(y, p)

Root Mean Square Error.

\[RMSE(\mathbf{y}, \mathbf{p}) = \sqrt{MSE(\mathbf{y}, \mathbf{p})},\]

with

\[MSE(\mathbf{y}, \mathbf{p}) = \frac{1}{|S|} \sum_{i \in S} (y_i - p_i)^2\]

Parameters:

- y (array-like of shape [n_samples, ]) – ground truth.
- p (array-like of shape [n_samples, ]) – predicted labels.

Returns: z – root mean squared error.

Return type: float
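An illustrative reimplementation of the formula above (a sketch, not the library's code), assuming NumPy arrays of equal length:

import numpy as np

def rmse_sketch(y, p):
    """Root mean squared error: square root of the mean squared residual."""
    y, p = np.asarray(y, dtype=float), np.asarray(p, dtype=float)
    return np.sqrt(np.mean((y - p) ** 2))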
mape

mlens.metrics.mape(y, p)

Mean Average Percentage Error.

\[MAPE(\mathbf{y}, \mathbf{p}) = \frac{1}{|S|} \sum_{i \in S} \left| \frac{y_i - p_i}{y_i} \right|\]

Parameters:

- y (array-like of shape [n_samples, ]) – ground truth.
- p (array-like of shape [n_samples, ]) – predicted labels.

Returns: z – mean average percentage error.

Return type: float
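An illustrative reimplementation of the formula above (a sketch, not the library's code); note the measure is undefined whenever a ground-truth value is zero:

import numpy as np

def mape_sketch(y, p):
    """Mean average percentage error over the sample."""
    y, p = np.asarray(y, dtype=float), np.asarray(p, dtype=float)
    return np.mean(np.abs((y - p) / y))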
wape

mlens.metrics.wape(y, p)

Weighted Mean Average Percentage Error.

\[WAPE(\mathbf{y}, \mathbf{p}) = \frac{\sum_{i \in S} |y_i - p_i|}{\sum_{i \in S} |y_i|}\]

Parameters:

- y (array-like of shape [n_samples, ]) – ground truth.
- p (array-like of shape [n_samples, ]) – predicted labels.

Returns: z – weighted mean average percentage error.

Return type: float
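An illustrative reimplementation of the formula above (a sketch, not the library's code):

import numpy as np

def wape_sketch(y, p):
    """Weighted average percentage error: total absolute error over total |y|."""
    y, p = np.asarray(y, dtype=float), np.asarray(p, dtype=float)
    return np.sum(np.abs(y - p)) / np.sum(np.abs(y))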