ML-ENSEMBLE

author:	Sebastian Flennerhag
copyright:	2017-2018
license:	MIT

mlens.visualization

corrmat

mlens.visualization.corrmat(corr, figsize=(11, 9), annotate=True, inflate=True, linewidths=0.5, cbar_kws='default', show=True, ax=None, title='Correlation Matrix', title_font_size=14, **kwargs)

Function for generating color-coded correlation triangle.

Parameters:

corr (array-like of shape = [n_features, n_features]) – Input correlation matrix. Pass a pandas `DataFrame` for axis labels.
figsize (tuple (default = (11, 9))) – Size of printed figure.
annotate (bool (default = True)) – Whether to print the correlation coefficients.
inflate (bool (default = True)) – Whether to inflate correlation coefficients to a 0-100 scale. Avoids decimal points in the figure, which otherwise often looks cluttered.
linewidths (float (default = 0.5)) – Width of the line separating each coordinate square.
cbar_kws (dict, str (default = 'default')) – Optional arguments to the color bar. The default option, 'default', passes the `shrink` parameter so that the color bar fits the standard figure frame.
show (bool (default = True)) – Whether to print the figure using `matplotlib.pyplot.show`.
title (str) – Figure title if shown.
title_font_size (int) – Title font size.
ax (object, optional) – Axis to attach the plot to.
**kwargs (optional) – Other optional arguments to sns heatmap.

Returns:	ax – axis object.
Return type:	object
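
As a rough illustration of what `inflate` and the triangle layout amount to, the sketch below scales a small correlation matrix to whole numbers and blanks out every cell on or above the diagonal. It is plain Python and not the library's implementation: the helper names `inflate_corr` and `lower_triangle` are hypothetical, and both the rounding to integers and the hidden diagonal are assumptions.

```python
def inflate_corr(corr, inflate=True):
    """Scale coefficients by 100 and round them to whole numbers.

    Assumption: rounding to integers; negative coefficients land in -100..0.
    """
    if not inflate:
        return [row[:] for row in corr]
    return [[round(value * 100) for value in row] for row in corr]


def lower_triangle(matrix):
    """Keep entries strictly below the diagonal; replace the rest with None.

    Assumption: the diagonal itself is hidden.
    """
    return [
        [value if j < i else None for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


corr = [
    [1.0, 0.82, -0.35],
    [0.82, 1.0, 0.07],
    [-0.35, 0.07, 1.0],
]

triangle = lower_triangle(inflate_corr(corr))
for row in triangle:
    # Print blanks for masked cells so the triangle shape is visible.
    print(["" if value is None else value for value in row])
```

With inflation the annotations become short integers such as 82 instead of 0.82, which is what keeps a dense matrix readable.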

clustered_corrmap

mlens.visualization.clustered_corrmap(corr, cls, label_attr_name='labels_', figsize=(10, 8), annotate=False, inflate=False, linewidths=0.5, cbar_kws='default', show=True, title_fontsize=14, title_name='Clustered correlation heatmap', ax=None, **kwargs)

Function for plotting a clustered correlation heatmap.

Parameters:

corr (array-like of shape = [n_features, n_features]) – Input correlation matrix. Pass a pandas DataFrame for axis labels.
cls (instance) – cluster estimator with a fit method and cluster labels stored as an attribute as specified by the label_attr_name parameter.
label_attr_name (str) – name of attribute that contains cluster labels.
figsize (tuple (default = (10, 8))) – Size of figure.
annotate (bool (default = False)) – Whether to print the correlation coefficients.
inflate (bool (default = False)) – Whether to inflate correlation coefficients to a 0-100 scale. Avoids decimal points in the figure, which otherwise often looks cluttered.
linewidths (float (default = .5)) – width of the line separating each coordinate square.
cbar_kws (dict, str (default = 'default')) – Optional arguments to color bar.
title_name (str) – Figure title.
title_fontsize (int) – size of title.
show (bool (default = True)) – whether to print figure using matplotlib.pyplot.show.
ax (object, optional) – axis to attach plot to.
**kwargs (optional) – Other optional arguments to sns heatmap.
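
The `cls` and `label_attr_name` parameters describe a duck-typed contract: any object with a `fit` method that stores cluster labels on a named attribute. The sketch below illustrates that contract in plain Python. The `SignClusterer` class and the `reorder_by_cluster` helper are hypothetical, and fitting on the correlation matrix and then grouping features by label are assumptions about the general approach, not the library's code.

```python
class SignClusterer:
    """Hypothetical estimator: clusters features by the sign of their
    correlation with the first feature."""

    def fit(self, X):
        # Store labels on the attribute that label_attr_name will point to.
        self.labels_ = [0 if row[0] >= 0 else 1 for row in X]
        return self


def reorder_by_cluster(corr, cls, label_attr_name="labels_"):
    """Fit cls on corr and reorder rows and columns so that features
    sharing a cluster label sit next to each other."""
    cls.fit(corr)
    labels = getattr(cls, label_attr_name)
    # sorted() is stable, so features keep their order within a cluster.
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    reordered = [[corr[i][j] for j in order] for i in order]
    return order, reordered


corr = [
    [1.0, -0.8, 0.6],
    [-0.8, 1.0, -0.5],
    [0.6, -0.5, 1.0],
]

order, reordered = reorder_by_cluster(corr, SignClusterer())
print(order)
for row in reordered:
    print(row)
```

An estimator that stores its labels elsewhere only needs a different `label_attr_name`; the lookup through `getattr` is what makes the attribute name configurable.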

corr_X_y

mlens.visualization.corr_X_y(X, y, top=5, figsize=(10, 8), fontsize=12, hspace=None, no_ticks=True, label_rotation=0, show=True)

Function for plotting input feature correlations with output.

The output figure shows all correlations as well as the top positive and top negative correlations.

Parameters:

X (pandas DataFrame of shape = [n_samples, n_features]) – Input data.
y (pandas Series of shape = [n_samples,]) – Training labels.
top (int) – Number of features to show in the top positive and top negative graphs.
figsize (tuple (default = (10, 8))) – Size of figure.
hspace (float, optional) – Whitespace between the top row of figures and the bottom figure.
fontsize (int) – Font size of subplot titles.
no_ticks (bool (default = True)) – Whether to remove tick labels from the full correlation plot.
label_rotation (float (default = 0)) – Rotation of labels.
show (bool (default = True)) – Whether to print the figure using `matplotlib.pyplot.show`.

Returns:	ax – axis object.
Return type:	object
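
To make the "top positive and negative" selection concrete, here is a minimal plain-Python sketch that computes the Pearson correlation of each feature with the target and picks the `top` highest and `top` lowest coefficients. The helpers `pearson` and `top_correlations` are hypothetical, a dict of columns stands in for the DataFrame, and taking the two ends of a sorted ranking is an assumption about how the selection could be done.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Note: a constant column has zero variance and would divide by zero.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def top_correlations(X, y, top=5):
    """Return (ranked, top_pos, top_neg) lists of (feature, coefficient).

    X maps feature names to columns, standing in for a DataFrame.
    """
    ranked = sorted(
        ((name, pearson(column, y)) for name, column in X.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    top_pos = ranked[:top]        # highest coefficients first
    top_neg = ranked[::-1][:top]  # lowest coefficients first
    return ranked, top_pos, top_neg


X = {
    "a": [1, 2, 3, 4, 5],
    "b": [5, 4, 3, 2, 1],
    "c": [2, 1, 4, 3, 5],
    "d": [3, 5, 1, 4, 2],
}
y = [1, 2, 3, 4, 5]

ranked, top_pos, top_neg = top_correlations(X, y, top=2)
print([name for name, _ in top_pos])
print([name for name, _ in top_neg])
```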

pca_plot

mlens.visualization.pca_plot(X, estimator, y=None, cmap=None, figsize=(10, 8), title='Principal Components Analysis', title_font_size=14, show=True, ax=None, **kwargs)

Function to plot a PCA projection in 1, 2, or 3 dimensions.

Parameters:

X (array-like of shape = [n_samples, n_features]) – Matrix to perform PCA on.
estimator (instance) – PCA estimator. Assumes a Scikit-learn API.
y (array-like of shape = [n_samples, ] or None (default = None)) – Training labels to be used for color highlighting.
cmap (object, optional) – cmap object to pass to `matplotlib.pyplot.scatter`.
figsize (tuple (default = (10, 8))) – Size of figure.
title (str) – Figure title if shown.
title_font_size (int) – Title font size.
show (bool (default = True)) – Whether to print the figure using `matplotlib.pyplot.show`.
ax (object, optional) – Axis to attach the plot to.
**kwargs (optional) – Arguments to pass to `matplotlib.pyplot.scatter`.

Returns:	ax – if `ax` was specified, returns `ax` with plot attached.
Return type:	optional

pca_comp_plot

mlens.visualization.pca_comp_plot(X, y=None, figsize=(10, 8), title='Principal Components Comparison', title_font_size=14, show=True, **kwargs)

Function for comparing PCA projections.

The comparison covers 2 and 3 dimensions with linear and RBF kernels.

Parameters:

X (array-like of shape = [n_samples, n_features]) – Input matrix to be used for prediction.
y (array-like of shape = [n_samples, ] or None (default = None)) – Training labels to be used for color highlighting.
figsize (tuple (default = (10, 8))) – Size of figure.
title (str) – Figure title if shown.
title_font_size (int) – Title font size.
show (bool (default = True)) – Whether to print the figure using `matplotlib.pyplot.show`.
**kwargs (optional) – Optional arguments to pass to `mlens.visualization.pca_plot`.

Returns:	ax – axis object.
Return type:	object

exp_var_plot

mlens.visualization.exp_var_plot(X, estimator, figsize=(10, 8), buffer=0.01, set_labels=True, title='Explained variance ratio', title_font_size=14, show=True, ax=None, **kwargs)

Function to plot the explained variance using PCA.

Parameters:

X (array-like of shape = [n_samples, n_features]) – Input matrix to be used for prediction.
estimator (class) – PCA estimator class, not instantiated. Assumes a Scikit-learn API.
figsize (tuple (default = (10, 8))) – Size of figure.
buffer (float (default = 0.01)) – For creating a buffer around the edges of the graph. The buffer added is calculated as `num_components` * `buffer`, where `num_components` determines the length of the x-axis.
set_labels (bool) – Whether to set axis labels.
title (str) – Figure title if shown.
title_font_size (int) – Title font size.
show (bool (default = True)) – Whether to print the figure using `matplotlib.pyplot.show`.
ax (object, optional) – Axis to attach the plot to.
**kwargs (optional) – Optional arguments passed to the `matplotlib.pyplot.step` function.

Returns:	ax – if `ax` was specified, returns `ax` with plot attached.
Return type:	optional
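
The `buffer` arithmetic can be sketched in a few lines of plain Python. The helper `explained_variance_steps` is hypothetical. Accumulating the ratios into a cumulative curve and padding the x-axis symmetrically are assumptions made for illustration; only the `num_components` * `buffer` product comes from the parameter description above.

```python
def explained_variance_steps(ratios, buffer=0.01):
    """Return cumulative explained variance plus axis padding.

    ratios: per-component explained variance ratios, e.g. taken from a
    fitted PCA estimator.
    """
    cumulative = []
    total = 0.0
    for ratio in ratios:
        total += ratio
        cumulative.append(total)

    num_components = len(ratios)
    pad = num_components * buffer  # as described for the buffer parameter
    # Assumption: the padding is applied on both ends of the x-axis.
    x_limits = (-pad, num_components - 1 + pad)
    return cumulative, pad, x_limits


cumulative, pad, x_limits = explained_variance_steps([0.5, 0.3, 0.2])
print(cumulative, pad, x_limits)
```

Because the padding scales with the number of components, a plot with many components gets proportionally more room at its edges than one with only a few.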