slickml.visualization

Package Contents

Functions

plot_binary_classification_metrics(figsize, save_path, ...)

Visualizes binary classification metrics using plotting_dict_ attribute of BinaryClassificationMetrics.

plot_glmnet_coeff_path(figsize, linestyle, fontsize, ...)

Visualizes the GLMNet coefficients' paths.

plot_glmnet_cv_results(figsize, marker, markersize, ...)

Visualizes the GLMNet cross-validation results.

plot_regression_metrics(figsize, save_path, display_plot, ...)

Visualizes regression metrics using plotting_dict_ attribute of RegressionMetrics.

plot_shap_summary(shap_values, features, ...) → None

Visualizes the SHAP beeswarm plot as a summary of Shapley values.

plot_shap_waterfall(shap_values, features, *, figsize, bar_color, bar_thickness, ...)

Visualizes the Shapley values as a waterfall plot.

plot_xfs_cv_results(*, figsize, internalcvcolor, ...)

Visualizes the cross-validation results of XGBoostFeatureSelector.

plot_xfs_feature_frequency(freq, *, figsize, show_freq_pct, color, ...)

Visualizes the selected features frequency as a bar chart.

plot_xgb_cv_results(cv_results, *, figsize, linestyle, train_label, ...)

Visualizes the cv_results of XGBoostCVClassifier.

plot_xgb_feature_importance(feature_importance, *, figsize, color, marker, ...)

Visualizes the XGBoost feature importance as a bar chart.

slickml.visualization.plot_binary_classification_metrics(figsize: Optional[Tuple[float, float]] = (12, 12), save_path: Optional[str] = None, display_plot: Optional[bool] = False, return_fig: Optional[bool] = False, **kwargs: Dict[str, Any]) → Optional[matplotlib.figure.Figure]

Visualizes binary classification metrics using plotting_dict_ attribute of BinaryClassificationMetrics.

Parameters:
  • figsize (Tuple[float, float], optional) – Figure size, by default (12, 12)

  • save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None

  • display_plot (bool, optional) – Whether to show the plot, by default False

  • return_fig (bool, optional) – Whether to return figure object, by default False

  • **kwargs (Dict[str, Any]) – Key-value pairs of the binary classification metrics plot. The plotting_dict_ attribute of BinaryClassificationMetrics can be used

Returns:

Figure, optional

slickml.visualization.plot_glmnet_coeff_path(figsize: Optional[Tuple[Union[int, float], Union[int, float]]] = (8, 5), linestyle: Optional[str] = '-', fontsize: Optional[Union[int, float]] = 12, grid: Optional[bool] = True, legend: Optional[bool] = True, legendloc: Optional[Union[int, str]] = 'center', xlabel: Optional[str] = None, ylabel: Optional[str] = 'Coefficients', title: Optional[str] = None, bbox_to_anchor: Tuple[float, float] = (1.1, 0.5), yscale: Optional[str] = 'linear', save_path: Optional[str] = None, display_plot: Optional[bool] = True, return_fig: Optional[bool] = False, **kwargs: Dict[str, Any]) → Optional[matplotlib.figure.Figure]

Visualizes the GLMNet coefficients’ paths.

Parameters:
  • figsize (tuple, optional) – Figure size, by default (8, 5)

  • linestyle (str, optional) – Linestyle of paths, by default “-”

  • fontsize (Union[int, float], optional) – Fontsize of the title. The fontsizes of xlabel, ylabel, tick_params, and legend are resized with 0.85, 0.85, 0.75, and 0.85 fraction of title fontsize, respectively, by default 12

  • grid (bool, optional) – Whether to show (x,y) grid on the plot or not, by default True

  • legend (bool, optional) – Whether to show legend on the plot or not, by default True

  • legendloc (Union[int, str], optional) – Location of legend, by default “center”

  • xlabel (str, optional) – Xlabel of the plot, by default “-Log(Lambda)”

  • ylabel (str, optional) – Ylabel of the plot, by default “Coefficients”

  • title (str, optional) – Title of the plot, by default “Best {lambda_best} with {n} Features”

  • yscale (str, optional) – Scale for y-axis (coefficients). Possible options are "linear", "log", "symlog", "logit" [yscale], by default “linear”

  • bbox_to_anchor (Tuple[float, float], optional) – Relative coordinates for legend location outside of the plot, by default (1.1, 0.5)

  • save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None

  • display_plot (bool, optional) – Whether to show the plot, by default True

  • return_fig (bool, optional) – Whether to return figure object, by default False

  • **kwargs (Dict[str, Any]) – Key-value pairs of results. results_ attribute can be used

Returns:

Figure, optional
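
The fontsize description above implies a simple scaling rule for the other text elements. A minimal sketch of that arithmetic, assuming the fractions are applied directly to the title fontsize; the helper name derived_fontsizes is hypothetical and is not part of slickml:

```python
# Scale factors quoted in the fontsize description (the title itself is 1.0).
FONT_SCALES = {
    "title": 1.0,
    "xlabel": 0.85,
    "ylabel": 0.85,
    "tick_params": 0.75,
    "legend": 0.85,
}


def derived_fontsizes(fontsize=12):
    """Hypothetical helper: font size of each text element for a title fontsize."""
    return {name: round(fontsize * scale, 2) for name, scale in FONT_SCALES.items()}


sizes = derived_fontsizes(12)
print(sizes)
```

With the default fontsize of 12, the axis labels and legend come out at 10.2 and the tick labels at 9.0.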

slickml.visualization.plot_glmnet_cv_results(figsize: Optional[Tuple[Union[int, float], Union[int, float]]] = (8, 5), marker: Optional[str] = 'o', markersize: Optional[Union[int, float]] = 5, color: Optional[str] = 'red', errorbarcolor: Optional[str] = 'black', maxlambdacolor: Optional[str] = 'purple', bestlambdacolor: Optional[str] = 'navy', linestyle: Optional[str] = '--', fontsize: Optional[Union[int, float]] = 12, grid: Optional[bool] = True, legend: Optional[bool] = True, legendloc: Optional[Union[int, str]] = 'best', xlabel: Optional[str] = None, ylabel: Optional[str] = None, title: Optional[str] = None, save_path: Optional[str] = None, display_plot: Optional[bool] = True, return_fig: Optional[bool] = False, **kwargs: Dict[str, Any]) → Optional[matplotlib.figure.Figure]

Visualizes the GLMNet cross-validation results.

Notes

This plotting function can be used with the results_ attribute of either the GLMNetCVClassifier or the GLMNetCVRegressor class, passed as kwargs.

Parameters:
  • figsize (tuple, optional) – Figure size, by default (8, 5)

  • marker (str, optional) – Marker style of the metric to distinguish the error bars. More valid marker styles can be found at [markers-api], by default “o”

  • markersize (Union[int, float], optional) – Markersize, by default 5

  • color (str, optional) – Line and marker color, by default “red”

  • errorbarcolor (str, optional) – Error bar color, by default “black”

  • maxlambdacolor (str, optional) – Color of vertical line for lambda_max_, by default “purple”

  • bestlambdacolor (str, optional) – Color of vertical line for lambda_best_, by default “navy”

  • linestyle (str, optional) – Linestyle of vertical lambda lines, by default “--”

  • fontsize (Union[int, float], optional) – Fontsize of the title. The fontsizes of xlabel, ylabel, tick_params, and legend are resized with 0.85, 0.85, 0.75, and 0.85 fraction of title fontsize, respectively, by default 12

  • grid (bool, optional) – Whether to show (x,y) grid on the plot or not, by default True

  • legend (bool, optional) – Whether to show legend on the plot or not, by default True

  • legendloc (Union[int, str], optional) – Location of legend, by default “best”

  • xlabel (str, optional) – Xlabel of the plot, by default “-Log(Lambda)”

  • ylabel (str, optional) – Ylabel of the plot, by default “{n_splits}-Folds CV Mean {metric}”

  • title (str, optional) – Title of the plot, by default “Best {lambda_best} with {n} Features”

  • save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None

  • display_plot (bool, optional) – Whether to show the plot, by default True

  • return_fig (bool, optional) – Whether to return figure object, by default False

  • **kwargs (Dict[str, Any]) – Key-value pairs of results. results_ attribute can be used

Returns:

Figure, optional
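
The default axis labels above (“-Log(Lambda)” and “{n_splits}-Folds CV Mean {metric}”) suggest which quantities are drawn. A minimal sketch of how such points could be computed, assuming a natural logarithm and error bars equal to the standard deviation across folds; both are assumptions, and the scores are made-up illustration data rather than real results:

```python
import math
import statistics

# Hypothetical per-fold CV scores for three lambda values (made-up numbers).
cv_scores = {
    0.1: [0.80, 0.82, 0.78],
    0.01: [0.88, 0.90, 0.86],
    0.001: [0.85, 0.89, 0.84],
}

points = []
for lam, scores in cv_scores.items():
    points.append(
        {
            "x": -math.log(lam),  # assumed natural log for the "-Log(Lambda)" axis
            "mean": statistics.mean(scores),  # the CV mean of the metric
            "err": statistics.stdev(scores),  # assumed error-bar definition
        }
    )

# The point with the highest mean score across folds.
top = max(points, key=lambda p: p["mean"])
print(round(top["x"], 3), round(top["mean"], 3))
```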

slickml.visualization.plot_regression_metrics(figsize: Optional[Tuple[float, float]] = (12, 16), save_path: Optional[str] = None, display_plot: Optional[bool] = False, return_fig: Optional[bool] = False, **kwargs: Dict[str, Any]) → matplotlib.figure.Figure

Visualizes regression metrics using plotting_dict_ attribute of RegressionMetrics.

Parameters:
  • figsize (Tuple[float, float], optional) – Figure size, by default (12, 16)

  • save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None

  • display_plot (bool, optional) – Whether to show the plot, by default False

  • return_fig (bool, optional) – Whether to return figure object, by default False

  • **kwargs (Dict[str, Any]) – Key-value pairs of regression metrics plot

Returns:

Figure, optional
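
The save_path description above requires the path to include the image format. A minimal sketch of a pre-flight check for that, using only pathlib; the helper name image_format is hypothetical and is not part of slickml:

```python
from pathlib import Path


def image_format(save_path):
    """Hypothetical helper: return the image format implied by save_path, or None."""
    if save_path is None:
        return None  # documented default: nothing is saved
    suffix = Path(save_path).suffix.lower().lstrip(".")
    if not suffix:
        raise ValueError(f"save_path {save_path!r} has no image format extension")
    return suffix


print(image_format("myplot.png"))
print(image_format("../../myplot.pdf"))
```

Both full and relative paths work the same way, since only the final suffix is inspected.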

slickml.visualization.plot_shap_summary(shap_values: numpy.ndarray, features: Union[pandas.DataFrame, numpy.ndarray], *, plot_type: Optional[str] = 'dot', figsize: Optional[Union[str, Tuple[float, float]]] = 'auto', color: Optional[str] = None, cmap: Optional[matplotlib.colors.LinearSegmentedColormap] = None, max_display: Optional[int] = 20, feature_names: Optional[List[str]] = None, layered_violin_max_num_bins: Optional[int] = 10, title: Optional[str] = None, sort: Optional[bool] = True, color_bar: Optional[bool] = True, class_names: Optional[List[str]] = None, class_inds: Optional[List[int]] = None, color_bar_label: Optional[str] = 'Feature Value', save_path: Optional[str] = None, display_plot: Optional[bool] = True) → None

Visualizes the SHAP beeswarm plot as a summary of Shapley values.

Notes

This is a helper function to plot the SHAP summary plot based on all types of shap.Explainer, including shap.LinearExplainer for linear models, shap.TreeExplainer for tree-based models, and shap.DeepExplainer for deep neural network models. More details are available at [shap-api].

Parameters:
  • shap_values (np.ndarray) – Calculated SHAP values array. For single output explanations such as binary classification problems, this will be a matrix of SHAP values with a shape of (n_samples, n_features). Additionally, for multi-output explanations this would be a list of such matrices of SHAP values (List[np.ndarray])

  • features (Union[pd.DataFrame, np.ndarray]) – The feature matrix that was used to calculate the SHAP values. For a NumPy array, it is recommended to pass the feature_names list as well for better visualization results

  • plot_type (str, optional) – The type of summary plot where possible options are “bar”, “dot”, “violin”, “layered_violin”, and “compact_dot”. Recommendations are “dot” for single-output such as binary classifications, “bar” for multi-output problems, “compact_dot” for Shap interactions, by default “dot”

  • figsize (tuple, optional) – Figure size where “auto” is auto-scaled figure size based on the number of features that are being displayed. Passing a single float will cause each row to be that many inches high. Passing a pair of floats will scale the plot by that number of inches. If None is passed then the size of the current figure will be left unchanged, by default “auto”

  • color (str, optional) – Color of the plots. When plot_type="violin" or plot_type="layered_violin", the “RdBl” color map is used, while the color of the horizontal lines when plot_type="bar" is “#D0AAF3”, by default None

  • cmap (LinearSegmentedColormap, optional) – Color map when plot_type="violin" or plot_type="layered_violin", by default “RdBl”

  • max_display (int, optional) – Limit to show the number of features in the plot, by default 20

  • feature_names (List[str], optional) – List of feature names to pass. It should follow the order of features, by default None

  • layered_violin_max_num_bins (int, optional) – The number of bins for calculating the violin plots ranges and outliers, by default 10

  • title (str, optional) – Title of the plot, by default None

  • sort (bool, optional) – Flag to plot sorted SHAP values in descending order, by default True

  • color_bar (bool, optional) – Flag to show a color bar when plot_type="dot" or plot_type="violin", by default True

  • class_names (List[str], optional) – List of class names for multi-output problems, by default None

  • class_inds (List[int], optional) – List of class indices for multi-output problems, by default None

  • color_bar_label (str, optional) – Label for color bar, by default “Feature Value”

  • save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None

  • display_plot (bool, optional) – Whether to show the plot, by default True

Returns:

None
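
The shap_values, feature_names, sort, and max_display parameters above interact: features are ordered by importance and only the top ones are shown. A minimal sketch of that ordering, assuming features are ranked by mean absolute SHAP value; the helper top_features and the sample numbers are hypothetical, and plain nested lists stand in for the NumPy array to keep the example dependency-free:

```python
def top_features(shap_values, feature_names, max_display=20):
    """Hypothetical helper: rank features by mean absolute SHAP value, largest first."""
    n_features = len(feature_names)
    # feature_names must follow the column order of the SHAP matrix.
    if any(len(row) != n_features for row in shap_values):
        raise ValueError("each row of shap_values must have one value per feature")
    n_samples = len(shap_values)
    mean_abs = [
        sum(abs(row[j]) for row in shap_values) / n_samples
        for j in range(n_features)
    ]
    ranked = sorted(zip(feature_names, mean_abs), key=lambda pair: pair[1], reverse=True)
    return ranked[:max_display]


# Made-up SHAP matrix with shape (n_samples=2, n_features=3).
shap_values = [[0.2, -1.0, 0.05], [-0.4, 0.8, 0.0]]
feature_names = ["age", "income", "zip"]
ranking = top_features(shap_values, feature_names, max_display=2)
print([name for name, _ in ranking])
```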

slickml.visualization.plot_shap_waterfall(shap_values: numpy.ndarray, features: Union[pandas.DataFrame, numpy.ndarray], *, figsize: Optional[Tuple[float, float]] = (8, 5), bar_color: Optional[str] = '#B3C3F3', bar_thickness: Optional[Union[float, int]] = 0.5, line_color: Optional[str] = 'purple', marker: Optional[str] = 'o', markersize: Optional[Union[int, float]] = 7, markeredgecolor: Optional[str] = 'purple', markerfacecolor: Optional[str] = 'purple', markeredgewidth: Optional[Union[int, float]] = 1, max_display: Optional[int] = 20, title: Optional[str] = None, fontsize: Optional[Union[int, float]] = 12, save_path: Optional[str] = None, display_plot: Optional[bool] = True, return_fig: Optional[bool] = False) → Optional[matplotlib.figure.Figure]

Visualizes the Shapley values as a waterfall plot. This is a helper function to plot the SHAP waterfall plot based on all types of SHAP explainers, including tree, linear, and DNN.

Parameters:
  • shap_values (np.ndarray) – Calculated SHAP values array. For single output explanations such as binary classification problems, this will be a matrix of SHAP values with a shape of (n_samples, n_features). Additionally, for multi-output explanations this would be a list of such matrices of SHAP values (List[np.ndarray])

  • features (Union[pd.DataFrame, np.ndarray]) – The feature matrix that was used to calculate the SHAP values. For a NumPy array, it is recommended to pass the feature_names list as well for better visualization results

  • figsize (Tuple[float, float], optional) – Figure size, by default (8, 5)

  • bar_color (str, optional) – Color of the horizontal bar lines, by default “#B3C3F3”

  • bar_thickness (Union[float, int], optional) – Thickness (height) of the horizontal bar lines, by default 0.5

  • line_color (str, optional) – Color of the line plot, by default “purple”

  • marker (str, optional) – Marker style of the lollipops. More valid marker styles can be found at [markers-api], by default “o”

  • markersize (Union[int, float], optional) – Markersize, by default 7

  • markeredgecolor (str, optional) – Marker edge color, by default “purple”

  • markerfacecolor (str, optional) – Marker face color, by default “purple”

  • markeredgewidth (Union[int, float], optional) – Marker edge width, by default 1

  • max_display (int, optional) – Limit to show the number of features in the plot, by default 20

  • title (str, optional) – Title of the plot, by default None

  • fontsize (Union[int, float], optional) – Fontsize for xlabel and ylabel, and ticks parameters, by default 12

  • save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None

  • display_plot (bool, optional) – Whether to show the plot, by default True

  • return_fig (bool, optional) – Whether to return figure object, by default False

Returns:

Figure, optional

slickml.visualization.plot_xfs_cv_results(*, figsize: Optional[Tuple[Union[int, float], Union[int, float]]] = (10, 8), internalcvcolor: Optional[str] = '#4169E1', externalcvcolor: Optional[str] = '#8A2BE2', sharex: Optional[bool] = False, sharey: Optional[bool] = False, save_path: Optional[str] = None, display_plot: Optional[bool] = True, return_fig: Optional[bool] = False, **kwargs: Dict[str, Any]) → Optional[matplotlib.figure.Figure]

Visualizes the cross-validation results of XGBoostFeatureSelector.

Notes

It visualizes the internal and external cross-validation performance during the selection process. The internal results refer to the performance of the train/test folds during xgboost.cv(), which uses metrics to help find the best number of boosting rounds, while the external results refer to the performance of xgboost.train() based on watchlist using eval_metric. Additionally, sns.distplot was used previously, but it is now deprecated. More details are in [seaborn-distplot-deprecation].

Parameters:
  • figsize (tuple, optional) – Figure size, by default (10, 8)

  • internalcvcolor (str, optional) – Color of the histograms for internal cv results, by default “#4169E1”

  • externalcvcolor (str, optional) – Color of the histograms for external cv results, by default “#8A2BE2”

  • sharex (bool, optional) – Whether to share “X” axis for each column of subplots, by default False

  • sharey (bool, optional) – Whether to share “Y” axis for each row of subplots, by default False

  • save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None

  • display_plot (bool, optional) – Whether to show the plot, by default True

  • return_fig (bool, optional) – Whether to return figure object, by default False

  • kwargs (Dict[str, Any]) – Required plotting elements (plotting_cv_ attribute of XGBoostFeatureSelector)

See also

slickml.selection.XGBoostFeatureSelector

Returns:

Figure, optional

slickml.visualization.plot_xfs_feature_frequency(freq: pandas.DataFrame, *, figsize: Optional[Tuple[Union[int, float], Union[int, float]]] = (8, 4), show_freq_pct: Optional[bool] = True, color: Optional[str] = '#87CEEB', marker: Optional[str] = 'o', markersize: Optional[Union[int, float]] = 10, markeredgecolor: Optional[str] = '#1F77B4', markerfacecolor: Optional[str] = '#1F77B4', markeredgewidth: Optional[Union[int, float]] = 1, fontsize: Optional[Union[int, float]] = 12, save_path: Optional[str] = None, display_plot: Optional[bool] = True, return_fig: Optional[bool] = False) → Optional[matplotlib.figure.Figure]

Visualizes the selected features frequency as a bar chart.

Notes

This plotting function can be used along with the feature_frequency_ attribute of any frequency-based feature selection algorithm such as XGBoostFeatureSelector.

Parameters:
  • freq (pd.DataFrame) – Feature frequency (feature_frequency_ attribute)

  • figsize (tuple, optional) – Figure size, by default (8, 4)

  • show_freq_pct (bool, optional) – Whether to show the features frequency in percent, by default True

  • color (str, optional) – Color of the horizontal lines of lollipops, by default “#87CEEB”

  • marker (str, optional) – Marker style of the lollipops. More valid marker styles can be found at [markers-api], by default “o”

  • markersize (Union[int, float], optional) – Markersize, by default 10

  • markeredgecolor (str, optional) – Marker edge color, by default “#1F77B4”

  • markerfacecolor (str, optional) – Marker face color, by default “#1F77B4”

  • markeredgewidth (Union[int, float], optional) – Marker edge width, by default 1

  • fontsize (Union[int, float], optional) – Fontsize for xlabel and ylabel, and ticks parameters, by default 12

  • save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None

  • display_plot (bool, optional) – Whether to show the plot, by default True

  • return_fig (bool, optional) – Whether to return figure object, by default False

Returns:

Figure, optional
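
The show_freq_pct option above displays how often each feature was selected as a percentage. A minimal sketch of that calculation, assuming the percentage is the selection count divided by the number of selection runs; the run data and variable names are hypothetical, and the actual layout of feature_frequency_ is not shown in this page:

```python
from collections import Counter

# Hypothetical: the features selected in each of four selection runs.
selected_per_run = [
    ["f1", "f2"],
    ["f1", "f3"],
    ["f1", "f2"],
    ["f1"],
]

n_runs = len(selected_per_run)
counts = Counter(feature for run in selected_per_run for feature in run)

# (feature, count, percent) rows ordered from most to least frequently selected.
freq_rows = [
    (feature, count, 100.0 * count / n_runs)
    for feature, count in counts.most_common()
]
for row in freq_rows:
    print(row)
```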

slickml.visualization.plot_xgb_cv_results(cv_results: pandas.DataFrame, *, figsize: Optional[Tuple[Union[int, float], Union[int, float]]] = (8, 5), linestyle: Optional[str] = '--', train_label: Optional[str] = 'Train', test_label: Optional[str] = 'Test', train_color: Optional[str] = 'navy', train_std_color: Optional[str] = '#B3C3F3', test_color: Optional[str] = 'purple', test_std_color: Optional[str] = '#D0AAF3', save_path: Optional[str] = None, display_plot: Optional[bool] = False, return_fig: Optional[bool] = False) → Optional[matplotlib.figure.Figure]

Visualizes the cv_results of XGBoostCVClassifier.

Parameters:
  • cv_results (pd.DataFrame) – Cross-validation results

  • figsize (Tuple[Union[int, float], Union[int, float]], optional) – Figure size, by default (8, 5)

  • linestyle (str, optional) – Style of lines [linestyles-api], by default “--”

  • train_label (str, optional) – Label in the figure legend for the train line, by default “Train”

  • test_label (str, optional) – Label in the figure legend for the test line, by default “Test”

  • train_color (str, optional) – Color of the training line, by default “navy”

  • train_std_color (str, optional) – Edge color of the training std bars, by default “#B3C3F3”

  • test_color (str, optional) – Color of the testing line, by default “purple”

  • test_std_color (str, optional) – Edge color of the testing std bars, by default “#D0AAF3”

  • save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None

  • display_plot (bool, optional) – Whether to show the plot, by default False

  • return_fig (bool, optional) – Whether to return figure object, by default False

Returns:

Figure, optional
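
The train/test lines and std bars described above map onto the mean and std columns that xgboost.cv() produces for each boosting round. A minimal sketch of the band arithmetic, assuming the bars span mean ± std and assuming the metric is "auc"; a list of dicts with made-up numbers stands in for the pd.DataFrame to keep the example dependency-free, and the helper band is hypothetical:

```python
# Hypothetical rows, one per boosting round, using xgboost.cv column names.
cv_results = [
    {"train-auc-mean": 0.80, "train-auc-std": 0.01, "test-auc-mean": 0.75, "test-auc-std": 0.03},
    {"train-auc-mean": 0.88, "train-auc-std": 0.01, "test-auc-mean": 0.82, "test-auc-std": 0.02},
    {"train-auc-mean": 0.93, "train-auc-std": 0.01, "test-auc-mean": 0.81, "test-auc-std": 0.02},
]


def band(rows, split, metric="auc"):
    """Hypothetical helper: (lower, upper) = mean -/+ std for each boosting round."""
    mean_key = f"{split}-{metric}-mean"
    std_key = f"{split}-{metric}-std"
    return [(row[mean_key] - row[std_key], row[mean_key] + row[std_key]) for row in rows]


test_band = band(cv_results, "test")

# Index of the boosting round with the highest mean test score.
best_round = max(range(len(cv_results)), key=lambda i: cv_results[i]["test-auc-mean"])
print(best_round, test_band[best_round])
```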

slickml.visualization.plot_xgb_feature_importance(feature_importance: pandas.DataFrame, *, figsize: Optional[Tuple[Union[int, float], Union[int, float]]] = (8, 5), color: Optional[str] = '#87CEEB', marker: Optional[str] = 'o', markersize: Optional[Union[int, float]] = 10, markeredgecolor: Optional[str] = '#1F77B4', markerfacecolor: Optional[str] = '#1F77B4', markeredgewidth: Optional[Union[int, float]] = 1, fontsize: Optional[Union[int, float]] = 12, save_path: Optional[str] = None, display_plot: Optional[bool] = True, return_fig: Optional[bool] = False) → Optional[matplotlib.figure.Figure]

Visualizes the XGBoost feature importance as a bar chart.

Notes

This plotting function can be used along with feature_importance_ attribute of any of XGBoostClassifier, XGBoostCVClassifier, XGBoostRegressor, or XGBoostCVRegressor classes.

Parameters:
  • feature_importance (pd.DataFrame) – Feature importance (feature_importance_ attribute)

  • figsize (tuple, optional) – Figure size, by default (8, 5)

  • color (str, optional) – Color of the horizontal lines of lollipops, by default “#87CEEB”

  • marker (str, optional) – Marker style of the lollipops. More valid marker styles can be found at [markers-api], by default “o”

  • markersize (Union[int, float], optional) – Markersize, by default 10

  • markeredgecolor (str, optional) – Marker edge color, by default “#1F77B4”

  • markerfacecolor (str, optional) – Marker face color, by default “#1F77B4”

  • markeredgewidth (Union[int, float], optional) – Marker edge width, by default 1

  • fontsize (Union[int, float], optional) – Fontsize for xlabel and ylabel, and ticks parameters, by default 12

  • save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None

  • display_plot (bool, optional) – Whether to show the plot, by default True

  • return_fig (bool, optional) – Whether to return figure object, by default False

Returns:

Figure, optional