slickml.metrics¶
Classes¶
- BinaryClassificationMetrics: Calculates binary classification metrics in one place.
- RegressionMetrics: A wrapper to calculate all the regression metrics in one place.
Package Contents¶
- class slickml.metrics.BinaryClassificationMetrics[source]¶
BinaryClassificationMetrics calculates binary classification metrics in one place.
Binary metrics are computed based on three methods for calculating the thresholds used to binarize the prediction probabilities. The threshold computations include:
- Youden Index [youden-j-index]
- Maximizing Precision-Recall
- Maximizing Sensitivity-Specificity
- Parameters:
  - y_true (Union[List[int], np.ndarray, pd.Series]) – List of ground truth values such as [0, 1] for binary problems
  - y_pred_proba (Union[List[float], np.ndarray, pd.Series]) – List of predicted probabilities for the positive class (class=1) in binary problems, i.e. y_pred_proba[:, 1] in the scikit-learn API
  - threshold (float, optional) – Inclusive threshold value to binarize y_pred_proba to y_pred, where any value that satisfies y_pred_proba >= threshold will be set to class=1 (the positive class). Note that ">=" is used instead of ">", by default 0.5
  - average_method (str, optional) – Method to calculate the average of any metric. Possible values are "micro", "macro", "weighted", and "binary", by default "binary"
  - precision_digits (int, optional) – The number of precision digits to format the scores dataframe, by default 3
  - display_df (bool, optional) – Whether to display the formatted scores dataframe, by default True
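The inclusive thresholding described above can be sketched in a few lines of NumPy (a minimal illustration of the ">=" rule, not slickml's internal implementation):

```python
import numpy as np

# Hypothetical predicted probabilities for the positive class
y_pred_proba = np.array([0.95, 0.3, 0.1, 0.9])
threshold = 0.5

# Inclusive binarization: ">=" is used, so a probability exactly equal
# to the threshold is assigned to class=1 (the positive class)
y_pred = (y_pred_proba >= threshold).astype(int)
print(y_pred)  # [1 0 0 1]
```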
- plot(figsize=(12, 12), save_path=None, display_plot=False, return_fig=False)[source]¶
Plots classification metrics
- y_pred_¶
Predicted class based on the threshold. The threshold value inclusively binarizes y_pred_proba to y_pred, where any value that satisfies y_pred_proba >= threshold will be set to class=1 (the positive class). Note that ">=" is used instead of ">"
- Type:
np.ndarray
- accuracy_¶
Accuracy based on the initial threshold value, with a possible value between 0.0 and 1.0
- Type:
float
- balanced_accuracy_¶
Balanced accuracy based on the initial threshold value, considering the prevalence of the classes, with a possible value between 0.0 and 1.0
- Type:
float
- fpr_list_¶
List of calculated false-positive rates based on roc_thresholds_
- Type:
np.ndarray
- tpr_list_¶
List of calculated true-positive rates based on roc_thresholds_
- Type:
np.ndarray
- roc_thresholds_¶
List of threshold values used to calculate fpr_list_ and tpr_list_
- Type:
np.ndarray
- precision_list_¶
List of calculated precision values based on pr_thresholds_
- Type:
np.ndarray
- recall_list_¶
List of calculated recall values based on pr_thresholds_
- Type:
np.ndarray
- pr_thresholds_¶
List of precision-recall threshold values used to calculate precision_list_ and recall_list_
- Type:
np.ndarray
- precision_¶
Precision based on the threshold value with a possible value between 0.0 and 1.0
- Type:
float
- f1_¶
F1-score based on the threshold value (beta=1.0) with a possible value between 0.0 and 1.0
- Type:
float
- f2_¶
F2-score based on the threshold value (beta=2.0) with a possible value between 0.0 and 1.0
- Type:
float
- f05_¶
F(0.5)-score based on the threshold value (beta=0.5) with a possible value between 0.0 and 1.0
- Type:
float
- average_precision_¶
Average precision based on the threshold value and class prevalence with a possible value between 0.0 and 1.0
- Type:
float
- tn_¶
True negative counts based on the threshold value
- Type:
np.int64
- fp_¶
False positive counts based on the threshold value
- Type:
np.int64
- fn_¶
False negative counts based on the threshold value
- Type:
np.int64
- tp_¶
True positive counts based on the threshold value
- Type:
np.int64
- threat_score_¶
Threat score based on the threshold value with a possible value between 0.0 and 1.0
- Type:
float
- youden_index_¶
Index of the calculated Youden index threshold
- Type:
np.int64
- youden_threshold_¶
Threshold calculated based on the Youden Index with a possible value between 0.0 and 1.0
- Type:
float
- sens_spec_threshold_¶
Threshold calculated based on maximized sensitivity-specificity with a possible value between 0.0 and 1.0
- Type:
float
- prec_rec_threshold_¶
Threshold calculated based on maximized precision-recall with a possible value between 0.0 and 1.0
- Type:
float
- thresholds_dict_¶
Calculated thresholds based on different algorithms, including the Youden Index (youden_threshold_), maximizing the area under the sensitivity-specificity curve (sens_spec_threshold_), and maximizing the area under the precision-recall curve (prec_rec_threshold_)
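As a rough sketch of how such a threshold can be derived, Youden's J statistic picks the ROC point that maximizes tpr - fpr. The ROC arrays below are hypothetical toy values (in practice they would come from a ROC-curve routine such as scikit-learn's roc_curve); this is an illustration of the idea, not slickml's exact code:

```python
import numpy as np

# Hypothetical ROC curve points and their corresponding thresholds
fpr_list = np.array([0.0, 0.0, 0.5, 1.0])
tpr_list = np.array([0.0, 0.5, 1.0, 1.0])
roc_thresholds = np.array([1.95, 0.95, 0.45, 0.05])

# Youden's J statistic: J = sensitivity + specificity - 1 = tpr - fpr;
# the Youden threshold is the ROC threshold that maximizes J
youden_index = int(np.argmax(tpr_list - fpr_list))
youden_threshold = roc_thresholds[youden_index]
print(youden_index, youden_threshold)  # 1 0.95
```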
- metrics_df_¶
Pandas DataFrame of all calculated metrics with threshold set as index
- Type:
pd.DataFrame
References
[youden-j-index] Youden, W. J. (1950). Index for rating diagnostic tests. Cancer, 3(1), 32-35.
Examples
>>> from slickml.metrics import BinaryClassificationMetrics
>>> cm = BinaryClassificationMetrics(
...     y_true=[1, 1, 0, 0],
...     y_pred_proba=[0.95, 0.3, 0.1, 0.9]
... )
>>> f = cm.plot()
>>> m = cm.get_metrics()
- get_metrics(dtype: str | None = 'dataframe') pandas.DataFrame | Dict[str, float | None] [source]¶
Returns calculated metrics with desired dtypes.
Currently, available output types are “dataframe” and “dict”.
- Parameters:
dtype (str, optional) – Results dtype, by default “dataframe”
- Returns:
Union[pd.DataFrame, Dict[str, Optional[float]]]
- plot(figsize: Tuple[float, float] | None = (12, 12), save_path: str | None = None, display_plot: bool | None = False, return_fig: bool | None = False) matplotlib.figure.Figure | None [source]¶
Plots classification metrics.
- Parameters:
figsize (Tuple[float, float], optional) – Figure size, by default (12, 12)
save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None
display_plot (bool, optional) – Whether to show the plot, by default False
return_fig (bool, optional) – Whether to return figure object, by default False
- Returns:
Figure, optional
- class slickml.metrics.RegressionMetrics[source]¶
RegressionMetrics is a wrapper to calculate all the regression metrics in one place.
Notes
In case of multioutput regression, calculation methods can be chosen among "raw_values", "uniform_average", and "variance_weighted".
- Parameters:
  - y_true (Union[List[float], np.ndarray, pd.Series]) – Ground truth target (response) values
  - y_pred (Union[List[float], np.ndarray, pd.Series]) – Predicted target (response) values
  - multioutput (str, optional) – Method to calculate the metric for multioutput targets, where possible values are "raw_values", "uniform_average", and "variance_weighted". "raw_values" returns a full set of scores in case of multioutput input; "uniform_average" averages the scores of all outputs with uniform weight; "variance_weighted" averages the scores of all outputs, weighted by the variance of each individual output, by default "uniform_average"
  - precision_digits (int, optional) – The number of precision digits to format the scores dataframe, by default 3
  - display_df (bool, optional) – Whether to display the formatted scores dataframe, by default True
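To illustrate what the multioutput options mean, here is a NumPy sketch of per-output R^2 under each averaging scheme. The formulas are the standard definitions and the toy data is made up for illustration; slickml's own calculation may differ in details:

```python
import numpy as np

def r2_per_output(y_true, y_pred):
    """R^2 computed separately for each output column ("raw_values")."""
    ss_res = np.sum((y_true - y_pred) ** 2, axis=0)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2, axis=0)
    return 1.0 - ss_res / ss_tot

# Two-output toy data, rows are samples and columns are outputs
y_true = np.array([[0.5, 1.0], [-1.0, 1.0], [7.0, -6.0]])
y_pred = np.array([[0.0, 2.0], [-1.0, 2.0], [8.0, -5.0]])

raw = r2_per_output(y_true, y_pred)          # "raw_values": one score per output
uniform = raw.mean()                         # "uniform_average": plain mean
weights = np.var(y_true, axis=0)             # per-output variance of the targets
weighted = np.average(raw, weights=weights)  # "variance_weighted"
print(raw.round(3), round(uniform, 3))
```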
- plot(figsize=(12, 16), save_path=None, display_plot=False, return_fig=False)[source]¶
Plots regression metrics
- y_residual_¶
Residual values (errors) calculated as (y_true - y_pred)
- Type:
np.ndarray
- y_residual_normsq_¶
Square root of the absolute value of y_residual_
- Type:
np.ndarray
- r2_¶
\(R^2\) score (coefficient of determination) with a possible value between 0.0 and 1.0
- Type:
float
- deviation_¶
Arranged deviations to plot REC curve
- Type:
np.ndarray
- accuracy_¶
Calculated accuracy at each deviation to plot REC curve
- Type:
np.ndarray
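The REC curve pairs deviation_ and accuracy_: for each error tolerance on the x-axis, the y-axis shows the fraction of samples predicted within that tolerance [rec-curve]. A minimal sketch of the idea, reusing the docstring's example data (not slickml's exact construction):

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

abs_err = np.abs(y_true - y_pred)  # [0.5, 0.5, 0.0, 1.0]
deviation = np.unique(abs_err)     # sorted error tolerances for the x-axis
# REC "accuracy": fraction of samples whose error is within each tolerance
accuracy = np.array([(abs_err <= d).mean() for d in deviation])
print(deviation)  # [0.  0.5 1. ]
print(accuracy)   # [0.25 0.75 1.  ]
```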
- y_ratio_¶
Ratio of y_pred/y_true
- Type:
np.ndarray
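The residual-based attributes above reduce to simple NumPy expressions. A sketch using the example data from the docstring (the attribute names are reused here only as local variables):

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

y_residual = y_true - y_pred                     # errors
y_residual_normsq = np.sqrt(np.abs(y_residual))  # sqrt of absolute residuals
y_ratio = y_pred / y_true                        # prediction-to-truth ratio
print(y_residual)  # [ 0.5 -0.5  0.  -1. ]
```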
- metrics_dict_¶
Rounded metrics based on the number of precision digits
- metrics_df_¶
Pandas DataFrame of all calculated metrics
- Type:
pd.DataFrame
References
[Tahmassebi-et-al]Tahmassebi, A., Gandomi, A. H., & Meyer-Baese, A. (2018, July). A Pareto front based evolutionary model for airfoil self-noise prediction. In 2018 IEEE Congress on Evolutionary Computation (CEC) (pp. 1-8). IEEE. https://www.amirhessam.com/assets/pdf/projects/cec-airfoil2018.pdf
[rec-curve]Bi, J., & Bennett, K. P. (2003). Regression error characteristic curves. In Proceedings of the 20th international conference on machine learning (ICML-03) (pp. 43-50). https://www.aaai.org/Papers/ICML/2003/ICML03-009.pdf
Examples
>>> from slickml.metrics import RegressionMetrics
>>> rm = RegressionMetrics(
...     y_true=[3, -0.5, 2, 7],
...     y_pred=[2.5, 0.0, 2, 8]
... )
>>> m = rm.get_metrics()
>>> rm.plot()
- get_metrics(dtype: str | None = 'dataframe') pandas.DataFrame | Dict[str, float | None] [source]¶
Returns calculated metrics with desired dtypes.
Currently, available output types are "dataframe" and "dict".
- Parameters:
dtype (str, optional) – Results dtype, by default “dataframe”
- Returns:
Union[pd.DataFrame, Dict[str, Optional[float]]]
- plot(figsize: Tuple[float, float] | None = (12, 16), save_path: str | None = None, display_plot: bool | None = False, return_fig: bool | None = False) matplotlib.figure.Figure | None [source]¶
Plots regression metrics.
- Parameters:
figsize (Tuple[float, float], optional) – Figure size, by default (12, 16)
save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None
display_plot (bool, optional) – Whether to show the plot, by default False
return_fig (bool, optional) – Whether to return figure object, by default False
- Returns:
Figure, optional