slickml.optimization

Classes

XGBoostBayesianOptimizer

XGBoost Hyper-Parameters Tuner using Bayesian Optimization.

XGBoostHyperOptimizer

XGBoost Hyper-Parameters Tuner using HyperOpt Optimization.

Package Contents

class slickml.optimization.XGBoostBayesianOptimizer[source]

Bases: slickml.base.BaseXGBoostEstimator

XGBoost Hyper-Parameters Tuner using Bayesian Optimization.

This is a wrapper around the Bayesian Optimization algorithm [bayesian-optimization] to tune the hyper-parameters of XGBoost [xgboost-api] using the xgboost.cv() functionality with n-folds cross-validation iteratively. This feature can be used to find the optimized set of hyper-parameters for both classification and regression tasks.

Notes

The optimizer objective is always to maximize the target values. Therefore, when using a metric such as logloss, error, mae, rmse, or rmsle, the negative value of the metric will be maximized. One of the big pitfalls of the current implementation is the way hyper-parameters are sampled from params_bounds: integer-valued parameters cannot be sampled directly. Therefore, for some cases, e.g. max_depth, we must cast the sampled value to an integer, which is mathematically wrong (i.e. f(1.1) != f(1)).

Parameters:
  • n_iter (int, optional) – Number of iteration rounds for hyper-parameters tuning after initialization, by default 10

  • n_init_iter (int, optional) – Number of initial iterations to initialize the optimizer, by default 5

  • n_splits (int, optional) – Number of folds for cross-validation, by default 4

  • metrics (str, optional) – Metric to be tracked at cross-validation fitting time, depending on the task (classification vs regression), with possible values of “auc”, “aucpr”, “error”, “logloss”, “rmse”, “rmsle”, “mae”. Note this is different from eval_metric that needs to be passed to the params dict, by default “auc”

  • objective (str, optional) – Objective function depending on the task, whether it is regression or classification. Possible objectives are "binary:logistic" for classification, and "reg:logistic", "reg:squarederror", and "reg:squaredlogerror" for regression, by default “binary:logistic”

  • acquisition_criterion (str, optional) – Acquisition criterion method with possible options of "ei" (Expected Improvement), "ucb" (Upper Confidence Bounds), and "poi" (Probability Of Improvement), by default “ei”

  • params_bounds (Dict[str, Tuple[Union[int, float], Union[int, float]]], optional) – Set of hyper-parameters boundaries for Bayesian Optimization where all fields are required, by default {“max_depth” : (2, 7), “learning_rate” : (0, 1), “min_child_weight” : (1, 20), “colsample_bytree”: (0.1, 1.0), “subsample” : (0.1, 1), “gamma” : (0, 1), “reg_alpha” : (0, 1), “reg_lambda” : (0, 1)}

  • num_boost_round (int, optional) – Number of boosting rounds to fit a model, by default 200

  • early_stopping_rounds (int, optional) – The criterion to abort the xgboost.cv() phase early if the test metric is not improving, by default 20

  • random_state (int, optional) – Random seed number, by default 1367

  • stratified (bool, optional) – Whether to use stratification of the targets (only available for classification tasks) to run xgboost.cv() to find the best number of boosting rounds at each fold of each iteration, by default True

  • shuffle (bool, optional) – Whether to shuffle data to have the ability of building stratified folds in xgboost.cv(), by default True

  • sparse_matrix (bool, optional) – Whether to convert the input features to a sparse matrix in csr format. This would increase the speed of feature selection for relatively large/sparse datasets. Consequently, it would act as an un-optimized solution for a dense feature matrix. Additionally, this parameter cannot be used along with scale_mean=True, since standardizing the feature matrix to have a mean value of zero would turn the feature matrix into a dense matrix. Therefore, by default our API disallows this combination, by default False

  • scale_mean (bool, optional) – Whether to standardize the feature matrix to have a mean value of zero per feature (center the features before scaling). As laid out in sparse_matrix, scale_mean=False is required when using sparse_matrix=True, since centering the feature matrix would decrease the sparsity and, in practice, it would defeat the purpose of using a sparse matrix. The StandardScaler object can be accessed via cls.scaler_ if scale_mean or scale_std is used, unless it is None, by default False

  • scale_std (bool, optional) – Whether to scale the feature matrix to have unit variance (or equivalently, unit standard deviation) per feature. The StandardScaler object can be accessed via cls.scaler_ if scale_mean or scale_std is used, unless it is None, by default False

  • importance_type (str, optional) – Importance type of xgboost.train() with possible values "weight", "gain", "total_gain", "cover", "total_cover", by default “total_gain”

  • verbose (bool, optional) – Whether to show the Bayesian Optimization progress at each iteration, by default True

fit(X, y)[source]

Fits the Bayesian optimization algorithm to tune the hyper-parameters

get_optimizer()[source]

Returns the fitted Bayesian Optimization object

get_results()[source]

Returns all the optimization results including target and params

get_best_results()[source]

Return the results based on the best (tuned) hyper-parameters

get_best_params()[source]

Returns the tuned hyper-parameters as a dictionary

get_params_bounds()[source]

Returns the parameters boundaries

optimizer_

Returns the fitted Bayesian Optimization object

results_

Returns all the optimization results including target and params

best_params_

Returns the tuned hyper-parameters as a dictionary

best_results_

Return the results based on the best (tuned) hyper-parameters
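The following is a minimal usage sketch based only on the API documented above; the synthetic dataset built with sklearn.datasets.make_classification is an assumption for illustration and is not part of the library.

```python
import pandas as pd
from sklearn.datasets import make_classification  # illustrative data only
from slickml.optimization import XGBoostBayesianOptimizer

# Small synthetic binary-classification dataset (illustrative assumption)
X, y = make_classification(n_samples=1000, n_features=20, random_state=1367)
X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])

# Tune hyper-parameters with Bayesian Optimization over the default params_bounds
xbo = XGBoostBayesianOptimizer(
    n_iter=10,
    n_init_iter=5,
    metrics="auc",
    objective="binary:logistic",
    acquisition_criterion="ei",
)
xbo.fit(X, y)

# Inspect the tuned hyper-parameters and the full optimization history
best_params = xbo.get_best_params()    # dict of tuned hyper-parameters
results = xbo.get_results()            # pd.DataFrame of target and params per iteration
best_results = xbo.get_best_results()  # performance of the best set of hyper-parameters
print(best_params)
```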

References

__getstate__()
classmethod __init_subclass__(**kwargs)

Set the set_{method}_request methods.

This uses PEP-487 [1]_ to set the set_{method}_request methods. It looks for the information available in the set default values which are set using __metadata_request__* class attributes, or inferred from method signatures.

The __metadata_request__* class attributes are used when a method does not explicitly accept a metadata through its arguments or if the developer would like to specify a request value for those metadata which are different from the default None.

References

__post_init__() None[source]

Post instantiation validations and assignments.

__repr__(N_CHAR_MAX=700)

Return repr(self).

__setstate__(state)
__sklearn_clone__()
__slots__ = ()
acquisition_criterion: str | None = 'ei'
early_stopping_rounds: int | None = 20
fit(X: pandas.DataFrame | numpy.ndarray, y: List[float] | numpy.ndarray | pandas.Series) None[source]

Fits the main hyper-parameter tuning algorithm.

Notes

At each iteration, one set of parameters gets sampled from params_bounds and the evaluation occurs based on the cross-validation results. The Bayesian optimizer always maximizes the objectives. Therefore, we should be careful when using values of self.metrics that are supposed to be minimized, i.e. error. For those, we maximize (-1) * metric instead. One of the big pitfalls of the current implementation is the way hyper-parameters are sampled from params_bounds: integer-valued parameters cannot be sampled directly. Therefore, for some cases, e.g. max_depth, we must cast the sampled value to an integer, which is mathematically wrong (i.e. f(1.1) != f(1)).

Parameters:
  • X (Union[pd.DataFrame, np.ndarray]) – Input data for training (features)

  • y (Union[List[float], np.ndarray, pd.Series]) – Input ground truth for training (targets)

Returns:

None
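To make the maximize-only convention and the integer-casting pitfall from the Notes concrete, here is a hypothetical objective sketch, not the library's actual implementation: the helper name _xgb_cv_objective and the fixed settings are assumptions, while the xgboost.cv() call itself uses standard xgboost arguments.

```python
import xgboost as xgb


def _xgb_cv_objective(max_depth, learning_rate, min_child_weight,
                      colsample_bytree, subsample, gamma, reg_alpha, reg_lambda,
                      dtrain=None, metrics="auc"):
    """Hypothetical objective illustrating the maximize-only convention."""
    params = {
        "max_depth": int(max_depth),  # cast the sampled float: f(1.1) is treated as f(1)
        "learning_rate": learning_rate,
        "min_child_weight": min_child_weight,
        "colsample_bytree": colsample_bytree,
        "subsample": subsample,
        "gamma": gamma,
        "reg_alpha": reg_alpha,
        "reg_lambda": reg_lambda,
        "objective": "binary:logistic",
        "eval_metric": metrics,
    }
    cv_results = xgb.cv(
        params=params,
        dtrain=dtrain,
        num_boost_round=200,
        nfold=4,
        stratified=True,
        metrics=metrics,
        early_stopping_rounds=20,
        seed=1367,
        shuffle=True,
    )
    score = cv_results[f"test-{metrics}-mean"].iloc[-1]
    # "auc"/"aucpr" are maximized as-is; losses such as "logloss" are negated
    return score if metrics in ("auc", "aucpr") else -score
```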

get_best_params() Dict[str, str | float | int][source]

Returns the tuned results of the optimization as the best set of hyper-parameters.

Returns:

Dict[str, Union[str, float, int]]

get_best_results() pandas.DataFrame[source]

Returns the performance of the best (tuned) set of hyper-parameters.

Returns:

pd.DataFrame

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing (MetadataRequest) – A MetadataRequest encapsulating routing information.

get_optimizer() bayes_opt.BayesianOptimization[source]

Return the Bayesian Optimization object.

Returns:

BayesianOptimization
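As a brief follow-up, the returned bayes_opt.BayesianOptimization object can be inspected directly; the sketch below assumes a fitted optimizer named xbo (see the usage sketch above) and relies on the standard .max and .res attributes of the bayesian-optimization package.

```python
# Assuming `xbo` is a fitted XGBoostBayesianOptimizer
optimizer = xbo.get_optimizer()

print(optimizer.max)       # best observation: {"target": ..., "params": {...}}
for res in optimizer.res:  # one dict per evaluated point
    print(res["target"], res["params"])
```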

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict) – Parameter names mapped to their values.

get_params_bounds() Dict[str, Tuple[int | float, int | float]] | None[source]

Returns the hyper-parameters boundaries for the tuning process.

Returns:

Dict[str, Tuple[Union[int, float], Union[int, float]]]

get_results() pandas.DataFrame[source]

Returns the hyper-parameter optimization results.

Returns:

pd.DataFrame

importance_type: str | None = 'total_gain'
metrics: str | None = 'auc'
n_init_iter: int | None = 5
n_iter: int | None = 10
n_splits: int | None = 4
num_boost_round: int | None = 200
objective: str | None = 'binary:logistic'
params: Dict[str, str | float | int] | None = None
params_bounds: Dict[str, Tuple[int | float, int | float]] | None = None
random_state: int | None = 1367
scale_mean: bool | None = False
scale_std: bool | None = False
set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self (estimator instance) – Estimator instance.

shuffle: bool | None = True
sparse_matrix: bool | None = False
stratified: bool | None = True
verbose: bool | None = True
class slickml.optimization.XGBoostHyperOptimizer[source]

Bases: slickml.base.BaseXGBoostEstimator

XGBoost Hyper-Parameters Tuner using HyperOpt Optimization.

This is a wrapper around HyperOpt [hyperopt], a Python library for serial and parallel optimization over search spaces, which may include real-valued, discrete, and conditional dimensions, to tune the hyper-parameters of XGBoost [xgboost-api] using the xgboost.cv() functionality with n-folds cross-validation iteratively. This feature can be used to find the optimized set of hyper-parameters for both classification and regression tasks.

Notes

The optimizer objective is always to minimize the target values. Therefore, when using a metric such as auc or aucpr, the negative value of the metric will be minimized.

Parameters:
  • n_iter (int, optional) – Maximum number of iteration rounds for hyper-parameters tuning before convergence, by default 100

  • n_splits (int, optional) – Number of folds for cross-validation, by default 4

  • metrics (str, optional) – Metric to be tracked at cross-validation fitting time, depending on the task (classification vs regression), with possible values of “auc”, “aucpr”, “error”, “logloss”, “rmse”, “rmsle”, “mae”. Note this is different from eval_metric that needs to be passed to the params dict, by default “auc”

  • objective (str, optional) – Objective function depending on the task, whether it is regression or classification. Possible objectives are "binary:logistic" for classification, and "reg:logistic", "reg:squarederror", and "reg:squaredlogerror" for regression, by default “binary:logistic”

  • params_bounds (Dict[str, Any], optional) – Set of hyper-parameters boundaries for HyperOpt using hyperopt.hp and hyperopt.pyll_utils, by default {“max_depth” : (2, 7), “learning_rate” : (0, 1), “min_child_weight” : (1, 20), “colsample_bytree”: (0.1, 1.0), “subsample” : (0.1, 1), “gamma” : (0, 1), “reg_alpha” : (0, 1), “reg_lambda” : (0, 1)}

  • num_boost_round (int, optional) – Number of boosting rounds to fit a model, by default 200

  • early_stopping_rounds (int, optional) – The criterion to abort the xgboost.cv() phase early if the test metric is not improving, by default 20

  • random_state (int, optional) – Random seed number, by default 1367

  • stratified (bool, optional) – Whether to use stratification of the targets (only available for classification tasks) to run xgboost.cv() to find the best number of boosting rounds at each fold of each iteration, by default True

  • shuffle (bool, optional) – Whether to shuffle data to have the ability of building stratified folds in xgboost.cv(), by default True

  • sparse_matrix (bool, optional) – Whether to convert the input features to a sparse matrix in csr format. This would increase the speed of feature selection for relatively large/sparse datasets. Consequently, it would act as an un-optimized solution for a dense feature matrix. Additionally, this parameter cannot be used along with scale_mean=True, since standardizing the feature matrix to have a mean value of zero would turn the feature matrix into a dense matrix. Therefore, by default our API disallows this combination, by default False

  • scale_mean (bool, optional) – Whether to standardize the feature matrix to have a mean value of zero per feature (center the features before scaling). As laid out in sparse_matrix, scale_mean=False is required when using sparse_matrix=True, since centering the feature matrix would decrease the sparsity and, in practice, it would defeat the purpose of using a sparse matrix. The StandardScaler object can be accessed via cls.scaler_ if scale_mean or scale_std is used, unless it is None, by default False

  • scale_std (bool, optional) – Whether to scale the feature matrix to have unit variance (or equivalently, unit standard deviation) per feature. The StandardScaler object can be accessed via cls.scaler_ if scale_mean or scale_std is used, unless it is None, by default False

  • importance_type (str, optional) – Importance type of xgboost.train() with possible values "weight", "gain", "total_gain", "cover", "total_cover", by default “total_gain”

  • verbose (bool, optional) – Whether to show the HyperOpt Optimization progress at each iteration, by default True

fit(X, y)[source]

Fits the HyperOpt optimization algorithm to tune the hyper-parameters

get_best_params()[source]

Returns the tuned hyper-parameters as a dictionary

get_results()[source]

Returns all the optimization trials

get_trials()[source]

Return the trials object

get_params_bounds()[source]

Returns the parameters boundaries

best_params_

Returns the tuned hyper-parameters as a dictionary

results_

Returns all the optimization trials as results
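The following is a minimal usage sketch based only on the API documented above; the synthetic dataset and the hyperopt search space passed as params_bounds are illustrative assumptions (hp.quniform keeps integer-valued parameters such as max_depth on an integer grid).

```python
import pandas as pd
from hyperopt import hp
from sklearn.datasets import make_classification  # illustrative data only
from slickml.optimization import XGBoostHyperOptimizer

# Small synthetic binary-classification dataset (illustrative assumption)
X, y = make_classification(n_samples=1000, n_features=20, random_state=1367)
X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])

# Illustrative hyperopt search space
params_bounds = {
    "max_depth": hp.quniform("max_depth", 2, 7, 1),
    "learning_rate": hp.uniform("learning_rate", 0.01, 0.3),
    "min_child_weight": hp.quniform("min_child_weight", 1, 20, 1),
    "colsample_bytree": hp.uniform("colsample_bytree", 0.1, 1.0),
    "subsample": hp.uniform("subsample", 0.1, 1.0),
    "gamma": hp.uniform("gamma", 0.0, 1.0),
    "reg_alpha": hp.uniform("reg_alpha", 0.0, 1.0),
    "reg_lambda": hp.uniform("reg_lambda", 0.0, 1.0),
}

xho = XGBoostHyperOptimizer(
    n_iter=100,
    metrics="auc",
    objective="binary:logistic",
    params_bounds=params_bounds,
)
xho.fit(X, y)

print(xho.get_best_params())  # tuned hyper-parameters as a dict
trials = xho.get_trials()     # hyperopt.Trials with the full optimization history
```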

References

__getstate__()
classmethod __init_subclass__(**kwargs)

Set the set_{method}_request methods.

This uses PEP-487 [1]_ to set the set_{method}_request methods. It looks for the information available in the set default values which are set using __metadata_request__* class attributes, or inferred from method signatures.

The __metadata_request__* class attributes are used when a method does not explicitly accept a metadata through its arguments or if the developer would like to specify a request value for those metadata which are different from the default None.

References

__post_init__() None[source]

Post instantiation validations and assignments.

__repr__(N_CHAR_MAX=700)

Return repr(self).

__setstate__(state)
__sklearn_clone__()
__slots__ = ()
early_stopping_rounds: int | None = 20
fit(X: pandas.DataFrame | numpy.ndarray, y: List[float] | numpy.ndarray | pandas.Series) None[source]

Fits the main hyper-parameter tuning algorithm.

Notes

At each iteration, one set of parameters gets sampled from params_bounds and the evaluation occurs based on the cross-validation results. The HyperOpt optimizer always minimizes the objectives. Therefore, we should be careful when using values of self.metrics that are supposed to be maximized, i.e. auc. For those, we minimize (-1) * metric instead.

Parameters:
  • X (Union[pd.DataFrame, np.ndarray]) – Input data for training (features)

  • y (Union[List[float], np.ndarray, pd.Series]) – Input ground truth for training (targets)

Returns:

None
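To make the minimize-only convention from the Notes concrete, here is a hypothetical sketch using plain hyperopt, not the library's actual implementation: the objective function and its toy "AUC" stand-in are assumptions, while fmin, tpe.suggest, and Trials are standard hyperopt API.

```python
from hyperopt import STATUS_OK, Trials, fmin, hp, tpe


def _objective(space):
    # Stand-in for an xgboost.cv() evaluation; in practice this would be the
    # mean test AUC obtained with the sampled hyper-parameters in `space`.
    auc = 1.0 - abs(space["learning_rate"] - 0.1)
    # A metric to be maximized (auc) is negated so that fmin() can minimize it
    return {"loss": -auc, "status": STATUS_OK}


trials = Trials()
best = fmin(
    fn=_objective,
    space={"learning_rate": hp.uniform("learning_rate", 0.01, 0.3)},
    algo=tpe.suggest,
    max_evals=50,
    trials=trials,
)
print(best)            # hyper-parameters of the smallest loss (largest AUC)
print(trials.results)  # per-trial result dictionaries (analogous to get_results())
```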

get_best_params() Dict[str, str | float | int][source]

Returns the tuned results of the optimization as the best set of hyper-parameters.

Returns:

Dict[str, Union[str, float, int]]

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing (MetadataRequest) – A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict) – Parameter names mapped to their values.

get_params_bounds() Dict[str, Any] | None[source]

Returns the hyper-parameters boundaries for the tuning process.

Returns:

Dict[str, Any]

get_results() List[Dict[str, Any]][source]

Return all trials results.

Returns:

List[Dict[str, Any]]

get_trials() hyperopt.Trials[source]

Returns the Trials object passed to the optimizer.

Returns:

hyperopt.Trials

importance_type: str | None = 'total_gain'
metrics: str | None = 'auc'
n_iter: int | None = 100
n_splits: int | None = 4
num_boost_round: int | None = 200
objective: str | None = 'binary:logistic'
params: Dict[str, str | float | int] | None = None
params_bounds: Dict[str, Any] | None = None
random_state: int | None = 1367
scale_mean: bool | None = False
scale_std: bool | None = False
set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self (estimator instance) – Estimator instance.

shuffle: bool | None = True
sparse_matrix: bool | None = False
stratified: bool | None = True
verbose: bool | None = True