slickml.optimization#

Package Contents#

Classes#

XGBoostBayesianOptimizer

XGBoost Hyper-Parameters Tuner using Bayesian Optimization.

XGBoostHyperOptimizer

XGBoost Hyper-Parameters Tuner using HyperOpt Optimization.

class slickml.optimization.XGBoostBayesianOptimizer[source]#

Bases: slickml.base.BaseXGBoostEstimator

XGBoost Hyper-Parameters Tuner using Bayesian Optimization.

This is a wrapper around the Bayesian Optimization algorithm [bayesian-optimization] to tune the hyper-parameters of XGBoost [xgboost-api] using the xgboost.cv() functionality with n-fold cross-validation iteratively. This feature can be used to find the optimized set of hyper-parameters for both classification and regression tasks.

Notes

The optimizer objective is always to maximize the target values. Therefore, when using a metric such as logloss, error, mae, rmse, or rmsle, the negative value of the metric is maximized. One of the big pitfalls of the current implementation is the way hyper-parameters are sampled from params_bounds: the sampler only proposes real values, even for hyper-parameters that must be integers. Therefore, for some cases, i.e. max_depth, the sampled value must be cast to an integer, which is mathematically wrong (i.e. f(1.1) != f(1)).

Parameters:
  • n_iter (int, optional) – Number of iteration rounds for hyper-parameters tuning after initialization, by default 10

  • n_init_iter (int, optional) – Number of initial iterations to initialize the optimizer, by default 5

  • n_splits (int, optional) – Number of folds for cross-validation, by default 4

  • metrics (str, optional) – Metrics to be tracked at cross-validation fitting time, depending on the task (classification vs regression), with possible values of “auc”, “aucpr”, “error”, “logloss”, “rmse”, “rmsle”, “mae”. Note this is different from the eval_metric that needs to be passed to the params dict, by default “auc”

  • objective (str, optional) – Objective function depending on the task, whether it is regression or classification. Possible objectives for classification are "binary:logistic" and for regression "reg:logistic", "reg:squarederror", and "reg:squaredlogerror", by default “binary:logistic”

  • acquisition_criterion (str, optional) – Acquisition criterion method with possible options of "ei" (Expected Improvement), "ucb" (Upper Confidence Bounds), and "poi" (Probability Of Improvement), by default “ei”

  • params_bounds (Dict[str, Tuple[Union[int, float], Union[int, float]]], optional) – Set of hyper-parameters boundaries for Bayesian Optimization where all fields are required, by default {“max_depth” : (2, 7), “learning_rate” : (0, 1), “min_child_weight” : (1, 20), “colsample_bytree”: (0.1, 1.0), “subsample” : (0.1, 1), “gamma” : (0, 1), “reg_alpha” : (0, 1), “reg_lambda” : (0, 1)}

  • num_boost_round (int, optional) – Number of boosting rounds to fit a model, by default 200

  • early_stopping_rounds (int, optional) – The criterion to abort the xgboost.cv() phase early if the test metric is not improving, by default 20

  • random_state (int, optional) – Random seed number, by default 1367

  • stratified (bool, optional) – Whether to use stratification of the targets (only available for classification tasks) when running xgboost.cv() to find the best number of boosting rounds at each fold of each iteration, by default True

  • shuffle (bool, optional) – Whether to shuffle data to have the ability of building stratified folds in xgboost.cv(), by default True

  • sparse_matrix (bool, optional) – Whether to convert the input features to a sparse matrix in csr format. This would increase the speed of feature selection for relatively large/sparse datasets; consequently, it would act as an un-optimized solution for a dense feature matrix. Additionally, this parameter cannot be used along with scale_mean=True, since standardizing the feature matrix to have a mean value of zero would turn the feature matrix into a dense matrix. Therefore, by default our API bans this combination, by default False

  • scale_mean (bool, optional) – Whether to standardize the feature matrix to have a mean value of zero per feature (center the features before scaling). As laid out in sparse_matrix, scale_mean must be False when using sparse_matrix=True, since centering the feature matrix would decrease its sparsity and, in practice, it does not make sense to combine the two. The StandardScaler object can be accessed via cls.scaler_ if scale_mean or scale_std is used; otherwise it is None, by default False

  • scale_std (bool, optional) – Whether to scale the feature matrix to have unit variance (or equivalently, unit standard deviation) per feature. The StandardScaler object can be accessed via cls.scaler_ if scale_mean or scale_std is used; otherwise it is None, by default False

  • importance_type (str, optional) – Importance type of xgboost.train() with possible values "weight", "gain", "total_gain", "cover", "total_cover", by default “total_gain”

  • verbose (bool, optional) – Whether to show the Bayesian Optimization progress at each iteration, by default True
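
A minimal usage sketch based on the parameters and methods documented on this page; the scikit-learn breast-cancer dataset is only an illustrative stand-in for your own features and targets.

from sklearn.datasets import load_breast_cancer
from slickml.optimization import XGBoostBayesianOptimizer

# Illustrative binary-classification data; replace with your own X, y
X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Tuner with a handful of the documented options spelled out explicitly
xbo = XGBoostBayesianOptimizer(
    n_iter=10,
    n_init_iter=5,
    n_splits=4,
    metrics="auc",
    objective="binary:logistic",
    acquisition_criterion="ei",
    random_state=1367,
)
xbo.fit(X, y)

best_params = xbo.get_best_params()    # tuned hyper-parameters as a dictionary
best_results = xbo.get_best_results()  # cross-validation results for the tuned set
history = xbo.get_results()            # full optimization history (targets and params)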

fit(X, y)[source]#

Fits the Bayesian optimization algorithm to tune the hyper-parameters

get_optimizer()[source]#

Returns the fitted Bayesian Optimization object

get_results()[source]#

Returns all the optimization results including target and params

get_best_results()[source]#

Returns the results based on the best (tuned) hyper-parameters

get_best_params()[source]#

Returns the tuned hyper-parameters as a dictionary

get_params_bounds()[source]#

Returns the parameters boundaries

optimizer_#

Returns the fitted Bayesian Optimization object

results_#

Returns all the optimization results including target and params

best_params_#

Returns the tuned hyper-parameters as a dictionary

best_results_#

Returns the results based on the best (tuned) hyper-parameters

References

__slots__ = []#
acquisition_criterion :Optional[str] = ei#
early_stopping_rounds :Optional[int] = 20#
importance_type :Optional[str] = total_gain#
metrics :Optional[str] = auc#
n_init_iter :Optional[int] = 5#
n_iter :Optional[int] = 10#
n_splits :Optional[int] = 4#
num_boost_round :Optional[int] = 200#
objective :Optional[str] = binary:logistic#
params :Optional[Dict[str, Union[str, float, int]]]#
params_bounds :Optional[Dict[str, Tuple[Union[int, float], Union[int, float]]]]#
random_state :Optional[int] = 1367#
scale_mean :Optional[bool] = False#
scale_std :Optional[bool] = False#
shuffle :Optional[bool] = True#
sparse_matrix :Optional[bool] = False#
stratified :Optional[bool] = True#
verbose :Optional[bool] = True#
__getstate__()#
__post_init__() None[source]#

Post instantiation validations and assignments.

__repr__(N_CHAR_MAX=700)#

Return repr(self).

__setstate__(state)#
fit(X: Union[pandas.DataFrame, numpy.ndarray], y: Union[List[float], numpy.ndarray, pandas.Series]) None[source]#

Fits the main hyper-parameter tuning algorithm.

Notes

At each iteration, one set of parameters gets sampled from params_bounds and the evaluation occurs based on the cross-validation results. The Bayesian optimizer always maximizes the objective. Therefore, we should be careful with values of self.metrics that are supposed to be minimized, i.e. error; for those, (-1) * metric is maximized instead. One of the big pitfalls of the current implementation is the way hyper-parameters are sampled from params_bounds: the sampler only proposes real values, even for hyper-parameters that must be integers. Therefore, for some cases, i.e. max_depth, the sampled value must be cast to an integer, which is mathematically wrong (i.e. f(1.1) != f(1)).

Parameters:
  • X (Union[pd.DataFrame, np.ndarray]) – Input data for training (features)

  • y (Union[List[float], np.ndarray, pd.Series]) – Input ground truth for training (targets)

Returns:

None
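
The following sketch is not the library's internal code; it only illustrates, under the assumptions noted in the comments, the sign and casting conventions described in the Notes above: a metric that should be minimized is negated so a maximizing optimizer can use it, and float samples for integer hyper-parameters such as max_depth are cast.

import xgboost as xgb
from sklearn.datasets import load_breast_cancer

# Illustrative data; the real implementation builds its DMatrix from the X, y passed to fit()
X, y = load_breast_cancer(return_X_y=True)
dtrain = xgb.DMatrix(X, label=y)

def evaluate(max_depth, learning_rate, subsample):
    # Hypothetical per-iteration evaluation: the sampler proposes floats, so
    # integer-valued hyper-parameters such as max_depth must be cast
    params = {
        "max_depth": int(max_depth),  # cast: f(1.1) is treated as f(1)
        "learning_rate": learning_rate,
        "subsample": subsample,
        "objective": "binary:logistic",
    }
    cv_results = xgb.cv(
        params=params,
        dtrain=dtrain,
        num_boost_round=200,
        nfold=4,
        stratified=True,
        metrics="logloss",
        early_stopping_rounds=20,
        seed=1367,
    )
    # "logloss" should be minimized, so its negative is what the optimizer maximizes
    return -cv_results["test-logloss-mean"].min()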

get_best_params() Dict[str, Union[str, float, int]][source]#

Returns the tuned results of the optimization as the best set of hyper-parameters.

Returns:

Dict[str, Union[str, float, int]]

get_best_results() pandas.DataFrame[source]#

Returns the performance of the best (tuned) set of hyper-parameters.

Returns:

pd.DataFrame

get_optimizer() bayes_opt.BayesianOptimization[source]#

Returns the fitted Bayesian Optimization object.

Returns:

BayesianOptimization
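
If the underlying bayes_opt object is needed directly, a short sketch (building on the usage example earlier on this page, so xbo is assumed to be an already-fitted XGBoostBayesianOptimizer):

optimizer = xbo.get_optimizer()

# bayes_opt.BayesianOptimization keeps the best observation and the full history
print(optimizer.max)  # e.g. {"target": ..., "params": {...}}
print(optimizer.res)  # list of per-iteration {"target": ..., "params": {...}} records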

get_params(deep=True)#

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict) – Parameter names mapped to their values.

get_params_bounds() Optional[Dict[str, Tuple[Union[int, float], Union[int, float]]]][source]#

Returns the hyper-parameters boundaries for the tuning process.

Returns:

Dict[str, Tuple[Union[int, float], Union[int, float]]]

get_results() pandas.DataFrame[source]#

Returns the hyper-parameter optimization results.

Returns:

pd.DataFrame

set_params(**params)#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self (estimator instance) – Estimator instance.

class slickml.optimization.XGBoostHyperOptimizer[source]#

Bases: slickml.base.BaseXGBoostEstimator

XGBoost Hyper-Parameters Tuner using HyperOpt Optimization.

This is a wrapper around HyperOpt [hyperopt], a Python library for serial and parallel optimization over search spaces that may include real-valued, discrete, and conditional dimensions, to tune the hyper-parameters of XGBoost [xgboost-api] using the xgboost.cv() functionality with n-fold cross-validation iteratively. This feature can be used to find the optimized set of hyper-parameters for both classification and regression tasks.

Notes

The optimizer objective is always to minimize the target values. Therefore, when using a metric such as auc or aucpr, the negative value of the metric is minimized.

Parameters:
  • n_iter (int, optional) – Maximum number of iteration rounds for hyper-parameters tuning before convergence, by default 100

  • n_splits (int, optional) – Number of folds for cross-validation, by default 4

  • metrics (str, optional) – Metrics to be tracked at cross-validation fitting time, depending on the task (classification vs regression), with possible values of “auc”, “aucpr”, “error”, “logloss”, “rmse”, “rmsle”, “mae”. Note this is different from the eval_metric that needs to be passed to the params dict, by default “auc”

  • objective (str, optional) – Objective function depending on the task, whether it is regression or classification. Possible objectives for classification are "binary:logistic" and for regression "reg:logistic", "reg:squarederror", and "reg:squaredlogerror", by default “binary:logistic”

  • params_bounds (Dict[str, Any], optional) – Set of hyper-parameters boundaries for HyperOpt using hyperopt.hp and hyperopt.pyll_utils (see the usage sketch after this list), by default {“max_depth” : (2, 7), “learning_rate” : (0, 1), “min_child_weight” : (1, 20), “colsample_bytree”: (0.1, 1.0), “subsample” : (0.1, 1), “gamma” : (0, 1), “reg_alpha” : (0, 1), “reg_lambda” : (0, 1)}

  • num_boost_round (int, optional) – Number of boosting rounds to fit a model, by default 200

  • early_stopping_rounds (int, optional) – The criterion to abort the xgboost.cv() phase early if the test metric is not improving, by default 20

  • random_state (int, optional) – Random seed number, by default 1367

  • stratified (bool, optional) – Whether to use stratification of the targets (only available for classification tasks) when running xgboost.cv() to find the best number of boosting rounds at each fold of each iteration, by default True

  • shuffle (bool, optional) – Whether to shuffle data to have the ability of building stratified folds in xgboost.cv(), by default True

  • sparse_matrix (bool, optional) – Whether to convert the input features to a sparse matrix in csr format. This would increase the speed of feature selection for relatively large/sparse datasets; consequently, it would act as an un-optimized solution for a dense feature matrix. Additionally, this parameter cannot be used along with scale_mean=True, since standardizing the feature matrix to have a mean value of zero would turn the feature matrix into a dense matrix. Therefore, by default our API bans this combination, by default False

  • scale_mean (bool, optional) – Whether to standardize the feature matrix to have a mean value of zero per feature (center the features before scaling). As laid out in sparse_matrix, scale_mean must be False when using sparse_matrix=True, since centering the feature matrix would decrease its sparsity and, in practice, it does not make sense to combine the two. The StandardScaler object can be accessed via cls.scaler_ if scale_mean or scale_std is used; otherwise it is None, by default False

  • scale_std (bool, optional) – Whether to scale the feature matrix to have unit variance (or equivalently, unit standard deviation) per feature. The StandardScaler object can be accessed via cls.scaler_ if scale_mean or scale_std is used; otherwise it is None, by default False

  • importance_type (str, optional) – Importance type of xgboost.train() with possible values "weight", "gain", "total_gain", "cover", "total_cover", by default “total_gain”

  • verbose (bool, optional) – Whether to show the HyperOpt Optimization progress at each iteration, by default True
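
A minimal usage sketch based on the parameters and methods documented on this page; the custom params_bounds below is only an illustration of a hyperopt.hp search space, and the scikit-learn breast-cancer dataset stands in for your own features and targets.

from hyperopt import hp
from sklearn.datasets import load_breast_cancer
from slickml.optimization import XGBoostHyperOptimizer

# Illustrative binary-classification data; replace with your own X, y
X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Optional custom search space defined with hyperopt.hp, as described above
params_bounds = {
    "max_depth": hp.quniform("max_depth", 2, 7, 1),
    "learning_rate": hp.uniform("learning_rate", 0.0, 1.0),
    "min_child_weight": hp.quniform("min_child_weight", 1, 20, 1),
    "colsample_bytree": hp.uniform("colsample_bytree", 0.1, 1.0),
    "subsample": hp.uniform("subsample", 0.1, 1.0),
    "gamma": hp.uniform("gamma", 0.0, 1.0),
    "reg_alpha": hp.uniform("reg_alpha", 0.0, 1.0),
    "reg_lambda": hp.uniform("reg_lambda", 0.0, 1.0),
}

xho = XGBoostHyperOptimizer(
    n_iter=100,
    n_splits=4,
    metrics="auc",
    objective="binary:logistic",
    params_bounds=params_bounds,
    random_state=1367,
)
xho.fit(X, y)

best_params = xho.get_best_params()  # tuned hyper-parameters as a dictionary
trials = xho.get_trials()            # hyperopt.Trials object with the full history
results = xho.get_results()          # list of per-trial result dictionaries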

fit(X, y)[source]#

Fits the HyperOpt optimization algorithm to tune the hyper-parameters

get_best_params()[source]#

Returns the tuned hyper-parameters as a dictionary

get_results()[source]#

Returns all the optimization trials

get_trials()[source]#

Returns the trials object

get_params_bounds()[source]#

Returns the parameters boundaries

best_params_#

Returns the tuned hyper-parameters as a dictionary

results_#

Returns all the optimization trials as results

References

__slots__ = []#
early_stopping_rounds :Optional[int] = 20#
importance_type :Optional[str] = total_gain#
metrics :Optional[str] = auc#
n_iter :Optional[int] = 100#
n_splits :Optional[int] = 4#
num_boost_round :Optional[int] = 200#
objective :Optional[str] = binary:logistic#
params :Optional[Dict[str, Union[str, float, int]]]#
params_bounds :Optional[Dict[str, Any]]#
random_state :Optional[int] = 1367#
scale_mean :Optional[bool] = False#
scale_std :Optional[bool] = False#
shuffle :Optional[bool] = True#
sparse_matrix :Optional[bool] = False#
stratified :Optional[bool] = True#
verbose :Optional[bool] = True#
__getstate__()#
__post_init__() None[source]#

Post instantiation validations and assignments.

__repr__(N_CHAR_MAX=700)#

Return repr(self).

__setstate__(state)#
fit(X: Union[pandas.DataFrame, numpy.ndarray], y: Union[List[float], numpy.ndarray, pandas.Series]) None[source]#

Fits the main hyper-parameter tuning algorithm.

Notes

At each iteration, one set of parameters gets sampled from params_bounds and the evaluation occurs based on the cross-validation results. The HyperOpt optimizer always minimizes the objective. Therefore, we should be careful with values of self.metrics that are supposed to be maximized, i.e. auc; for those, (-1) * metric is minimized instead.

Parameters:
  • X (Union[pd.DataFrame, np.ndarray]) – Input data for training (features)

  • y (Union[List[float], np.ndarray, pd.Series]) – Input ground truth for training (targets)

Returns:

None
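
As with the Bayesian tuner, the following is not the library's internal code; it only sketches, under the assumptions noted in the comments, the sign convention described in the Notes above: hyperopt.fmin() always minimizes, so a metric that should be maximized, such as auc, is negated before being reported as the loss.

import xgboost as xgb
from hyperopt import STATUS_OK
from sklearn.datasets import load_breast_cancer

# Illustrative data; the real implementation builds its DMatrix from the X, y passed to fit()
X, y = load_breast_cancer(return_X_y=True)
dtrain = xgb.DMatrix(X, label=y)

def objective(space):
    # Hypothetical per-trial evaluation for hyperopt.fmin()
    params = {
        "max_depth": int(space["max_depth"]),  # quniform samples come back as floats
        "learning_rate": space["learning_rate"],
        "objective": "binary:logistic",
    }
    cv_results = xgb.cv(
        params=params,
        dtrain=dtrain,
        num_boost_round=200,
        nfold=4,
        stratified=True,
        metrics="auc",
        early_stopping_rounds=20,
        seed=1367,
    )
    # minimize (-1) * auc, which is equivalent to maximizing auc
    return {"loss": -cv_results["test-auc-mean"].max(), "status": STATUS_OK}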

get_best_params() Dict[str, Union[str, float, int]][source]#

Returns the tuned results of the optimization as the best set of hyper-parameters.

Returns:

Dict[str, Union[str, float, int]]

get_params(deep=True)#

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict) – Parameter names mapped to their values.

get_params_bounds() Optional[Dict[str, Any]][source]#

Returns the hyper-parameters boundaries for the tuning process.

Returns:

Dict[str, Any]

get_results() List[Dict[str, Any]][source]#

Returns the results of all trials.

Returns:

List[Dict[str, Any]]

get_trials() hyperopt.Trials[source]#

Returns the Trials object passed to the optimizer.

Returns:

hyperopt.Trials
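
A short sketch of inspecting the returned object (building on the usage example earlier on this page, so xho is assumed to be an already-fitted XGBoostHyperOptimizer):

trials = xho.get_trials()

# hyperopt.Trials keeps the full optimization history
print(len(trials.trials))  # number of evaluated trials
print(trials.best_trial)   # record of the best (lowest-loss) trial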

set_params(**params)#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self (estimator instance) – Estimator instance.