This is an abstract base class built on XGBoost [xgboost-api] that can be used by any estimator that uses
XGBoost as its base estimator, such as XGBoostCVClassifier, XGBoostRegressor,
XGBoostFeatureSelector, XGBoostBayesianOptimizer, and so on. This base estimator comes
with the base validation utilities, which reduce the amount of copy-pasted code in the
downstream classes.
Parameters:
num_boost_round (int) – Number of boosting rounds to fit a model
sparse_matrix (bool) – Whether to convert the input features to a sparse matrix in CSR format. This can
increase the speed of feature selection for relatively large/sparse datasets. Conversely,
it is a suboptimal choice for a dense feature matrix. Additionally,
this parameter cannot be used along with scale_mean=True, since standardizing the feature matrix
to have a mean value of zero would turn the feature matrix into a dense matrix. Therefore,
by default the API does not allow this combination
scale_mean (bool) – Whether to standardize the feature matrix to have a mean value of zero per feature (center
the features before scaling). As laid out in sparse_matrix, scale_mean must be False when
using sparse_matrix=True, since centering the feature matrix would decrease the sparsity
and defeat the purpose of using a sparse matrix. The StandardScaler object can be
accessed via cls.scaler_ if scale_mean or scale_std is used; otherwise it is None
scale_std (bool) – Whether to scale the feature matrix to have unit variance (or equivalently, unit standard
deviation) per feature. The StandardScaler object can be accessed via cls.scaler_
if scale_mean or scale_std is used; otherwise it is None
importance_type (str) – Importance type of xgboost.train() with possible values "weight", "gain",
"total_gain", "cover", "total_cover"
params (Dict[str, Union[str, float, int]], optional) – Set of parameters required for fitting a Booster
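The interaction between sparse_matrix, scale_mean, and scale_std described above can be sketched with SciPy and scikit-learn directly. This is a minimal illustration, not this class's internal code: the toy data is made up, and the mapping of scale_mean/scale_std onto StandardScaler's with_mean/with_std arguments is an assumption based on the parameter descriptions.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse
from sklearn.preprocessing import StandardScaler

# Toy feature matrix that is mostly zeros (illustrative data only).
X = np.array(
    [
        [0.0, 2.0, 0.0],
        [0.0, 0.0, 3.0],
        [4.0, 0.0, 0.0],
        [0.0, 6.0, 0.0],
    ]
)
X_sparse = csr_matrix(X)

# Scaling to unit variance without centering keeps the matrix sparse.
# Assumed to mirror scale_std=True together with scale_mean=False.
scaler = StandardScaler(with_mean=False, with_std=True)
X_scaled = scaler.fit_transform(X_sparse)
still_sparse = issparse(X_scaled)
same_nonzeros = X_scaled.nnz == X_sparse.nnz

# Centering would fill the zeros with non-zero values, so scikit-learn
# refuses to center sparse input. This is the same conflict as
# sparse_matrix=True combined with scale_mean=True.
try:
    StandardScaler(with_mean=True).fit(X_sparse)
    centering_rejected = False
except ValueError:
    centering_rejected = True
```

Scaling by the per-feature standard deviation only rescales the stored non-zero entries, which is why it is compatible with the CSR representation while centering is not.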
This uses PEP-487 [1]_ to set the set_{method}_request methods. It
looks for the default request values, which are either
set using __metadata_request__* class attributes or inferred
from method signatures.
The __metadata_request__* class attributes are used when a method
does not explicitly accept a piece of metadata through its arguments, or when the
developer would like to specify a request value for that metadata
that differs from the default None.
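The PEP 487 mechanism mentioned here is the `__init_subclass__` hook, which runs once for every subclass and can attach generated methods to it. The following is a toy sketch of that idea only; `RequestMixin`, `ToyEstimator`, `_make_setter`, and the `_requests` attribute are hypothetical names, and the real scikit-learn implementation is considerably more involved.

```python
MARKER = "__metadata_request__"


def _make_setter(method, defaults):
    """Build a set_<method>_request function (hypothetical helper)."""

    def setter(self, **requests):
        unknown = set(requests) - set(defaults)
        if unknown:
            raise TypeError(f"unexpected metadata for {method}: {sorted(unknown)}")
        # Start from the class-level defaults, then apply the overrides.
        current = self._requests.setdefault(method, dict(defaults))
        current.update(requests)
        return self

    setter.__name__ = f"set_{method}_request"
    return setter


class RequestMixin:
    """Toy stand-in for a metadata-request base class (hypothetical)."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        for name, defaults in list(vars(cls).items()):
            # Inside a class body, __metadata_request__fit is name-mangled to
            # _ClassName__metadata_request__fit, so search for the marker
            # instead of testing the start of the name.
            if MARKER not in name:
                continue
            method = name[name.index(MARKER) + len(MARKER):]
            setattr(cls, f"set_{method}_request", _make_setter(method, defaults))

    def __init__(self):
        self._requests = {}


class ToyEstimator(RequestMixin):
    # Default request value None for sample_weight in fit().
    __metadata_request__fit = {"sample_weight": None}


est = ToyEstimator().set_fit_request(sample_weight=True)
fit_requests = est._requests["fit"]
```

Because the setter is generated when the subclass is created, `ToyEstimator` never has to define `set_fit_request` itself.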
The method works on simple estimators as well as on nested objects
(such as Pipeline). The latter have
parameters of the form <component>__<parameter> so that it’s
possible to update each component of a nested object.
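As a brief illustration of the `<component>__<parameter>` form, the sketch below uses a plain scikit-learn Pipeline. The step names "scaler" and "clf" are arbitrary choices for this example and are not tied to any class documented here.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("clf", LogisticRegression()),
    ]
)

# Each keyword is <step name>__<parameter of that step>.
pipe.set_params(scaler__with_mean=False, clf__C=0.5)

# The nested values are visible both through get_params() and on the steps.
scaler_with_mean = pipe.get_params()["scaler__with_mean"]
clf_c = pipe.named_steps["clf"].C
```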
The main reason for this protocol is proper duck typing (PEP-544) [1]_ when using metrics such as
RegressionMetrics or ClassificationMetrics in pipelines.
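The idea behind a PEP 544 protocol can be sketched with `typing.Protocol`: any object with the right methods satisfies the protocol without inheriting from it. The names `MetricsLike`, `ToyRegressionMetrics`, `report`, and the `get_metrics` method are assumptions made for this illustration and are not taken from the library's actual interface.

```python
from typing import Dict, List, Protocol, runtime_checkable


@runtime_checkable
class MetricsLike(Protocol):
    """Structural type: anything with a get_metrics() method (hypothetical)."""

    def get_metrics(self) -> Dict[str, float]:
        ...


class ToyRegressionMetrics:
    """Does not inherit from MetricsLike but matches its shape."""

    def __init__(self, y_true: List[float], y_pred: List[float]) -> None:
        self.y_true = y_true
        self.y_pred = y_pred

    def get_metrics(self) -> Dict[str, float]:
        errors = [abs(t - p) for t, p in zip(self.y_true, self.y_pred)]
        return {"mae": sum(errors) / len(errors)}


def report(metrics: MetricsLike) -> Dict[str, float]:
    # Accepts any object that structurally satisfies MetricsLike.
    return metrics.get_metrics()


toy = ToyRegressionMetrics(y_true=[1.0, 2.0, 4.0], y_pred=[1.0, 3.0, 2.0])
result = report(toy)
matches_protocol = isinstance(toy, MetricsLike)
```

A pipeline step typed against such a protocol can accept regression and classification metrics interchangeably, as long as each provides the expected methods.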