slickml.base
Package Contents
Classes
BaseXGBoostEstimator | Base Estimator for XGBoost.
ExtendedEnum | Base Enum type with compatible string functionalities.
Metrics | Protocol for Metrics.
- class slickml.base.BaseXGBoostEstimator
Bases: abc.ABC, sklearn.base.BaseEstimator
Base Estimator for XGBoost.
Notes
This is an abstract base class built on XGBoost [xgboost-api]. It can serve as the base for any estimator that uses XGBoost, such as XGBoostCVClassifier, XGBoostRegressor, XGBoostFeatureSelector, XGBoostBayesianOptimizer, and so on. It comes with base validation utilities that reduce the amount of copy/pasted code in the downstream classes.
- Parameters:
num_boost_round (int) – Number of boosting rounds used to fit a model.
sparse_matrix (bool) – Whether to convert the input features to a sparse matrix in CSR format. This speeds up feature selection for relatively large/sparse datasets, but it acts as a de-optimization for a dense feature matrix. Additionally, this parameter cannot be used along with scale_mean=True, because standardizing the feature matrix to have a mean of zero would turn it into a dense matrix. The API therefore disallows this combination by default.
scale_mean (bool) – Whether to standardize the feature matrix to have a mean of zero per feature (center the features before scaling). As laid out under sparse_matrix, scale_mean must be False when using sparse_matrix=True, since centering the feature matrix would decrease its sparsity and defeat the purpose of the sparse representation. The StandardScaler object can be accessed via cls.scaler_ if scale_mean or scale_std is used (it is None otherwise).
scale_std (bool) – Whether to scale the feature matrix to have unit variance (or equivalently, unit standard deviation) per feature. The StandardScaler object can be accessed via cls.scaler_ if scale_mean or scale_std is used (it is None otherwise).
importance_type (str) – Importance type of xgboost.train(), with possible values "weight", "gain", "total_gain", "cover", "total_cover".
params (Dict[str, Union[str, float, int]], optional) – Set of parameters required for fitting a Booster.
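The interaction between sparse_matrix and scale_mean described above can be illustrated with a small pure-Python sketch. It is not slickml's implementation: the helper names (count_zeros, center_columns, check_sparse_and_centering) are hypothetical and only mirror the documented rule on a toy matrix.

```python
from typing import List


def count_zeros(matrix: List[List[float]]) -> int:
    """Count the zero entries of a row-major matrix."""
    return sum(1 for row in matrix for value in row if value == 0.0)


def center_columns(matrix: List[List[float]]) -> List[List[float]]:
    """Subtract each column's mean from its entries (the effect of centering)."""
    n_rows = len(matrix)
    means = [sum(column) / n_rows for column in zip(*matrix)]
    return [[value - mean for value, mean in zip(row, means)] for row in matrix]


def check_sparse_and_centering(sparse_matrix: bool, scale_mean: bool) -> None:
    """Hypothetical guard mirroring the documented rule; not slickml's own code."""
    if sparse_matrix and scale_mean:
        raise ValueError("scale_mean=True cannot be combined with sparse_matrix=True.")


# A mostly-zero toy feature matrix: 3 samples, 3 features.
X = [
    [0.0, 0.0, 3.0],
    [0.0, 6.0, 0.0],
    [9.0, 0.0, 0.0],
]
X_centered = center_columns(X)

zeros_before = count_zeros(X)          # six of nine entries are zero
zeros_after = count_zeros(X_centered)  # centering leaves no zero entries here
```

On this toy input every zero becomes non-zero after centering, which is why a sparse (CSR) representation stops paying off once the features are centered.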
References
- __slots__ = []
- importance_type: Optional[str]
- num_boost_round: Optional[int]
- params: Optional[Dict[str, Union[str, float, int]]]
- scale_mean: Optional[bool]
- scale_std: Optional[bool]
- sparse_matrix: Optional[bool]
- __getstate__()
- __repr__(N_CHAR_MAX=700)
Return repr(self).
- __setstate__(state)
- abstract fit(X: Union[pandas.DataFrame, numpy.ndarray], y: Union[List[float], numpy.ndarray, pandas.Series]) -> None
Abstract method to fit a model to the features/targets, depending on the task.
- Parameters:
X (Union[pd.DataFrame, np.ndarray]) – Input data for training (features)
y (Union[List[float], np.ndarray, pd.Series]) – Input ground truth for training (targets)
- Returns:
None
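As a rough illustration of what an abstract fit() implies for downstream classes, the sketch below uses only the standard library. BaseEstimatorSketch and MeanRegressorSketch are hypothetical stand-ins, not slickml classes; the real base class also inherits from sklearn.base.BaseEstimator and works with XGBoost.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class BaseEstimatorSketch(ABC):
    """Hypothetical stand-in for an abstract base estimator."""

    def __init__(self, num_boost_round: int = 200) -> None:
        self.num_boost_round = num_boost_round

    @abstractmethod
    def fit(self, X: List[List[float]], y: List[float]) -> None:
        """Subclasses decide how to fit, depending on the task."""


class MeanRegressorSketch(BaseEstimatorSketch):
    """Toy subclass that 'fits' by memorizing the mean of the targets."""

    def __init__(self, num_boost_round: int = 200) -> None:
        super().__init__(num_boost_round)
        self.mean_: Optional[float] = None

    def fit(self, X: List[List[float]], y: List[float]) -> None:
        self.mean_ = sum(y) / len(y)


# The abstract base cannot be instantiated directly.
try:
    BaseEstimatorSketch()
    base_instantiable = True
except TypeError:
    base_instantiable = False

# A concrete subclass that implements fit() can.
model = MeanRegressorSketch()
model.fit([[1.0], [2.0], [3.0]], [2.0, 4.0, 6.0])
```

Like the documented signature, fit() returns None and stores its result on the instance.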
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params (dict) – Parameter names mapped to their values.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self (estimator instance) – Estimator instance.
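The <component>__<parameter> naming convention can be sketched as a plain string split on the first double underscore. This is only a simplified illustration of the convention, not scikit-learn's set_params itself; split_nested_params and the component names "scaler" and "clf" are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Tuple


def split_nested_params(
    params: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Dict[str, Any]]]:
    """Separate an estimator's own parameters from those aimed at components."""
    own: Dict[str, Any] = {}
    nested: Dict[str, Dict[str, Any]] = defaultdict(dict)
    for key, value in params.items():
        # Split on the FIRST "__" only; deeper levels stay in sub_key.
        name, delim, sub_key = key.partition("__")
        if delim:
            nested[name][sub_key] = value
        else:
            own[name] = value
    return own, dict(nested)


own, nested = split_nested_params(
    {
        "num_boost_round": 300,
        "scaler__with_mean": False,
        "clf__params__eta": 0.1,
    }
)
```

Here own holds num_boost_round, while nested groups the remaining entries under "scaler" and "clf". The "clf" entry keeps the key "params__eta", so the component could repeat the same split one level down.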
- class slickml.base.ExtendedEnum
Bases: enum.Enum
Base Enum type with compatible string functionalities.
Examples
>>> from slickml.base import ExtendedEnum
>>> class FooBar(ExtendedEnum):
...     FOO = "foo"
...     BAR = "bar"
>>> FooBar.FOO
>>> FooBar.names()
>>> FooBar.values()
>>> FooBar.to_dict()
- __dir__()
Returns all members and all public methods.
- __format__(format_spec)
Returns format using actual value type unless __str__ has been overridden.
- __hash__()
Return hash(self).
- __reduce_ex__(proto)
Helper for pickle.
- name()
The name of the Enum member.
- classmethod to_dict() -> Dict[str, str]
Returns a dictionary of all Enum name-value pairs as strings.
- Returns:
Dict[str, str]
- value()
The value of the Enum member.
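To make the "string functionalities" more concrete, here is a minimal standard-library sketch of an Enum base that offers names(), values(), and to_dict(). ExtendedEnumSketch is a hypothetical re-creation: the list return types of names() and values() and the __str__ behavior are assumptions, and slickml's actual implementation may differ.

```python
from enum import Enum
from typing import Dict, List


class ExtendedEnumSketch(Enum):
    """Hypothetical re-creation of the documented behavior; not slickml's source."""

    def __str__(self) -> str:
        # Assumption: a member prints as its value rather than "Class.MEMBER".
        return str(self.value)

    @classmethod
    def names(cls) -> List[str]:
        """Return all member names in definition order."""
        return [member.name for member in cls]

    @classmethod
    def values(cls) -> List[str]:
        """Return all member values as strings in definition order."""
        return [str(member.value) for member in cls]

    @classmethod
    def to_dict(cls) -> Dict[str, str]:
        """Return name-value pairs as strings."""
        return {member.name: str(member.value) for member in cls}


class FooBar(ExtendedEnumSketch):
    FOO = "foo"
    BAR = "bar"


foo_as_text = str(FooBar.FOO)
mapping = FooBar.to_dict()
```

Because members behave like their string values, such an enum can be passed wherever a plain string option is expected.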
- class slickml.base.Metrics
Bases: Protocol
Protocol for Metrics.
Notes
The main purpose of this protocol is proper duck typing (PEP-544) [1] when using metrics such as RegressionMetrics or ClassificationMetrics in pipelines.
References
- __slots__ = []
- classmethod __class_getitem__(params)
- classmethod __init_subclass__(*args, **kwargs)
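The duck typing that PEP 544 enables can be sketched with typing.Protocol from the standard library. The names below are hypothetical: this page does not list the methods the Metrics protocol requires, so get_metrics() is an assumed method name, and SupportsMetrics and ToyRegressionMetrics are illustrative stand-ins rather than slickml classes.

```python
from typing import Dict, List, Protocol, runtime_checkable


@runtime_checkable
class SupportsMetrics(Protocol):
    """Hypothetical protocol: anything with a get_metrics() method conforms."""

    def get_metrics(self) -> Dict[str, float]:
        ...


class ToyRegressionMetrics:
    """Conforms structurally; it does not inherit from SupportsMetrics."""

    def __init__(self, y_true: List[float], y_pred: List[float]) -> None:
        self.y_true = y_true
        self.y_pred = y_pred

    def get_metrics(self) -> Dict[str, float]:
        # Mean absolute error over paired targets and predictions.
        n = len(self.y_true)
        mae = sum(abs(t - p) for t, p in zip(self.y_true, self.y_pred)) / n
        return {"mae": mae}


def report(metrics: SupportsMetrics) -> Dict[str, float]:
    """Pipeline step typed against the protocol, not a concrete class."""
    return metrics.get_metrics()


toy = ToyRegressionMetrics([1.0, 2.0, 3.0], [1.0, 3.0, 5.0])
result = report(toy)
conforms = isinstance(toy, SupportsMetrics)
```

A static type checker accepts any object with a matching get_metrics() wherever SupportsMetrics is expected, so pipeline code does not need to know the concrete metrics class.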