stepsel.binning

The stepsel.binning module includes functions for binning data.

Functions

bin_values(data, thresholds[, right]) → numpy.ndarray

Assign each value to an interval bin defined by the thresholds.

get_tree_cut_points(clf[, feature_names])

Get the cut points of a decision tree.

Package Contents

stepsel.binning.bin_values(data: numpy.typing.ArrayLike, thresholds: numpy.typing.ArrayLike, right=True) → numpy.ndarray

Assign each value to an interval bin defined by the thresholds.

Parameters:
  • data (array-like) – The input values to be binned.

  • thresholds (array-like) – The thresholds to use for binning, ordered from smallest to largest.

  • right (bool, optional) – If True (default), intervals are closed on the right; if False, they are closed on the left.

Returns:

binned_values – The interval label for each input value, as a string: “(a, b]” if right=True, “[a, b)” if right=False.

Return type:

numpy.ndarray
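
The interval semantics described above can be illustrated with a minimal sketch built on numpy.digitize. This is not the stepsel implementation: the name bin_values_sketch is hypothetical, and both the open-ended outer bins (-inf and inf) and the number formatting are assumptions made for illustration.

```python
import numpy as np


def bin_values_sketch(data, thresholds, right=True):
    """Label each value with the interval it falls into (illustrative only)."""
    values = np.asarray(data, dtype=float)
    cuts = np.asarray(thresholds, dtype=float)

    # Assumption: values outside the thresholds land in open-ended outer bins.
    edges = np.concatenate(([-np.inf], cuts, [np.inf]))

    # right=True  -> edges[i] <  x <= edges[i + 1]
    # right=False -> edges[i] <= x <  edges[i + 1]
    idx = np.digitize(values, cuts, right=right)

    labels = []
    for i in idx:
        lo, hi = float(edges[i]), float(edges[i + 1])
        if right:
            labels.append(f"({lo:g}, {hi:g}]")
        else:
            labels.append(f"[{lo:g}, {hi:g})")
    return np.array(labels)


print(bin_values_sketch([1, 5, 10, 12], [5, 10]))
print(bin_values_sketch([1, 5, 10, 12], [5, 10], right=False))
```

With right=True, a value equal to a threshold stays in the lower bin (5 falls in the bin ending at 5); with right=False, it moves to the upper bin (5 falls in the bin starting at 5).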

stepsel.binning.get_tree_cut_points(clf: sklearn.tree.DecisionTreeRegressor | sklearn.tree.DecisionTreeClassifier, feature_names: numpy.typing.ArrayLike | None = None)

Get the cut points of a decision tree.

Parameters:
  • clf (DecisionTreeRegressor or DecisionTreeClassifier) – The decision tree to get the cut points from.

  • feature_names (array-like, optional) – The feature names of the decision tree. If None, features are identified by their integer indices.

Returns:

feature_cut_points – A dictionary with the feature names as keys and the cut points as values.

Return type:

dict
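
As a rough illustration of how cut points might be collected, the sketch below walks two parallel per-node arrays: the feature index each node splits on and the threshold it uses. It is not the stepsel implementation. The name cut_points_sketch is hypothetical, and the choices to sort and deduplicate the thresholds and to return them as lists are assumptions.

```python
def cut_points_sketch(node_features, node_thresholds, feature_names=None):
    """Group split thresholds by feature (illustrative only)."""
    cut_points = {}
    for feature_index, threshold in zip(node_features, node_thresholds):
        feature_index = int(feature_index)
        # scikit-learn marks leaf nodes with a negative feature index (-2),
        # so those entries carry no split and are skipped.
        if feature_index < 0:
            continue
        if feature_names is None:
            key = feature_index
        else:
            key = feature_names[feature_index]
        cut_points.setdefault(key, []).append(float(threshold))

    # Assumption: report each feature's cut points sorted and without repeats.
    return {key: sorted(set(values)) for key, values in cut_points.items()}


# Hand-written arrays standing in for a small tree with seven nodes.
features = [0, 1, -2, -2, 0, -2, -2]
thresholds = [2.5, 0.8, -2.0, -2.0, 4.5, -2.0, -2.0]

print(cut_points_sketch(features, thresholds, feature_names=["age", "income"]))
print(cut_points_sketch(features, thresholds))
```

For a fitted scikit-learn tree, the two arrays are available as clf.tree_.feature and clf.tree_.threshold. The sorted thresholds for a single feature can then be passed as the thresholds argument of bin_values.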