random_forest_regressor#

Provides a model that predicts the next time steps with a random forest regressor.

class Criterion(value)#

Bases: Enum

The function to measure the quality of a split.

absolute_error = 'absolute_error'#: Mean absolute error, which minimizes the L1 loss using the median of each terminal node.

friedman_mse = 'friedman_mse'#: Mean squared error with Friedman’s improvement score for potential splits.

poisson = 'poisson'#: Reduction in Poisson deviance.

squared_error = 'squared_error'#: Mean squared error, which is equal to variance reduction as a feature selection criterion and minimizes the L2 loss using the mean of each terminal node.
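
The link between the criteria and the value predicted in a terminal node can be checked by hand. The following is a minimal sketch using only the Python standard library. The target values and the helper names `l1_loss` and `l2_loss` are made up for illustration and are not part of this module.

```python
import statistics

# Target values that end up in one hypothetical terminal node (leaf).
leaf_targets = [1.0, 2.0, 3.0, 4.0, 100.0]


def l1_loss(prediction, targets):
    """Sum of absolute errors for a constant leaf prediction."""
    return sum(abs(t - prediction) for t in targets)


def l2_loss(prediction, targets):
    """Sum of squared errors for a constant leaf prediction."""
    return sum((t - prediction) ** 2 for t in targets)


median = statistics.median(leaf_targets)
mean = statistics.mean(leaf_targets)

# The median is the better constant under L1, the mean under L2.
assert l1_loss(median, leaf_targets) <= l1_loss(mean, leaf_targets)
assert l2_loss(mean, leaf_targets) <= l2_loss(median, leaf_targets)
```

The outlier pulls the mean far away from most targets. This is why absolute_error tends to be less sensitive to extreme values than squared_error.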

class RandomForestRegressorConfig(name: str = 'Random Forest Regressor', normalize: bool = True, criterion: Criterion = Criterion.squared_error, n_estimators: int = 100, max_depth: Optional[int] = None, min_samples_split: int = 2, min_samples_leaf: int = 1, min_weight_fraction_leaf: float = 0.0)#

Bases: SkLearnModelConfig

Defines the configuration for the RandomForestRegressor.

name#

name of the model.

Type:: str

criterion#

the function to measure the quality of a split.

Type:: simba_ml.prediction.time_series.models.sk_learn.random_forest_regressor.Criterion

splitter#: the strategy used to choose the split at each node.
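
As a rough illustration of how such a configuration could be turned into estimator arguments, the sketch below mirrors the documented signature with standard-library stand-ins. `ForestConfigSketch` and `to_estimator_kwargs` are hypothetical names, not part of this module. The assumption is that the tree-related fields are forwarded to scikit-learn's `RandomForestRegressor`, which accepts keyword arguments with these names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class Criterion(Enum):
    """Stand-in mirroring the Criterion members documented above."""

    absolute_error = "absolute_error"
    friedman_mse = "friedman_mse"
    poisson = "poisson"
    squared_error = "squared_error"


@dataclass
class ForestConfigSketch:
    """Hypothetical stand-in with the same fields and defaults as the signature."""

    name: str = "Random Forest Regressor"
    normalize: bool = True
    criterion: Criterion = Criterion.squared_error
    n_estimators: int = 100
    max_depth: Optional[int] = None
    min_samples_split: int = 2
    min_samples_leaf: int = 1
    min_weight_fraction_leaf: float = 0.0


def to_estimator_kwargs(config: ForestConfigSketch) -> dict[str, Any]:
    """Collects the fields an estimator would need (hypothetical helper).

    Assumption: name and normalize are handled by the surrounding model
    class, so they are not passed on to the estimator.
    """
    return {
        "criterion": config.criterion.value,  # enum member -> plain string
        "n_estimators": config.n_estimators,
        "max_depth": config.max_depth,
        "min_samples_split": config.min_samples_split,
        "min_samples_leaf": config.min_samples_leaf,
        "min_weight_fraction_leaf": config.min_weight_fraction_leaf,
    }


config = ForestConfigSketch(
    criterion=Criterion.absolute_error, n_estimators=50, max_depth=5
)
kwargs = to_estimator_kwargs(config)
# Assumed usage: sklearn.ensemble.RandomForestRegressor(**kwargs)
```

Keeping the criterion as an enum in the configuration and converting it to a string only at the boundary means that typos in the criterion name are caught when the configuration is built.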

class RandomForestRegressorModel(time_series_params: TimeSeriesConfig, model_params: RandomForestRegressorConfig)#

Bases: SkLearnModel

Defines a model that uses a random forest regressor.

Initializes the configuration for the RandomForestRegressor.

Parameters:

time_series_params – Time-series parameters that affect the training and architecture of models
model_params – configuration for the model.

get_model(model_params: RandomForestRegressorConfig) → BaseEstimator#

Returns the model.

Parameters:: model_params – configuration for the model.
Returns:: The model.

property name: str#

Returns the model's name.

Returns:: The model's name.

predict(data: ndarray[Any, dtype[float64]]) → ndarray[Any, dtype[float64]]#

Predicts the next time steps.

Parameters:: data – 3-dimensional NumPy array.
Returns:: The predicted next time steps.

train(train: list[numpy.ndarray[Any, numpy.dtype[numpy.float64]]]) → None#

Trains the model with the train data flattened to two dimensions.

Parameters:: train – training data.
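
What flattening to two dimensions could look like is sketched below with NumPy. The array layout (windows, time steps per window, features) and the sizes are assumptions made for illustration. The source only states that the train data is flattened to two dimensions.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_windows, input_length, n_features = 4, 3, 2

# Assumed layout: (windows, time steps per window, features).
windows = np.arange(
    n_windows * input_length * n_features, dtype=np.float64
).reshape(n_windows, input_length, n_features)

# Collapse the time-step and feature axes so that each window becomes one row.
flattened = windows.reshape(n_windows, -1)

print(flattened.shape)  # (4, 6)
```

Estimators that follow the scikit-learn convention expect a 2-dimensional input of shape (n_samples, n_features), which is why a 3-dimensional array of windows has to be reshaped before fitting.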

validate_prediction_input(data: ndarray[Any, dtype[float64]]) → None#

Validates the input of the predict function.

Parameters:: data – a single array containing the input data for which the output will be predicted.
Raises:: ValueError – if data has incorrect shape (row length does not equal )

class Splitter(value)#

Bases: Enum

The strategy used to choose the split at each node.

best = 'best'#: Always chooses the best split.

random = 'random'#: Chooses randomly from the distribution of the used criterion.