random_forest_regressor#

Provides a model that predicts the next time steps with a random forest regressor.

class Criterion(value)#

Bases: Enum

The function to measure the quality of a split.

absolute_error = 'absolute_error'#

Mean absolute error, which minimizes the L1 loss using the median of each terminal node.

friedman_mse = 'friedman_mse'#

Mean squared error with Friedman’s improvement score for potential splits.

poisson = 'poisson'#

Reduction in Poisson deviance.

squared_error = 'squared_error'#

Mean squared error, which is equivalent to variance reduction as a feature selection criterion and minimizes the L2 loss using the mean of each terminal node.
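As a minimal sketch of how these members behave, the following stand-in enum mirrors the four documented values; it is defined locally and is not imported from simba_ml. Each member's value is a plain string, so a member can be looked up by value and the string recovered through `.value`. That these strings are handed on unchanged to the underlying estimator is an assumption, not something this section states.

```python
from enum import Enum


class Criterion(Enum):
    """Stand-in mirroring the documented enum; not imported from simba_ml."""

    absolute_error = "absolute_error"
    friedman_mse = "friedman_mse"
    poisson = "poisson"
    squared_error = "squared_error"


# Look a member up by its string value, e.g. when reading it from a config file.
criterion = Criterion("friedman_mse")

# The plain string is available through .value.
criterion_name = criterion.value

# An unknown string raises ValueError, which surfaces typos early.
try:
    Criterion("mse")
    lookup_failed = False
except ValueError:
    lookup_failed = True
```

Using an enum instead of a bare string means an unsupported criterion fails at lookup time rather than deep inside training.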

class RandomForestRegressorConfig(name: str = 'Random Forest Regressor', normalize: bool = True, criterion: Criterion = Criterion.squared_error, n_estimators: int = 100, max_depth: Optional[int] = None, min_samples_split: int = 2, min_samples_leaf: int = 1, min_weight_fraction_leaf: float = 0.0)#

Bases: SkLearnModelConfig

Defines the configuration for the RandomForestRegressor.

name#

Name of the model.

Type:

str

criterion#

The function to measure the quality of a split.

Type:

simba_ml.prediction.time_series.models.sk_learn.random_forest_regressor.Criterion

splitter#

The strategy used to choose the split at each node.
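The constructor signature above can be pictured as a plain dataclass. The sketch below is a hypothetical mirror of the documented fields and defaults, not the library's class: `ForestConfigSketch` and `to_estimator_kwargs` are names invented for illustration, and the idea that the tree-related fields are forwarded as keyword arguments to an estimator is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class Criterion(Enum):
    """Stand-in for the documented Criterion enum (only two members shown)."""

    squared_error = "squared_error"
    absolute_error = "absolute_error"


@dataclass
class ForestConfigSketch:
    """Hypothetical mirror of the documented signature and its defaults."""

    name: str = "Random Forest Regressor"
    normalize: bool = True
    criterion: Criterion = Criterion.squared_error
    n_estimators: int = 100
    max_depth: Optional[int] = None
    min_samples_split: int = 2
    min_samples_leaf: int = 1
    min_weight_fraction_leaf: float = 0.0


def to_estimator_kwargs(config: ForestConfigSketch) -> dict[str, Any]:
    """Hypothetical helper: collect the tree-related fields as keyword arguments."""
    return {
        "criterion": config.criterion.value,  # pass the plain string
        "n_estimators": config.n_estimators,
        "max_depth": config.max_depth,
        "min_samples_split": config.min_samples_split,
        "min_samples_leaf": config.min_samples_leaf,
        "min_weight_fraction_leaf": config.min_weight_fraction_leaf,
    }


# Override only the fields that differ from the documented defaults.
config = ForestConfigSketch(n_estimators=250, max_depth=8)
kwargs = to_estimator_kwargs(config)
```

Fields such as `name` and `normalize` are left out of the keyword arguments here because they describe the wrapper rather than the trees; whether the library separates them the same way is not stated in this section.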

class RandomForestRegressorModel(time_series_params: TimeSeriesConfig, model_params: RandomForestRegressorConfig)#

Bases: SkLearnModel

Defines a model that uses a random forest regressor.

Initializes the configuration for the RandomForestRegressor.

Parameters:
  • time_series_params – time-series parameters that affect the training and architecture of models.

  • model_params – configuration for the model.

get_model(model_params: RandomForestRegressorConfig) → BaseEstimator#

Returns the model.

Parameters:

model_params – configuration for the model.

Returns:

The model.

property name: str#

Returns the model’s name.

Returns:

The model’s name.

predict(data: ndarray[Any, dtype[float64]]) → ndarray[Any, dtype[float64]]#

Predicts the next time steps.

Parameters:

data – three-dimensional NumPy array.

Returns:

The predicted next time steps.

train(train: list[numpy.ndarray[Any, numpy.dtype[numpy.float64]]]) → None#

Trains the model on the training data, flattened to two dimensions.

Parameters:

train – training data.
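To illustrate what flattening to two dimensions can look like, the sketch below turns a three-dimensional structure of shape (windows, time steps, features) into one row per window. Plain nested lists stand in for NumPy arrays so the example has no dependencies, and `flatten_windows` is a hypothetical helper; how the library actually builds its windows is not described in this section.

```python
def flatten_windows(windows):
    """Flatten (n_windows, n_timesteps, n_features) into (n_windows, n_timesteps * n_features)."""
    return [
        [value for timestep in window for value in timestep]
        for window in windows
    ]


# Two windows, each with three time steps and two features per time step.
windows = [
    [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]],
    [[2.0, 20.0], [3.0, 30.0], [4.0, 40.0]],
]

flat = flatten_windows(windows)
# Each window is now a single row of 3 * 2 = 6 values,
# the two-dimensional layout that tabular regressors expect.
```

With NumPy the same row-major flattening would typically be written as a reshape that keeps the first axis and collapses the rest.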

validate_prediction_input(data: ndarray[Any, dtype[float64]]) → None#

Validates the input of the predict function.

Parameters:

data – a single array containing the input data for which the output will be predicted.

Raises:

ValueError – if data has incorrect shape (row length does not equal )
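A shape check of this kind might look like the following sketch. The expected row length is not given in this section, so it is taken as a caller-supplied, hypothetical parameter (`expected_row_length`); `validate_rows` is likewise an illustrative name, and nested lists stand in for the NumPy array.

```python
def validate_rows(data, expected_row_length):
    """Raise ValueError if any row's length differs from expected_row_length.

    expected_row_length is a hypothetical parameter; the source does not
    state which value the library compares against.
    """
    for window_index, window in enumerate(data):
        for row_index, row in enumerate(window):
            if len(row) != expected_row_length:
                raise ValueError(
                    f"window {window_index}, row {row_index}: "
                    f"expected {expected_row_length} values, got {len(row)}"
                )


good = [[[0.1, 0.2], [0.3, 0.4]]]
bad = [[[0.1, 0.2], [0.3]]]

# Well-formed input passes silently.
validate_rows(good, expected_row_length=2)

# Malformed input is rejected before any prediction is attempted.
try:
    validate_rows(bad, expected_row_length=2)
    error_message = None
except ValueError as error:
    error_message = str(error)
```

Failing fast with a descriptive message keeps shape problems from surfacing later as less readable errors inside the estimator.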

class Splitter(value)#

Bases: Enum

The strategy used to choose the split at each node.

best = 'best'#

Always chooses the best split.

random = 'random'#

Chooses randomly from the distribution of the used criterion.