preprocessing

Module with preprocessing functionality.

convert_dataframe_to_numpy(data: list[pandas.core.frame.DataFrame]) -> list[numpy.ndarray[Any, numpy.dtype[numpy.float64]]]

Converts a list of dataframes to a list of numpy arrays.

Parameters:

data – list of dataframes.

Returns:

list of numpy arrays.
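The conversion described above can be illustrated with a minimal sketch. The function name convert_dataframe_to_numpy_sketch is a hypothetical stand-in, not the module's implementation; it assumes every column is numeric so that a float64 array, as stated in the signature, can be produced.

```python
import numpy as np
import pandas as pd


def convert_dataframe_to_numpy_sketch(data: list[pd.DataFrame]) -> list[np.ndarray]:
    """Hypothetical stand-in: return one float64 array per dataframe."""
    # Assumption: all columns are numeric, so casting to float64 is safe.
    return [df.to_numpy(dtype=np.float64) for df in data]


# Two small dataframes with different numbers of rows.
frames = [
    pd.DataFrame({"a": [1, 2], "b": [3.5, 4.5]}),
    pd.DataFrame({"a": [5], "b": [6.0]}),
]
arrays = convert_dataframe_to_numpy_sketch(frames)
```

Each array keeps the row and column layout of its source dataframe, while column names and the index are dropped.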

mix_data(*, observed_data: list[pandas.core.frame.DataFrame], synthetic_data: list[pandas.core.frame.DataFrame], ratio: float = 1) -> list[pandas.core.frame.DataFrame]

Mixes real and synthetic data according to a ratio.

Parameters:
  • observed_data – observed data.

  • synthetic_data – synthetic data.

  • ratio – Ratio of synthetic to observed data.

Raises:

ValueError – if the number of observed data points is not sufficient to fulfill the ratio.

Returns:

The mixed data.
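How the ratio is applied is not spelled out here, so the following is only a sketch under stated assumptions: each dataframe counts as one data point, all synthetic dataframes are kept, and enough observed dataframes are added so that synthetic / observed matches ratio. The name mix_data_sketch and the positivity check on ratio are hypothetical and not taken from the module.

```python
import pandas as pd


def mix_data_sketch(
    *,
    observed_data: list[pd.DataFrame],
    synthetic_data: list[pd.DataFrame],
    ratio: float = 1,
) -> list[pd.DataFrame]:
    """Hypothetical stand-in for one possible reading of the ratio."""
    # Assumption: a non-positive ratio is rejected (not documented behaviour).
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    # Assumption: keep all synthetic dataframes and derive how many
    # observed dataframes are needed so that synthetic / observed == ratio.
    n_observed = round(len(synthetic_data) / ratio)
    if n_observed > len(observed_data):
        raise ValueError(
            f"need {n_observed} observed dataframes, got {len(observed_data)}"
        )
    return observed_data[:n_observed] + synthetic_data


observed = [pd.DataFrame({"x": [i]}) for i in range(3)]
synthetic = [pd.DataFrame({"x": [10 + i]}) for i in range(4)]

# ratio=2 asks for two synthetic dataframes per observed one.
mixed = mix_data_sketch(observed_data=observed, synthetic_data=synthetic, ratio=2)
```

With the default ratio of 1, the same inputs would need four observed dataframes, so this sketch raises ValueError because only three are available.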

read_dataframes_from_csvs(path_to_csvs: str) -> list[pandas.core.frame.DataFrame]

Reads all csv files in a given directory and returns them as a list of pd.DataFrame objects.

Constraint: the given directory must not be empty.

Parameters:

path_to_csvs – path to the directory containing the csv files.

Raises:
  • ValueError – if the given directory is empty.

  • ValueError – if the given directory contains no csvs.

Returns:

list of dataframes read from the csv files.

Return type:

list[pandas.core.frame.DataFrame]
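The documented behaviour, including both error cases, can be sketched as follows. The name read_dataframes_from_csvs_sketch is a hypothetical stand-in; matching files by the "*.csv" pattern, reading them in sorted order, and parsing with default pandas.read_csv settings are all assumptions.

```python
from pathlib import Path
import tempfile

import pandas as pd


def read_dataframes_from_csvs_sketch(path_to_csvs: str) -> list[pd.DataFrame]:
    """Hypothetical stand-in: read every csv file in a directory."""
    directory = Path(path_to_csvs)
    # First error case: the directory contains nothing at all.
    if not any(directory.iterdir()):
        raise ValueError(f"directory is empty: {directory}")
    # Assumption: files are matched by extension and read in sorted order.
    csv_files = sorted(directory.glob("*.csv"))
    # Second error case: the directory has files, but none are csvs.
    if not csv_files:
        raise ValueError(f"no csv files found in: {directory}")
    return [pd.read_csv(csv_file) for csv_file in csv_files]


# Write two small csv files to a temporary directory and read them back.
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"a": [1, 2]}).to_csv(Path(tmp) / "first.csv", index=False)
    pd.DataFrame({"a": [3]}).to_csv(Path(tmp) / "second.csv", index=False)
    frames = read_dataframes_from_csvs_sketch(tmp)
```

Sorting the file names makes the order of the returned list reproducible; whether the module guarantees any particular order is not stated.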