preprocessing
Module with preprocessing functionality.
- convert_dataframe_to_numpy(data: list[pandas.core.frame.DataFrame]) → list[numpy.ndarray[Any, numpy.dtype[numpy.float64]]]
Converts a list of dataframes to a list of numpy arrays.
- Parameters:
data – list of dataframes.
- Returns:
list of numpy arrays.
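The conversion described above can be sketched with plain pandas and NumPy. This is a minimal illustration, not the module's actual implementation: the helper name `to_float_arrays` is hypothetical, and it assumes every column can be cast to `float64`.

```python
import numpy as np
import pandas as pd


def to_float_arrays(data: list[pd.DataFrame]) -> list[np.ndarray]:
    """Hypothetical stand-in: convert each dataframe to a float64 array."""
    # DataFrame.to_numpy returns a 2-D array of shape (rows, columns).
    return [df.to_numpy(dtype=np.float64) for df in data]


frames = [
    pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0]}),
    pd.DataFrame({"a": [5], "b": [6.0]}),
]
arrays = to_float_arrays(frames)
```

Each array keeps the row and column order of its source dataframe; column names and the index are dropped.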
- mix_data(*, observed_data: list[pandas.core.frame.DataFrame], synthetic_data: list[pandas.core.frame.DataFrame], ratio: float = 1) → list[pandas.core.frame.DataFrame]
Mixes real and synthetic data according to a ratio.
- Parameters:
observed_data – observed data.
synthetic_data – synthetic data.
ratio – ratio of synthetic to observed data.
- Raises:
ValueError – if the number of observed data points is insufficient to fulfill the ratio.
- Returns:
The mixed data.
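The documentation does not state how the ratio is applied, so the following is only one plausible reading, written as a hedged sketch: all synthetic dataframes are kept, and `len(synthetic) / ratio` observed dataframes are added to them. The function name `mix_sketch` and this selection rule are assumptions, not the module's confirmed behavior.

```python
import pandas as pd


def mix_sketch(
    *,
    observed: list[pd.DataFrame],
    synthetic: list[pd.DataFrame],
    ratio: float = 1.0,
) -> list[pd.DataFrame]:
    """Hypothetical mixing rule: keep all synthetic data, add observed data
    so that len(synthetic) / n_observed matches the requested ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    n_observed = int(len(synthetic) / ratio)
    if n_observed > len(observed):
        # Mirrors the documented ValueError for too few observed data points.
        raise ValueError("not enough observed data to fulfill the ratio")
    return observed[:n_observed] + synthetic


observed = [pd.DataFrame({"v": [i]}) for i in range(3)]
synthetic = [pd.DataFrame({"v": [10 + i]}) for i in range(2)]

# ratio=1: two synthetic frames are paired with two observed frames.
mixed = mix_sketch(observed=observed, synthetic=synthetic, ratio=1.0)

# ratio=0.5 would need four observed frames, but only three exist.
try:
    mix_sketch(observed=observed, synthetic=synthetic, ratio=0.5)
    raised = False
except ValueError:
    raised = True
```

The keyword-only arguments in the sketch follow the `*` in the documented signature, which prevents observed and synthetic data from being swapped by position.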
- read_dataframes_from_csvs(path_to_csvs: str) → list[pandas.core.frame.DataFrame]
Reads all csv files in a given directory and returns them as a list of pd.DataFrame objects.
Constraint: the given directory must not be empty.
- Parameters:
path_to_csvs – path to the directory containing the csv files.
- Raises:
ValueError – if the given directory is empty.
ValueError – if the given directory contains no csvs.
- Returns:
list of dataframes containing the data.
- Return type:
list[pandas.core.frame.DataFrame]
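The reading step and its two error cases can be sketched as follows. This is an illustrative approximation rather than the module's code: the name `read_csvs_sketch`, the `.csv` suffix check, and the sorted file order are all assumptions.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_csvs_sketch(path_to_csvs: str) -> list[pd.DataFrame]:
    """Hypothetical stand-in: read every .csv file in a directory."""
    entries = list(Path(path_to_csvs).iterdir())
    if not entries:
        raise ValueError("the given directory is empty")
    # Sorting makes the order of the returned list deterministic.
    csv_files = sorted(p for p in entries if p.suffix == ".csv")
    if not csv_files:
        raise ValueError("the given directory contains no csv files")
    return [pd.read_csv(p) for p in csv_files]


# Write two small csv files to a temporary directory and read them back.
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"x": [1, 2]}).to_csv(Path(tmp) / "a.csv", index=False)
    pd.DataFrame({"x": [3]}).to_csv(Path(tmp) / "b.csv", index=False)
    frames = read_csvs_sketch(tmp)
```

The two checks are kept separate so that an empty directory and a directory holding only non-csv files produce distinct error messages, matching the two documented `ValueError` cases.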